Cross-lingual Alignment vs Joint Training: A Comparative Study and A Simple Unified Framework

Zirui Wang; Jiateng Xie; Ruochen Xu; Yiming Yang; Graham Neubig; Jaime G. Carbonell

Keywords: transfer learning

Mon Session 4 (17:00-19:00 GMT)

Mon Session 5 (20:00-22:00 GMT)

Abstract: Learning multilingual representations of text has proven to be a successful method for many cross-lingual transfer learning tasks. There are two main paradigms for learning such representations: (1) alignment, which maps different independently trained monolingual representations into a shared space, and (2) joint training, which directly learns unified multilingual representations using monolingual and cross-lingual objectives jointly. In this paper, we first conduct direct comparisons of representations learned using both of these methods across diverse cross-lingual tasks. Our empirical results reveal a set of pros and cons for both methods, and show that the relative performance of alignment versus joint training is task-dependent. Stemming from this analysis, we propose a simple and novel framework that combines these two previously mutually exclusive approaches. Extensive experiments demonstrate that our proposed framework alleviates limitations of both approaches, and outperforms existing methods on the MUSE bilingual lexicon induction (BLI) benchmark. We further show that this framework can generalize to contextualized representations such as Multilingual BERT, and produces state-of-the-art results on the CoNLL cross-lingual NER benchmark.
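
To make the alignment paradigm in the abstract concrete, here is a minimal sketch of one common way to map independently trained embeddings into a shared space: an orthogonal (Procrustes) map fitted on a seed dictionary. The abstract does not say which alignment method the paper uses, so the choice of Procrustes, the function name `procrustes_align`, and the synthetic data are all assumptions made for illustration only.

```python
import numpy as np


def procrustes_align(src, tgt):
    """Return the orthogonal matrix W minimizing ||src @ W - tgt||_F.

    src, tgt: (n_pairs, dim) arrays; row i of each holds the embeddings of
    the i-th translation pair in a (hypothetical) seed dictionary.
    """
    # Closed-form solution: the SVD of the cross-covariance gives W = U V^T.
    u, _, vt = np.linalg.svd(src.T @ tgt)
    return u @ vt


# Synthetic stand-in for two monolingual spaces (illustrative data only):
# the "source" space is an exact rotation of the "target" space.
rng = np.random.default_rng(0)
tgt = rng.normal(size=(50, 8))
true_rot, _ = np.linalg.qr(rng.normal(size=(8, 8)))  # random orthogonal matrix
src = tgt @ true_rot.T

w = procrustes_align(src, tgt)
aligned = src @ w  # source embeddings mapped into the target space
```

In this toy setting the fitted map recovers the rotation, so `aligned` coincides with `tgt` up to floating-point error. Real monolingual spaces are only approximately related by a rotation, and joint training, the second paradigm in the abstract, avoids the separate mapping step altogether by learning both languages in one space.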

Similar Papers

Cross-Lingual Ability of Multilingual BERT: An Empirical Study

Karthikeyan K, Zihan Wang, Stephen Mayhew, Dan Roth

Massively Multilingual Sparse Word Representations

Gábor Berend

Multilingual Alignment of Contextual Word Representations

Steven Cao, Nikita Kitaev, Dan Klein

Latent Normalizing Flows for Many-to-Many Cross-Domain Mappings

Shweta Mahajan, Iryna Gurevych, Stefan Roth