Natural Language Processing - NAVER LABS Europe


Language technology to seamlessly communicate in an increasingly connected world: machine translation, natural language generation, natural language understanding, language modelling, document understanding, multilingual NLP, spoken language processing.



  • Member of BigScience, a one-year project on large language models
  • 2 papers at ACL 2021
  • Oral paper at ICLR 2021 on ‘A distributional approach to controlled text generation’ (on OpenReview)
  • Co-organising (Marc Dymetman, Hady Elsahar) the Energy Based Models workshop at ICLR 2021
  • Paper at EACL 2021, ‘Self-supervised and controlled multi-document opinion summarization’ (arXiv)
  • Hady Elsahar co-organizing the 2nd workshop on African NLP at EACL 2021
  • ALPS winter school in NLP (Jan 2021). All keynotes are online!
  • We continue teaching at CentraleSupelec.


  • We are co-organising the ALPS winter school in NLP (Jan 2021)
  • Meet us on the EMNLP virtual booth.
  • We have a paper at EMNLP, work in ‘Findings of EMNLP’ and 2 workshop papers
  • EU Horizon IMPACT award for the Transkribus READ project.
  • Salah Ait Mokhtar was recognized as an outstanding reviewer at ACL
  • We released a multilingual, multi-domain NMT model to catalyze research in content analysis during and after the Covid-19 crisis (GitHub, blog and paper).
  • The VitalRecords demo is live. Browse the professions and deaths of people from the 19th century.
  • We are teaching again at CentraleSupelec. Syllabus here

Related Content


Language is the most natural and most dominant mode of communication and, arguably, one of the main visible signals of higher intelligence. At the same time, language is messy, ambiguous, multimodal and ever-changing, so deciphering it requires a good deal of common sense as well as contextual and cultural understanding. To fulfil our vision of seamless communication with intelligent devices, existing technology and the methods used to solve natural language processing problems need to improve considerably.
That is precisely what we do:

  • As a European lab of a Korean company, we’re acutely aware of how real the language barrier can be. We improve the state of the art in multilingual applications and machine translation, looking for the best trade-offs between efficiency and performance.
  • Natural language generation (NLG) models have recently progressed to the point where they produce highly fluent text, yet they can still be deficient in other important respects, for instance by producing toxic or socially biased content. We therefore augment them with explicit controls.
  • As far as natural language understanding (NLU) is concerned, we address the challenge of capturing meaning beyond memorising surface patterns and co-occurrences. Our work on this topic applies to document understanding, fine-grained information extraction and spoken language understanding.
  • Method-wise, we’re particularly interested in how to combine the power and flexibility of deep neural networks with the rich prior knowledge accumulated over decades of linguistic study and with knowledge of the task at hand. We also investigate how models can learn continuously and adaptively, so that they incrementally acquire increasingly complex skills and knowledge.
Papers and activities. Blog article by Matthias Gallé, Hady Elsahar, Quentin Grail, Jos Rozen and Julien Perez
Our Global BERT-based Transformer architecture fuses global and local information at every layer, resulting in a reading comprehension model that achieves a deeper understanding of long documents and offers flexibility for downstream tasks. Blog article by Quentin Grail and Julien Perez
Podcast and transcript of Matthias Gallé, head of the NAVER research lab in Europe, who tells us what kind of research is going on in the labs in France and what it’s like to work there.
Release of a multilingual, multi-domain NMT model for Covid-19 and biomedical research. (Blog)

More blog articles relevant to Natural Language Processing 

Natural Language Processing team:

Laurent Besacier (Group lead)
Caroline Brun
Hervé Déjean
Marc Dymetman
Hady Elsahar
Zae Myung Kim (PhD candidate)
Jos Rozen
Agnes Sandor