NATURAL LANGUAGE PROCESSING
Language is the most natural and dominant mode of communication. We create technology for seamless communication with devices in a world of ambient intelligence.
Highlights
2020
- We are co-organizing the ALPS winter school in NLP (Jan 2021).
- Meet us at the EMNLP virtual booth.
- We have one paper at the EMNLP main conference, one in Findings of EMNLP, and two workshop papers.
- EU Horizon Impact Award for the Transkribus READ project.
- Salah Ait Mokhtar was recognized as an outstanding reviewer at ACL.
- We released a multilingual, multi-domain NMT model to catalyze research in content analysis during and after the COVID-19 crisis. GitHub, blog and paper.
- The VitalRecords demo is live. Browse the professions and deaths of people from the 19th century.
- We are teaching again at CentraleSupelec. Syllabus here.
2019
- Matthias was recognized as one of the Top Reviewers at NeurIPS.
- NLP team at EMNLP: we will be presenting two papers at the main conference, one at CoNLL, and four workshop papers.
- The COST action Multi3Generation was accepted: this is a European networking tool for researchers interested in (text) generation.
- Two papers accepted at ICDAR on using graph-CRFs and GCNs for table and document understanding.
- Machine Translation: our entry in the robustness track of WMT won across all language pairs. Results here.
- Like old-fashioned, robust formal models as well as fancy neural nets? You can have both! Consider the workshop Deep Learning and Formal Languages: Building Bridges, which we are co-organizing at ACL.
- We are co-organizing the Table Understanding competition at ICDAR.
- We are teaching again at CentraleSupelec. Syllabus here.

Language is the most natural and dominant mode of communication, and arguably one of the main visible signals of higher intelligence. At the same time, language is messy, ambiguous and ever-changing, so deciphering it requires a good deal of cultural, common-sense and contextual understanding. To fulfill our vision of Ambient Intelligence, where intelligent devices communicate seamlessly with us, we need to considerably improve the existing technology and methods that solve NLP problems. That’s exactly what we do.
Natural Language Understanding
We address what’s often called “Natural Language Understanding” by going beyond simple named-entity extraction to capture the real meaning of user-generated content, both its objective and its subjective aspects. We match our understanding of a text to our understanding of a person’s needs in order to provide the right content at the right time.
In addition to our work on machine translation, where we focus particularly on the robustness of those models, we also tackle other natural language generation applications such as summarization.
As a European lab of a Korean company, we’re distinctly aware of how real the language barrier can be, and we work to improve the state of the art in multilingual applications and machine translation.
Method-wise, we’re particularly interested in how to combine the power and flexibility of deep neural networks with the rich prior knowledge found in decades of linguistic study and with knowledge of the task at hand. This gives us better results with less training data, as the sketch below illustrates.
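As one concrete illustration, the sketch below injects linguistic prior knowledge into a neural sequence tagger by concatenating learned word embeddings with embeddings of POS tags produced by an external tagger. This is a minimal, hypothetical PyTorch example, not our actual architecture; all names, dimensions and the toy inputs are illustrative assumptions.

    import torch
    import torch.nn as nn

    class TaggerWithLinguisticPriors(nn.Module):
        """Toy tagger: word embeddings + POS-tag embeddings (the 'prior')."""
        def __init__(self, vocab_size, pos_size, n_labels,
                     word_dim=64, pos_dim=16, hidden=128):
            super().__init__()
            self.word_emb = nn.Embedding(vocab_size, word_dim)
            self.pos_emb = nn.Embedding(pos_size, pos_dim)  # linguistic prior
            self.encoder = nn.LSTM(word_dim + pos_dim, hidden,
                                   batch_first=True, bidirectional=True)
            self.out = nn.Linear(2 * hidden, n_labels)

        def forward(self, word_ids, pos_ids):
            # Coarse POS categories help the model generalize to rare words
            # that were barely seen during training.
            x = torch.cat([self.word_emb(word_ids),
                           self.pos_emb(pos_ids)], dim=-1)
            h, _ = self.encoder(x)
            return self.out(h)  # per-token label scores

    model = TaggerWithLinguisticPriors(vocab_size=10000, pos_size=20, n_labels=5)
    words = torch.randint(0, 10000, (2, 7))  # batch of 2 sentences, 7 tokens each
    pos = torch.randint(0, 20, (2, 7))       # POS ids from an external tagger
    print(model(words, pos).shape)           # torch.Size([2, 7, 5])

The intuition is that part of the input is a compact linguistic category rather than a raw surface form, so the model needs fewer task-specific training examples to handle rare or unseen words.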
Sensitive to the tension between data-hungry algorithms and the importance of protecting privacy, we develop privacy-preserving data-mining techniques.
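As a simple illustration of the kind of technique involved, here is a textbook sketch of the Laplace mechanism from differential privacy applied to a counting query. It is an illustrative example under standard assumptions, not a description of our actual methods; the function and data are hypothetical.

    import numpy as np

    def dp_count(records, predicate, epsilon=1.0):
        """Differentially private count of records matching `predicate`.

        A counting query has sensitivity 1 (adding or removing one record
        changes the count by at most 1), so adding Laplace noise with
        scale 1/epsilon yields epsilon-differential privacy.
        """
        true_count = sum(1 for r in records if predicate(r))
        noise = np.random.laplace(loc=0.0, scale=1.0 / epsilon)
        return true_count + noise

    queries = ["translate this", "weather today", "translate that"]
    # Smaller epsilon means more noise and a stronger privacy guarantee.
    print(dp_count(queries, lambda q: "translate" in q, epsilon=0.5))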
Natural Language Processing team: