NATURAL LANGUAGE PROCESSING
Language is the most natural and dominant mode of communication. We create technology to seamlessly communicate with devices in an ambient intelligent world.
- 2 papers accepted at ICDAR, on using graph-CRF and GCN for table & document understanding
- Machine Translation: our entry at the robustness track of WMT won across all language pairs. Results here
- Like old-fashioned & robust formal models as well as fancy neural nets? You can have both! Consider the workshop Deep Learning and Formal Languages: Building Bridges, which we are co-organizing at ACL.
- We are co-organizing the Table Understanding competition at ICDAR
- We are teaching again at CentraleSupelec. Syllabus here
- Open sourcing: the Collective Intelligence Centre at the University of Technology, Sydney releases its open-source Academic Writing Analytics infrastructure, developed in collaboration with Ágnes Sándor and Claude Roux. Contributions to AD3.
- Ágnes Sándor appointed honorary associate at the University of Technology, Sydney
- Reviewer: NAACL outstanding reviewer (Marc Dymetman) & EMNLP Best Reviewer Award (Matthias Gallé)
- SMOOTH GDPR H2020 project accepted and kicked off in May
- The team is giving an NLP course at CentraleSupelec
- In collaboration with our colleagues from CV, we ranked first at the MediaEval 2017 Retrieving Diverse Social Images Task challenge.
- AI/data-mining papers: full papers at UAI, RO-MAN (conversational agents), IJCAI (Q&A), KDD (real-time bidding) and AISTATS (spectral methods).
- Best Full Paper at the 7th International Learning Analytics and Knowledge Conference (LAK’17), ‘Reflective Writing Analytics for Actionable Feedback’
- 3 papers and a demo at EACL in Valencia
Language is the most natural and dominant mode of communication, and arguably one of the main visible signals of higher intelligence. At the same time, language is messy, ambiguous and ever-changing, so deciphering it takes a good deal of cultural, common-sense and contextual understanding. To fulfill our vision of Ambient Intelligence, where intelligent devices communicate seamlessly with us, we need to considerably improve the existing technology and methods for solving NLP problems. That’s exactly what we do.
Natural Language Understanding
We address what’s often called the “Natural Language Understanding” part by going beyond simple named entity extraction to get at the real meaning of user-generated content, both its objective part and its subjective one. We match our understanding of a textual item to our understanding of the person's needs, so as to provide the right textual item at the right time.
In addition to work on machine translation, focusing particularly on the robustness of those models, we also tackle other natural language generation applications such as summarization.
As a European lab of a Korean company, we’re distinctly aware of how real the language barrier can be, and we improve the current state of the art in multilingual applications and machine translation.
Method-wise, we’re particularly interested in how to combine the power and flexibility of deep neural networks with the rich prior knowledge accumulated over decades of linguistic studies, as well as knowledge of the task at hand. This gives us better results with less training data.
Sensitive to the tension between our data-hungry algorithms and the importance of protecting privacy, we develop privacy-preserving data-mining techniques.
[MAFIA MEETUP#1] CONVERSATIONAL ROBOTS BY MATTHIAS GALLÉ