NAVER LABS Europe has worked on robustness problems in the translation of user-generated content (UGC), notably through our participation in the first Robustness Task at WMT 2019, where we ranked first in 3 of the 4 subtasks.
In this internship we would like to continue exploring robustness challenges for machine translation (MT) in a multimodal setting. MT models deployed in production currently receive input text from various modalities, such as the output of an optical character recognition (OCR) system for picture translation, or the output of an automatic speech recognition (ASR) system for speech translation. We would like to investigate how to build a neural MT (NMT) model that delivers the same level of quality on clean text, UGC text, OCR output and ASR decoding output.
The intern will work in a team of NMT, OCR and speech experts who will collaborate on building the model. The intern will help build the multimodal corpus and define the experimental settings. He or she will work with the team to design solutions to this new challenge and implement them.
NAVER LABS Europe has full-time positions, PhD and PostDoc opportunities throughout the year, which are advertised here and on the sites of international conferences that we sponsor, such as CVPR, ICCV, ICML, NeurIPS and EMNLP.
NAVER LABS Europe is an equal opportunity employer.
NAVER LABS Europe is in Grenoble, in the French Alps. We have a multi- and interdisciplinary approach to research, with scientists in machine learning, computer vision, artificial intelligence, natural language processing, ethnography and UX working together to create next-generation ambient intelligence technology and services that deeply understand users and their contexts.