Alexandre Berard, Matthias Gallé, Zae Myung Kim, Vassilina Nikoulina, Lucy Park
2020
You’ve heard it before: Covid-19 is disrupting every aspect of our lives. Different forms of lockdown are in operation across the world, and we’re all coming to terms with the idea that social distancing may be a long-term requirement. It’s likely you’ve also heard that this pandemic is far from over, and that its consequences will take years to fully crystallise.
In the midst of all this, chances are high that when you open a news portal or social media channel, you’re presented with an article or post that has some reference to the disease. By its very nature, a pandemic affects people of all languages, and we’re currently living through a rare period in which people around the world are all talking about a single topic. Understanding how reactions vary across different cultures, as well as pulling them together to find commonalities, will provide important insights for the future.
We believe that the vast sum of written digital communication about Covid-19 that’s currently being amassed will be the basis of hundreds of research programmes in the future. The data will be used to analyse our response during this period and—hopefully—to orient and advise future policies in economics, sociology, crisis management and, of course, public health.
To facilitate the large-scale analysis of this digital evidence at such a unique time in human history, we are releasing a multilingual translation model that can translate biomedical text. Anybody can download our model and use it locally to translate texts from five different languages (French, German, Italian, Spanish and Korean) into English.
While automated translation portals are mainstream and used by millions daily, they’re not specialised in biomedical data. Such data often contains specific terminology which isn’t recognised—or is poorly translated—by most platforms. In addition, making our work available means that researchers can host their own models, enabling them to translate at will without having to monitor the budget spent on those portals. Although a few pretrained models exist (Opus-MT-train is one example), most are only bilingual, limiting their use. Additionally, as they aren’t trained using biomedical data, the models are not suitable for such specialised translation.
Neural machine translation models work by encoding input sequences into mathematical structures, or intermediate representations, that consist of points in high-dimensional real space. These intermediate representations are obtained by learning a large number of parameters, which we achieved by exposing our model to training data (consisting of translated sentences from publicly available resources). A decoder then uses the intermediate representation to generate an English translation, producing the translated sentence word by word.
A variety of neural architectures exist. We based our own work on state-of-the-art models, which employ the so-called Transformer (1) architecture. Using high-capacity variants of this architecture, we’re able to translate different languages with a single model.
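To make the encode-then-decode process more concrete, here is a minimal sketch (not our released model) of a Transformer-based translator built with PyTorch’s standard components. The vocabulary sizes, special-token ids and the greedy, word-by-word decoding loop are illustrative assumptions; positional encodings and training are omitted.

```python
# Minimal encoder-decoder sketch with PyTorch's built-in Transformer.
# All sizes and token ids below are assumptions for illustration only.
import torch
import torch.nn as nn

SRC_VOCAB, TGT_VOCAB, D_MODEL = 8000, 8000, 512
BOS, EOS = 1, 2  # assumed begin/end-of-sentence token ids

src_embed = nn.Embedding(SRC_VOCAB, D_MODEL)
tgt_embed = nn.Embedding(TGT_VOCAB, D_MODEL)
transformer = nn.Transformer(d_model=D_MODEL, batch_first=True)
generator = nn.Linear(D_MODEL, TGT_VOCAB)  # decoder states -> scores over target words

def translate_greedy(src_ids, max_len=50):
    """Encode the source once, then emit the translation one token at a time."""
    memory = transformer.encoder(src_embed(src_ids))        # the intermediate representation
    out = torch.tensor([[BOS]])                             # start of the target sentence
    for _ in range(max_len):
        dec = transformer.decoder(tgt_embed(out), memory)   # attend to the source encoding
        next_id = generator(dec[:, -1]).argmax(-1, keepdim=True)
        out = torch.cat([out, next_id], dim=1)
        if next_id.item() == EOS:                           # stop at end-of-sentence
            break
    return out

# Example call with a dummy (already numericalised) source sentence of 6 tokens.
print(translate_greedy(torch.randint(3, SRC_VOCAB, (1, 6))))
```

With an untrained model this of course produces arbitrary output; the point is only to show how the encoder builds the intermediate representation once and the decoder then generates the translation word by word.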
Typically, a number of models (n², where n is the number of languages) must be managed separately to enable translation across multiple languages. Because ours is multilingual, users are able to translate from five different languages to English using just one model, simplifying storage and maintenance. More importantly, research has shown (2) that multilingual models can greatly benefit so-called under-resourced languages, i.e. languages for which less parallel data exists (as can be seen from the data we used, the number of training sentences varies widely across languages). In the benchmarks that we used to measure performance, we found that our model achieves results similar to the best-performing bilingual models.
Words are often ambiguous when taken out of context and can mean very different things in different settings (or ‘domains’). For example, when translated into German, high temperature could be hohe Temperatur or Fieber, depending on whether the domain is meteorology or medicine, respectively. Likewise, a French carte might be a map or a menu, depending on whether you’re on a treasure hunt or in a restaurant. For this reason, creating a multi-domain model—i.e. one that is capable of translating specialised information—is particularly challenging. To achieve multi-domain functionality and enable the accurate translation of information relating to Covid-19, we used a variety of parallel biomedical data (in particular from TAUS) when training our model.
One approach to achieving domain adaptation in a translation model is to fine-tune it to the specific domain of interest. We didn’t want to overspecialise, however, as this could lead to losing the advantage provided by large corpora from other domains. To maximise the usability of our model, we decided instead to use domain tags (a strategy that has been successful (3) in the past). These domain tags are used as control tokens. During training, sentences that come from one domain are assigned the same tag. The tag can then be used at inference time (as opposed to training time) to nudge the model towards one domain or the other. In the model we’re releasing, the user can employ the default settings for standard translation, or select the biomedical tag. The same sentence translated with or without this token produces different output.
We achieved multilingualism in the same way. For example, a French sentence is tagged with a different label than a Korean sentence: <fr> and <ko>, respectively. Although controlling the language tags at inference time makes less sense (it would, for instance, amount to telling the model that an Italian sentence being translated into English is actually German), it allows for very flexible and generic training procedures.
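As a small, hypothetical illustration of how such control tokens might work, the sketch below simply prepends a language tag and an optional domain tag to the source sentence before it is passed to the model. The exact token strings, their order and where they are attached are assumptions; the released preprocessing code defines the actual behaviour.

```python
# Hypothetical sketch: prepend language and domain control tokens to a source
# sentence. Token names and ordering are assumptions, not the released format.
def tag_source(sentence, lang, domain=None):
    tokens = ["<{}>".format(lang)]
    if domain is not None:
        tokens.append("<{}>".format(domain))
    return " ".join(tokens + [sentence])

print(tag_source("La fièvre est un symptôme fréquent.", "fr", "medical"))
# -> <fr> <medical> La fièvre est un symptôme fréquent.
```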
Indeed, by enabling two ways of varying the input (domain and language), our model achieves better translation, as determined by a standard measure (called BLEU, for bilingual evaluation understudy) that counts overlapping sequences of words with respect to a reference translation. Interestingly, although the model wasn’t presented with any biomedical data for Korean–English at the learning stage, we found that using the biomedical tag for translation resulted in a different output. In an internal test, we translated a set of biomedical texts and observed an increase of 0.44 BLEU points when the tag was selected. Although the changes are small, they’re often important (see, for example, the changes highlighted in boldface in Table 1). The first two examples in Table 1 show cases for which the translation was more accurate, while the last example actually shows degraded performance.
We’re currently further exploring the transfer of such information in multilingual and multi-domain models. We’ve already proven that the use of flexible control tokens makes summarisation more faithful (4) to the original documents, and we are interested in how the interplay between different types of control tokens affects translation. In particular, the transfer of knowledge across languages and domains could open up a wide range of exciting uses for natural language generation models, even for languages or domains where no training data is easily available.
We tested our model against competing machine translation models using the BLEU measure. BLEU is a quality metric for machine translation systems that evaluates a piece of machine-translated text by comparing it to a corresponding human reference translation. Table 2 reports the BLEU values obtained by our model on standard benchmarks in the field of machine translation (the test sets released for regular shared-task competitions). Whenever available, we also compare against the best-performing model (as reported in the corresponding competition).
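For readers who want to reproduce this kind of comparison, the snippet below shows one common way to compute corpus-level BLEU with the sacreBLEU library. The hypothesis and reference sentences are made up for illustration and are not taken from our test sets.

```python
# Compute corpus-level BLEU with sacreBLEU (pip install sacrebleu).
# The sentences below are illustrative only.
import sacrebleu

hypotheses = ["The patient has a high fever and a dry cough."]
references = [["The patient presents a high fever and a dry cough."]]  # one reference stream

bleu = sacrebleu.corpus_bleu(hypotheses, references)
print(round(bleu.score, 2))  # BLEU on a 0-100 scale
```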
[a] newstest2019.de-en, newstest2014.fr-en, newstest2013.es-en.
[b] Test sets (to English) from the WMT18 and WMT19 biomedical translation tasks. Results obtained using the <medical> tag.
[c] IWSLT17-test (for all but Spanish) and IWSLT16-test (for Spanish).
Notes:
Slightly better results can be obtained by using ensemble models, but for simplicity of use we are releasing our single best model.
Detailed instructions are here. You will need a local copy of the fairseq toolkit. The version of our translation model that we’re releasing also requires a small amount of additional code (which we’re also releasing) before you can start translating. This additional code takes care of preprocessing the sentences you want to translate: while we rely on standard tools for tokenisation (SentencePiece), we provide a necessary script through which the input data must be passed.
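As a rough illustration of what such preprocessing involves, the sketch below applies SentencePiece tokenisation and prepends control tokens. The model path, tag names and overall format here are assumptions for illustration; the released script and instructions define the exact steps.

```python
# Illustrative preprocessing sketch: SentencePiece tokenisation plus control
# tokens. The path and tag names are assumptions, not the released format.
import sentencepiece as spm

sp = spm.SentencePieceProcessor()
sp.Load("sentencepiece.model")  # hypothetical path to a SentencePiece model

def preprocess(sentence, lang, domain=None):
    pieces = sp.EncodeAsPieces(sentence)          # subword tokenisation
    tags = ["<{}>".format(lang)]
    if domain is not None:
        tags.append("<{}>".format(domain))
    return " ".join(tags + pieces)                # one line per input sentence

print(preprocess("Le patient a de la fièvre.", "fr", "medical"))
```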
Other than that, just follow the instructions and… happy translating!
References
* Some numbers were updated to match the exact evaluation conditions used in the reference competitions. The latest, updated figures are on GitHub.