4th October 2013 – Article written on the occasion of the XRCE 20th anniversary celebration.
In the late 1880s, the Russian-born doctor Leyzer Leyvi Zamengov (better known as L. L. Zamenhof) created an easy-to-learn, politically neutral language aimed at transcending nationality and fostering peace between people who speak different languages. Dubbed “Esperanto”, this artificial language still has a significant presence in over 100 countries today, with estimates of the number of fluent speakers ranging from 10,000 to 2 million worldwide[1].
The artificial languages devised by Zamengov and others are known today as constructed languages, or “conlangs”. Hollywood, and the film industry in general, is particularly fond of conlangs, which offer a quick way of instilling a certain kind of exoticism in even the most run-of-the-mill scenarios. What do the box office hits Star Trek, Game of Thrones and Avatar have in common? They all have their own conlang: Klingon, Dothraki and Na’vi respectively, languages specifically designed to enrich their universes with a touch of the strangely unknown. The sounds, vowels and consonants of these languages were chosen exactly as you would choose costumes and makeup, to convey the brutality or the softness of these imaginary worlds. These constructed languages are languages in the proper sense of the term: dialogues are written in them and linguistic theories apply to them, albeit without the complexity of our natural languages, since they can be described in extenso, that is, exhaustively. This property also makes conlangs an efficient way to help computers translate text into multiple languages on the fly.
The challenges of machine learning
Despite 60 years of research in computational linguistics, computers still fail to grasp the meaning of the simplest sentences, and ambiguity still plagues the most refined linguistic theories. We human beings resolve ambiguities through our understanding of the world and our capacity to put utterances back into their context. In contrast, even the most advanced techniques, such as machine learning, can only count words or phrases, relying on co-occurrences and word distances to make sense of texts and documents.
For instance, a very simple sentence such as “The dogs are loud” is correctly translated into French by a well-known translation site as “Les chiens sont bruyants”. However, if the sentence is changed to “The dogs are way too loud”, the same site yields “Les chiens sont beaucoup trop fort”, which means “The dogs are way too strong”: a surprising semantic drift from the original English sentence. Furthermore, the agreement between “fort” and “chiens” is lost in translation (the plural form would be “forts”). Thus, even the most advanced translation systems can be easily disrupted by a few modifications.
This is because texts are ambiguous, terribly ambiguous. Natural languages have evolved organically, without any actual plan, hence the inextricable fabric that gives our computer programs so much difficulty. At each step, syntactic analyzers, or parsers, are confronted with ambiguous words and ambiguous constructions. The combinatorial nature of natural language is such that a computer program might end up with thousands of potential analyses for an ordinary sentence of twenty words. Machine learning techniques have tried to reduce this complexity by weighting words and constructions to feed complex classifiers, but ambiguity cannot always be reduced to correlation.
Parsers are dumb: they analyze sentences one after the other without keeping track of past analyses. What we need is an unambiguous representation that could be used to store not only previous analyses but also data from the real world: an intermediary structure that would be close to a human language but have the properties of a computer language. In other words, a constructed natural language that could be compiled like a program: a conlang. John McCarthy, the man who coined the term “Artificial Intelligence”, had this idea back in 1976[2], when he proposed to solve Natural Language Processing issues with what he called Artificial Natural Languages, another name for conlangs.
Lingvata is one of these languages. Designed to be free of all forms of ambiguity, it is an artificial language that can serve as an intermediary step for computers translating, for instance, from one language into any other. Lingvata uses suffixes, each of which marks exactly one part of speech, to avoid category ambiguities such as that of “drink”, which can be either a noun or a verb. It provides a unique ending for each of nouns, pronouns, adjectives, verbs, prepositions, determiners and adverbs, eliminating the risk of ambiguous interpretations and errors. Words are created simply by combining a semantic root[3] with one of these suffixes.
For instance, the root “parole” is related to speech.
Thanks to this simple mechanism, Lingvata can be enriched with as many words as necessary, without introducing any homonyms or too many synonyms.
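The root-plus-suffix mechanism can be sketched in a few lines of Python. This is only an illustration: Lingvata's actual suffixes are not reproduced here, so the sketch borrows the well-known Esperanto endings (-o for nouns, -a for adjectives, -e for adverbs, -i for verb infinitives) and the Esperanto root “parol” as stand-ins. The names `build_word` and `analyze_word` are hypothetical.

```python
# Esperanto part-of-speech endings, used here only as a stand-in;
# Lingvata's own suffixes are NOT reproduced in this sketch.
SUFFIX_TO_POS = {"o": "noun", "a": "adjective", "e": "adverb", "i": "verb"}
POS_TO_SUFFIX = {pos: suffix for suffix, pos in SUFFIX_TO_POS.items()}


def build_word(root: str, pos: str) -> str:
    """Combine a semantic root with the single suffix reserved for `pos`."""
    return root + POS_TO_SUFFIX[pos]


def analyze_word(word: str) -> tuple[str, str]:
    """Recover (root, part of speech) from a word built by `build_word`."""
    root, suffix = word[:-1], word[-1:]
    if not root or suffix not in SUFFIX_TO_POS:
        raise ValueError(f"not a well-formed word: {word!r}")
    return root, SUFFIX_TO_POS[suffix]


words = [build_word("parol", pos) for pos in ("noun", "adjective", "verb")]
print(words)                   # ['parolo', 'parola', 'paroli']
print(analyze_word("paroli"))  # ('parol', 'verb')
```

Because each suffix belongs to exactly one category, analysis is a plain table lookup: no word can be read as both a noun and a verb, unlike the English “drink”.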
Lingvata also provides a mechanism to avoid syntactic ambiguity, using Latin as a model. In Latin, the role of each word in a sentence is governed by its suffix, or case marker. For instance, the sentence “domina rosam amat” means “the lady loves the rose”. The ending “-am” indicates which element in the sentence is the direct object. If you swap the endings, the meaning changes too, as in “dominam rosa amat”, which now means “the rose loves the lady”. This is a very efficient way to encode syntax, as each case marker conveys only one possible syntactic interpretation. Thus, Lingvata has case markers to indicate not only a direct or an indirect object, as in Latin, but also specific combinations to encode verb and noun complements. These follow a strict word order to make the whole syntactic process totally deterministic[4]; for instance, the verb is always at the end of the sentence. Lingvata offers four different endings for case marking (vs. six cases in Latin), which are shared by all categories.
Our previous Latin example “the lady loves the rose” would then translate as: “Dameta rosetan ameiag”.
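To see why case markers make role assignment deterministic, here is a minimal Python sketch based on the two Latin sentences quoted above. It covers only the three endings those sentences need and is not a Latin parser; the table `ENDING_TO_ROLE` and the function `assign_roles` are illustrative names. Lingvata's own four case endings are not shown, since they are not listed here.

```python
# Only the endings needed for the two example sentences:
# first-declension singular nouns and a third-person singular verb.
ENDING_TO_ROLE = {
    "am": "object",   # accusative singular, e.g. "rosam"
    "at": "verb",     # third person singular, e.g. "amat"
    "a": "subject",   # nominative singular, e.g. "domina"
}


def assign_roles(sentence: str) -> dict[str, str]:
    """Map each syntactic role to the word whose ending marks it."""
    roles: dict[str, str] = {}
    for word in sentence.lower().split():
        role = next(
            (r for ending, r in ENDING_TO_ROLE.items() if word.endswith(ending)),
            None,
        )
        if role is None or role in roles:
            raise ValueError(f"cannot assign a unique role to {word!r}")
        roles[role] = word
    return roles


first = assign_roles("domina rosam amat")
print(first["subject"], first["object"])    # domina rosam

swapped = assign_roles("dominam rosa amat")
print(swapped["subject"], swapped["object"])  # rosa dominam
```

The roles come from the endings alone, so reordering the words of the first sentence (for example “rosam amat domina”) yields the same analysis, while swapping the endings reverses who loves whom.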
The grammar also provides mechanisms to handle clauses and conjunctions, but most sentences rely on the few rules above. As an experiment, we wrote a short text in Lingvata and checked whether we could automatically translate it into French and English. Thanks to the simplicity of the grammar, the analysis of a Lingvata sentence is straightforward. Fewer than 50 rules cover all aspects of the language, which enables us to translate each sentence into French or English in a few milliseconds at most on a basic computer. In comparison, the English grammar comprises more than 3000 rules. We have also developed a tool that takes a French sentence as input and translates it into Lingvata. Since we know that the French analysis is not always reliable, we can then fix the errors in the Lingvata output and store the results in a file, which can be used to translate into any language for which we have a generator. This could be used, for example, for multilingual web site content whose original text would be in Lingvata. The system would then translate the content on the fly into the user’s language, removing the need to maintain as many versions of the text as there are languages.
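The “store once, generate into any language” idea can be illustrated with a toy Python sketch. It is not the system described above: the `Clause` structure, the concept labels and the two tiny lexicons are assumptions made for this example, and the generator handles only a subject-verb-object clause with singular nouns.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Clause:
    """An unambiguous stored analysis: each role holds a concept label."""
    subject: str
    verb: str
    obj: str


# Hypothetical per-language lexicons; one entry per concept.
LEXICON = {
    "en": {"LADY": "lady", "ROSE": "rose", "LOVE": "loves"},
    "fr": {"LADY": "dame", "ROSE": "rose", "LOVE": "aime"},
}
# Definite article per language; both French nouns here are feminine.
ARTICLE = {"en": "the", "fr": "la"}


def generate(clause: Clause, lang: str) -> str:
    """Render the stored clause in the requested language."""
    words, article = LEXICON[lang], ARTICLE[lang]
    return (
        f"{article} {words[clause.subject]} "
        f"{words[clause.verb]} "
        f"{article} {words[clause.obj]}"
    )


stored = Clause(subject="LADY", verb="LOVE", obj="ROSE")
for lang in LEXICON:
    print(lang, "->", generate(stored, lang))
# en -> the lady loves the rose
# fr -> la dame aime la rose
```

Supporting a further language in this sketch means adding one lexicon and one article entry; the stored clause itself never changes, which is the property that would make a single Lingvata source text practical for a multilingual site.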
In a certain way, the most compact way to store the semantic representation of a text is… the text itself. For a long time, linguists have tried to formalize languages into strict mathematical frameworks, but languages have proved so elusive that most theories “leak”, which may be why we call them natural. Machine learning techniques, with their careful injection of hard science into the problem, did bring some improvement, but the best systems still fail to provide a precise and reliable analysis in too many cases. Today, with the advent of the internet, textual information is everywhere. Yet the ambiguity and complexity of natural languages make it quite difficult to draw on these resources efficiently. By contrast, an Artificial Natural Language representation keeps the whole spectrum of linguistic data intact, with very little or no loss of information. A paragraph or a sentence written in a conlang is a description as precise as any piece of text and, at the same time, the semantic encoding of that text: a symbolic representation which sits half-way between man and machine.
[1] https://en.wikipedia.org/wiki/Esperanto
[2] McCarthy, J. (1976). An example for natural language understanding and the AI problems it raises. Formalizing Common Sense: Papers by John McCarthy. Ablex Publishing Corporation, 355.
[3] The Esperanto language was the source of many of the semantic roots that we use in Lingvata. Since most of these roots have already been translated into many natural languages, it proved the most efficient way to bootstrap our own implementation of the Lingvata language.
[4] Word order, case markers and endings in Lingvata are of course arbitrary. One could design a completely different grammar that would still retain the same properties. However, although designing a conlang is fun, it takes a lot of work and experimentation to achieve the right balance between simplicity, conciseness and expressiveness.
About the author: Claude Roux received his Ph.D. in syntactic parsing algorithms from the Université de Montréal (Canada) in 1996. This work was the basis of the Xerox Incremental Parser (XIP) which can be accessed on Xerox’s virtual lab Open Xerox. His main interest lies in syntactic parsing and formal language theories. He is the creator of the Lingvata conlang.