As a researcher and practitioner of machine learning since the 80s, I have a confession to make. I was one of those who often dreamt about the day robots would relieve us of all the monotonous cognitive processing we have to go through before we get to the end of the day. We would spend our time focussing on grander thoughts, making the world a better place thanks to our boundless creativity. As the years passed and one machine learning project led to another, I had to admit that we had virtually never replaced a human mind in the way I had imagined. In customer care, compliance monitoring or even e-discovery in litigation – you name it – we designed robots, yes, but never ones that took decisions and carried them out all by themselves. What we did do, however, was provide real, tangible support to the people making decisions. My dream has come true – just not in the way I had imagined.
Acknowledging that the Artificial Intelligence engines we create shouldn't work in isolation but instead work with Human Intelligence should be considered a strength, not a weakness. The challenge lies in how these human-machine collaborative systems communicate in order to be successful. They each need to express their needs and expectations. They also need to guide each other. This is my new dream and one of my favourite research goals.
My team and I put most of our energy into designing and implementing machine-learning based systems that improve the productivity and performance of our business solutions. Our common goal is often to retrieve and organize information that lies in large, unstructured collections of documents, or to propose personalised recommendations in time-varying environments. These are environments where users have changing information needs and where the objects themselves (documents, movies, communities of experts, …) come and go or may be perceived differently over time. The systems are integrated and rolled out in decision support platforms and workflows. One example is a technology-assisted review (TAR) platform for litigation and compliance monitoring. These platforms typically process millions of documents (emails, memos, plans, …) in preparation for a corporate litigation case and help lawyers or paralegals review them. Another example is improving the product recommendations of a global DVD reseller website.
Our tools propose things to humans; the humans decide whether to accept or reject the offers. This distance between the algorithms and the final action is an important one to keep. In the fields of e-discovery, compliance monitoring and healthcare, the outcomes can have such a huge impact that it's wise, and sometimes even compulsory, to have the human in the loop. It would be simply absurd to try and force a customer to buy a product. The inherent lack of confidence humans have in models and algorithms is another reason. People naturally want to understand what's "inside the box" to increase their confidence in the results, especially when conditions are difficult. Difficult conditions typically occur when the input data is far from the kind used to build the model, or is made up of outright outliers. Yet another, more subtle reason to prevent the algorithms from having the final word is the time lapse between the moment the user expresses their needs (i.e. what they're looking for, in the form of labelled examples to train our machine learning models) and the moment the model is applied to real data, a kind of "concept drift" phenomenon. The lapse is usually pretty long, with little interaction between our team, who manage and control the model, and the end user. So, in the end, the tool doesn't do what the user had expected. What they wanted to retrieve, and the initial training examples they selected, are often incomplete, imprecise and subject to change when confronted with the outcome. To complicate things even more, the user often doesn't know exactly what they're looking for; that only emerges from interacting with the tools and, in the case of e-discovery, taking time to explore the document collection. In a nutshell, the ideal situation is to allow users to interact with the underlying machine learning algorithms in a mutually enriching and "agile" way, with no intermediary. Until recently this was practically impossible, but things are changing.
We took the first step towards realising this kind of interaction five years ago, in TAR document review for litigation. Traditionally, the attorney or paralegal has to give a binary label to each document under review in the TAR platform ("responsive" vs "non-responsive"). It quickly became clear to us that we would get much better machine-learning based classifiers if we allowed the expert user to enter "directional" and "graded" labels. This means that a document that was non-responsive for the case could be graded as "nearly responsive" or "going in the direction of responsiveness", or, between two non-responsive documents, one could be graded as "a bit more in the responsive direction" than the other. We even allowed conflicting labels when multiple users reviewed the same documents with differing opinions. By adapting the training algorithm to take the conflicting labels into account, as well as the level of expertise of the labellers, the performance of the classifier increased significantly.
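To make the idea concrete, here is a minimal, hypothetical sketch (not the actual TAR system) of how graded, possibly conflicting judgements, weighted by annotator expertise, might be folded into a single training set. The grade values, expertise weights and choice of scikit-learn regressor are all illustrative assumptions.

```python
# A minimal sketch, assuming graded labels are mapped onto a [0, 1]
# responsiveness scale and annotator expertise weights the samples.
import numpy as np
from sklearn.linear_model import SGDRegressor

# Hypothetical mapping of directional / graded judgements to a target score.
GRADE_TO_TARGET = {
    "non-responsive": 0.0,
    "slightly responsive": 0.4,
    "nearly responsive": 0.7,
    "responsive": 1.0,
}

# Toy data: (document feature vector, grade, annotator id).
judgements = [
    (np.array([0.1, 0.9]), "responsive", "senior_reviewer"),
    (np.array([0.1, 0.9]), "nearly responsive", "junior_reviewer"),  # conflicting label
    (np.array([0.8, 0.2]), "non-responsive", "senior_reviewer"),
    (np.array([0.6, 0.4]), "slightly responsive", "junior_reviewer"),
]
expertise = {"senior_reviewer": 2.0, "junior_reviewer": 1.0}

X = np.array([doc for doc, _, _ in judgements])
y = np.array([GRADE_TO_TARGET[grade] for _, grade, _ in judgements])
w = np.array([expertise[who] for _, _, who in judgements])

# Conflicting labels for the same document simply become several weighted
# samples; the regressor effectively averages them according to expertise.
model = SGDRegressor(loss="squared_error", max_iter=1000)
model.fit(X, y, sample_weight=w)

print(model.predict(np.array([[0.1, 0.9]])))  # graded responsiveness score
```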
After this, we extended the ways subject matter experts could introduce other types of prior knowledge, or even finer-grained labels. A typical example is to allow the user to express "queries" as if they were looking for relevant documents in a collection through a regular search engine. We then use these queries to teach the classifier, simply by using the relevance score of each document in the collection with respect to the query as a highly discriminative feature of the document. This extra feature is added in an appropriate way to the other features of the document, at both training and test time. In a similar vein, the subject matter expert can introduce a list of terms they consider representative of relevant (or non-relevant) documents. The opposite is possible too: the learning algorithm proposes a list of terms with their "polarity" (relevant vs non-relevant) and the user confirms or rejects the proposal. The possibility to highlight text showing what makes a document relevant or not (fine-grained annotation) is another way to improve the machine learning algorithm. In return, at test time, the classifier partly motivates its classification decision by highlighting the passages that contribute most to its final decision, which helps the user form a simplified understanding of the underlying model.
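Below is a hedged illustration of the "query as a feature" mechanism: the expert's query is scored against every document and the resulting relevance score is appended as one extra feature column before training or inference. The TF-IDF/cosine-similarity relevance model and the toy documents are assumptions standing in for whatever relevance scoring the real platform uses.

```python
# A sketch of turning an expert query into an extra document feature,
# assuming a simple TF-IDF relevance model.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

documents = [
    "quarterly revenue forecast shared with the board",
    "lunch menu for the cafeteria next week",
    "internal memo on the disputed licensing agreement",
]
expert_query = "licensing agreement dispute"

vectorizer = TfidfVectorizer()
doc_matrix = vectorizer.fit_transform(documents)       # base bag-of-words features
query_vec = vectorizer.transform([expert_query])
relevance = cosine_similarity(doc_matrix, query_vec)   # one relevance score per document

# Append the relevance score as an extra, highly discriminative feature
# column; the augmented matrix is then used at training and test time.
X_augmented = np.hstack([doc_matrix.toarray(), relevance])
print(X_augmented.shape)  # (3, vocabulary_size + 1)
```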
Confidence, like trust, comes from understanding, so it's difficult to guide a machine learning system if you don't know what it will do with your guidance. We recently worked on pairing a movie recommender system with profiles extracted from users' past behaviour and with the movie comments of users sharing the same profile. The recommendation algorithm essentially analyses the compatibility points between the implicit preference model of the user and the features (both explicit and implicit) of the proposed items, and attaches to these compatibility points the most relevant terms extracted from the comments of users with a similar preference model. The compatibility points themselves are extracted automatically from the users' ratings through a method known as matrix factorisation; they correspond to latent factors that can be interpreted a posteriori as attraction or repulsion towards a movie genre, a time period, groups of actors, and so on. This helps the user understand the context and the value of the proposal and, eventually, accept the recommendation.
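As a rough illustration of the factorisation step, the toy example below decomposes a small rating matrix into latent factors and computes the per-factor "compatibility points" for one user-movie pair. The use of NMF and the tiny ratings matrix are assumptions made for the sketch; the production recommender may use a different factorisation method.

```python
# A minimal matrix-factorisation sketch: user ratings are decomposed into
# latent factors that can later be interpreted (e.g. attraction towards a
# genre or time period) and explained back to the user.
import numpy as np
from sklearn.decomposition import NMF

# Toy user x movie rating matrix (0 = unrated, treated here as low preference).
ratings = np.array([
    [5, 4, 0, 1],
    [4, 5, 1, 0],
    [0, 1, 5, 4],
    [1, 0, 4, 5],
], dtype=float)

model = NMF(n_components=2, init="nndsvda", random_state=0)
user_factors = model.fit_transform(ratings)   # users x latent factors
item_factors = model.components_              # latent factors x movies

# The recommendation score is the dot product of user and item factors;
# the per-factor products are the "compatibility points" that can be
# illustrated with terms mined from comments of similar users.
user, movie = 0, 2
compatibility = user_factors[user] * item_factors[:, movie]
print(compatibility, compatibility.sum())
```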
A final example of providing users with the rationale behind a prediction is email routing in customer contact centres. By performing a "what-if" analysis, we can locally mimic the behaviour of our complex machine learning algorithm and display, in natural language, the rules that led to the routing that was applied.
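A plausible sketch of this kind of "what-if" analysis is given below: a black-box router is probed with perturbed copies of a single email and a shallow decision tree is fitted to its answers, producing rules that can be verbalised for the agent. The black-box model, feature names and perturbation scheme are all illustrative assumptions, not the system described above.

```python
# A local-surrogate sketch: probe a black-box routing model around one
# input and fit a small, human-readable tree to its responses.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(0)

# Stand-in for the complex routing model (toy features: urgency, topic scores).
X_train = rng.random((500, 3))
y_train = (X_train[:, 0] + 0.5 * X_train[:, 2] > 0.8).astype(int)  # toy routing rule
black_box = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# "What-if" probing around one incoming email.
email = np.array([0.7, 0.2, 0.4])
perturbations = email + rng.normal(scale=0.1, size=(200, 3))
answers = black_box.predict(perturbations)

# Local surrogate: a shallow tree whose rules can be turned into natural language.
surrogate = DecisionTreeClassifier(max_depth=2).fit(perturbations, answers)
print(export_text(surrogate, feature_names=["urgency", "billing_topic", "tech_topic"]))
```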
Bi-directional feedback between humans and machines is important to increase users' confidence and is a useful way to improve performance. One of the most important success factors is the design of the graphical user interface that translates and adapts these algorithmic mechanisms into efficient and effective communication tools. A multi-disciplinary approach is required, one that takes ergonomic, cognitive and psychological factors into account. Our collaborative touch-based table, DISCO, is a good example of such an implementation and will be the subject of the next blog post.