Making robots part of everyday life
AI for robotics research at NAVER LABS Europe is driven by the ambition to build foundation models (FMs) capable of powering versatile, real-world robotic systems. These models are conceived to generalize across diverse tasks and environments, enabling robots to seamlessly interact with, navigate and manipulate their surroundings. The approach is structured around three complementary axes: (1) developing new architectures that can effectively learn and transfer skills, (2) creating training regimes that exploit synergies between tasks and (3) devising evaluation protocols that measure performance in realistic, dynamic settings.
Progress along these axes draws on multidisciplinary expertise, combining deep learning research, robotic control, computer vision and natural language understanding. This integrated skill set allows us to tackle both perception and action, ensuring that FMs can interpret complex environments while executing precise, context-aware behaviours. By leveraging such competencies, we aim to move beyond task-specific systems towards models that exhibit adaptability, robustness and the ability to reason across domains.
Through this structured exploration, NAVER LABS Europe positions itself at the forefront of FM-driven robotics, bridging theoretical AI advances with the practical demands of embodied agents. The result is a research direction that not only pushes technical boundaries but also lays the groundwork for robots that can operate autonomously and effectively in the open world, delivering tangible benefits in guidance, assistance and service applications.
Vision
Perception to help robots understand and interact with the environment.
Visual perception is a necessary part of any intelligent system that is meant to interact with the world. Robots need to perceive the structure, objects and people in their environment to better understand the world and perform the tasks they are assigned. We combine expertise in visual representation learning, self-supervised learning and human behaviour understanding to build AI components that help robots understand and navigate their 3D environment, detect and interact with surrounding objects and people, and continuously adapt when deployed in new environments.
3D Foundation Models
Unified 3D vision models, such as our DUSt3R/MASt3R family, that integrate tasks like depth estimation, camera pose and 3D reconstruction into a single transformer-based architecture, dramatically simplifying scene understanding.
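To make this concrete, below is a minimal sketch of how such a unified model is queried, based on the public DUSt3R repository (github.com/naver/dust3r): a single forward pipeline over uncalibrated images, followed by one global alignment, yields depth, poses and a 3D reconstruction together. Module paths and the checkpoint name follow the repository's README but may differ between releases.

```python
# Minimal DUSt3R sketch: one pipeline yields depth, poses and 3D points.
# Based on the public naver/dust3r repo; names may vary across versions.
import torch
from dust3r.model import AsymmetricCroCo3DStereo
from dust3r.utils.image import load_images
from dust3r.image_pairs import make_pairs
from dust3r.inference import inference
from dust3r.cloud_opt import global_aligner, GlobalAlignerMode

device = "cuda" if torch.cuda.is_available() else "cpu"
model = AsymmetricCroCo3DStereo.from_pretrained(
    "naver/DUSt3R_ViTLarge_BaseDecoder_512_dpt"  # checkpoint name from the README
).to(device)

# Two or more uncalibrated RGB images of the same scene.
images = load_images(["scene_a.jpg", "scene_b.jpg"], size=512)
pairs = make_pairs(images, scene_graph="complete", symmetrize=True)
output = inference(pairs, model, device, batch_size=1)

# One global optimization aligns all pairwise predictions, giving depth maps,
# camera poses and a consistent point cloud without separate task-specific models.
scene = global_aligner(output, device=device, mode=GlobalAlignerMode.PointCloudOptimizer)
scene.compute_global_alignment(init="mst", niter=300, schedule="cosine", lr=0.01)
depth_maps = scene.get_depthmaps()
camera_poses = scene.get_im_poses()
point_clouds = scene.get_pts3d()
```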
Human Centric Computer Vision
Visual models that reliably perceive and predict human pose, shape and activity from images or video, enabling safer, more natural human–robot interaction and 3D human generation.
Lifelong Learning for Visual Representation
Building visual perception systems that adapt continuously to new environments and tasks without forgetting past knowledge, whilst unifying encoders for effective embodied AI.
Visual Localization
Advancing robust camera pose estimation by matching images to 3D maps, and supporting the community with toolkits and datasets for location-based applications such as self-driving cars, autonomous robots and AR/VR.
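As an illustration of the geometric core of this task, the generic sketch below recovers a camera pose from 2D-3D matches with a PnP solver inside RANSAC. It uses OpenCV rather than any NAVER LABS toolkit, and the feature-matching step is replaced by synthetic correspondences so the example is self-contained.

```python
# Generic visual-localization sketch (not a NAVER LABS toolkit):
# recover the camera pose from 2D-3D matches via PnP + RANSAC.
import numpy as np
import cv2

# Synthetic setup: 3D map points and a ground-truth camera observing them.
rng = np.random.default_rng(0)
points_3d = rng.uniform(-1.0, 1.0, (100, 3)) + np.array([0.0, 0.0, 5.0])
K = np.array([[600.0, 0.0, 320.0],
              [0.0, 600.0, 240.0],
              [0.0, 0.0, 1.0]])          # pinhole intrinsics (illustrative)
rvec_gt = np.array([0.05, -0.02, 0.01])  # ground-truth rotation (axis-angle)
tvec_gt = np.array([0.10, -0.05, 0.20])  # ground-truth translation
points_2d, _ = cv2.projectPoints(points_3d, rvec_gt, tvec_gt, K, None)
points_2d = points_2d.reshape(-1, 2)

# Localization step: in a real system, points_2d/points_3d come from matching
# image features against the 3D map; here we solve for the pose directly.
ok, rvec, tvec, inliers = cv2.solvePnPRansac(
    points_3d, points_2d, K, None, reprojectionError=3.0)
R, _ = cv2.Rodrigues(rvec)               # axis-angle -> rotation matrix
camera_center = (-R.T @ tvec).ravel()    # camera position in map coordinates
print(ok, len(inliers), camera_center)
```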
Action
Providing embodied agents with sequential decision-making capabilities to safely execute complex tasks in dynamic environments.
To make robots autonomous in real-world, everyday spaces, they must be able to learn, from their interactions within these spaces, how best to execute tasks specified by non-expert users in a safe and reliable way. Doing so requires sequential decision-making skills that combine machine learning, adaptive planning and control in uncertain environments, as well as the ability to solve hard combinatorial optimization problems. Our research combines expertise in reinforcement learning, computer vision, robotic control, sim2real transfer, large multimodal foundation models and neural combinatorial optimization to build AI-based architectures and algorithms that improve robot autonomy and robustness when completing complex everyday tasks in constantly changing environments.
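As a minimal illustration of the sequential decision-making loop this refers to, the sketch below runs a placeholder agent against a Gymnasium environment; any learning algorithm would slot in where the action is chosen and where the transition is recorded. The environment name is a stand-in, not one of our robot setups.

```python
# Minimal agent-environment interaction loop (Gymnasium API), the substrate on
# which reinforcement-learning agents are trained. Random policy as placeholder.
import gymnasium as gym

env = gym.make("CartPole-v1")  # stand-in task; a robot environment would go here
obs, info = env.reset(seed=0)
episode_return = 0.0
done = False
while not done:
    action = env.action_space.sample()  # a learned policy would act here
    obs, reward, terminated, truncated, info = env.step(action)
    episode_return += reward
    # a learner would store the transition and update its policy here
    done = terminated or truncated
env.close()
print(f"episode return: {episode_return}")
```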
Neural Combinatorial Optimization for Robot Fleet Management
Creating solutions to challenging combinatorial optimization problems for the coordination and management of robot fleets operating in real environments, and of the services they deliver.
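For flavour, here is a self-contained toy version of the kind of problem involved: assigning delivery tasks to a small fleet and ordering each robot's route with a nearest-neighbour heuristic. All names and numbers are illustrative; the research itself targets learned (neural) and exact solvers for far harder variants.

```python
# Toy fleet-routing sketch: greedily assign tasks to robots, then order each
# robot's tasks with a nearest-neighbour heuristic. Purely illustrative.
import math

def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

robots = {"r1": (0.0, 0.0), "r2": (10.0, 10.0)}            # start positions
tasks = [(1.0, 2.0), (9.0, 8.0), (2.0, 1.0), (8.0, 9.0)]   # delivery points

# Assignment: each task goes to the robot whose start position is closest.
assignment = {name: [] for name in robots}
for t in tasks:
    closest = min(robots, key=lambda name: dist(robots[name], t))
    assignment[closest].append(t)

# Routing: nearest-neighbour ordering of each robot's assigned tasks.
for name, todo in assignment.items():
    pos, route, remaining = robots[name], [], list(todo)
    while remaining:
        nxt = min(remaining, key=lambda t: dist(pos, t))
        remaining.remove(nxt)
        route.append(nxt)
        pos = nxt
    print(name, route)
```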
Foundation Models for Robot Navigation
End-to-end foundation models that enable robots to navigate diverse real-world environments without prior maps or special setup by modelling realistic agent behaviour and dynamics during learning.
Interaction
Equipping robots to interact safely with humans, other robots and systems.
For a robot to be useful it must be able to represent its knowledge of the world, share what it learns and interact with other agents, in particular humans. Our research in HRI, NLP, speech, IR, data management and low-code/no-code programming is targeted at building AI components that help robots perform complex real-world tasks. These components help robots interact safely with humans, their physical environment, and other robots and systems, and represent, update and share their world knowledge.
Multimodal NLP for Robotics
Developing multimodal natural language processing techniques that enable robots to robustly understand and generate human speech and text, making human–robot communication more natural, efficient and user-friendly.
Socially Aware Robot Navigation
Designing general navigation policies that let robots move naturally and safely in human-shared spaces by understanding social context and human behaviour.
LLMs for Robotics
Exploring how LLMs and multimodal extensions can enhance robotic services by bridging NLU, reasoning and task planning with real robot behaviour in complex real-world environments.
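As a sketch of this bridging pattern, the snippet below asks an LLM to turn a natural-language request into a sequence of whitelisted robot primitives, then validates the plan before execution. `call_llm` and the skill names are hypothetical placeholders, not a NAVER LABS API.

```python
# Hypothetical sketch of LLM-based task planning: the model maps a user request
# to a plan over a fixed vocabulary of robot skills, validated before execution.
# call_llm and the skill names are placeholders, not a real API.
import json

SKILLS = {"navigate_to", "pick", "place", "say"}

PROMPT = """You control a service robot with skills: navigate_to(place),
pick(object), place(object, place), say(text).
Return a JSON list of steps, e.g. [{{"skill": "pick", "args": ["cup"]}}].
Request: {request}"""

def call_llm(prompt: str) -> str:
    # Placeholder: in practice, a call to a hosted or local LLM goes here.
    return json.dumps([
        {"skill": "navigate_to", "args": ["kitchen"]},
        {"skill": "pick", "args": ["cup"]},
        {"skill": "navigate_to", "args": ["desk"]},
        {"skill": "place", "args": ["cup", "desk"]},
    ])

def plan(request: str) -> list:
    steps = json.loads(call_llm(PROMPT.format(request=request)))
    # Validate before touching the robot: only whitelisted skills are allowed.
    for step in steps:
        if step["skill"] not in SKILLS:
            raise ValueError(f"unknown skill: {step['skill']}")
    return steps

for step in plan("Bring a cup to my desk"):
    print(step["skill"], step["args"])
```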