The computer vision team conducts research in a wide range of areas, including visual search, scene parsing, human sensing, action recognition, pose estimation and lifelong learning.





Our work covers the spectrum from unsupervised to supervised approaches, and from very deep architectures to very compact ones. We’re excited about the promise of big data to bring large performance gains to our algorithms, but we’re equally passionate about the challenge of working in data-scarce and low-power scenarios. Our driving goal is to deliver ambient visual intelligence to our users, whether in autonomous driving, in robotics, through phone cameras or through any other visual means of reaching people wherever they may be.

Our research combines skills in machine learning, pattern recognition and computer vision, and we work on multi-disciplinary problems with teams specialised in natural language processing, user experience, ethnography, design and more. Our research efforts may be long-term in focus or may tackle problems with concrete and immediate relevance to NAVER products and services. We’re very active in the computer vision community, and our research is often pursued in collaboration with external partners from government and academia.

Learning Visual Representations with Caption Annotations
A new modeling task masks tokens in image captions to enable mid-sized sets of captioned images to rival large-scale labelled image sets for learning generic visual representations. Blog article by Diane Larlus
DOPE
A novel, efficient model for whole-body 3D pose estimation (including bodies, hands and faces) that is trained by mimicking the output of hand-, body- and face-pose experts. Blog article by Philippe Weinzaepfel
The short memory of artificial neural networks
A research overview of current work in lifelong learning. Blog article by Riccardo Volpi
A first-of-its-kind architecture that predicts, from a single image, how a robot can pick up objects from within any scene. It could revolutionize applications in AR/VR and robotics. Blog article by Gregory Rogez
NAVER LABS Europe is leading a chair on Lifelong Representation Learning as part of the MIAI institute (Multidisciplinary Institute in Artificial Intelligence)
Learning Visual Representations with Caption Annotations (European Conference on Computer Vision, ECCV 2020 paper)
