Computer Vision research - NAVER LABS Europe


The computer vision team conducts research in a wide range of areas, including visual search, scene parsing, human sensing, action recognition, pose estimation and lifelong learning.



  • The team has 3 papers accepted at CVPR 2021
  • Paper accepted at ICLR 2021 on progressive skeletonization – network pruning at initialization (on OpenReview)
  • Paper with IRI & Univ. Aalto on multi-finger grasping accepted at ICRA 2021 (arXiv preprint)





Our work covers the spectrum from unsupervised to supervised approaches, and from very deep architectures to very compact ones. We’re excited about the promise of big data to bring big performance gains to our algorithms, but also passionate about the challenge of working in data-scarce and low-power scenarios. Our driving goal is to use our research to deliver ambient visual intelligence to our users through autonomous driving, robotics, phone cameras and any other visual means, reaching people wherever they may be.

Our research combines skills in machine learning, pattern recognition and computer vision, and we work on multi-disciplinary problems with teams specialised in natural language processing, user experience, ethnography, design and more. Our research efforts may be either long-term in focus, or may tackle problems with concrete and immediate relevance to NAVER products and services. We’re very active in the computer vision community and our research is often pursued in collaboration with external partners from government and academia.

Learning Visual Representations with Caption Annotations
A new modeling task masks tokens in image captions to enable mid-sized sets of captioned images to rival large-scale labelled image sets for learning generic visual representations. Blog article by Diane Larlus
DOPE
A novel efficient model for whole-body 3D pose estimation (including bodies, hands and faces), that is trained by mimicking the output of hand-, body- and face-pose experts. Blog article by Philippe Weinzaepfel
The short memory of artificial neural networks
A research overview of current work in lifelong learning. Blog article by Riccardo Volpi
A first-of-its-kind architecture that predicts, from a single image, how a robot can pick up objects from within any scene. It could revolutionize applications in AR/VR and robotics. Blog article by Gregory Rogez
NAVER LABS Europe is leading a chair on Lifelong Representation Learning as part of the MIAI institute (Multidisciplinary Institute in Artificial Intelligence)
Learning Visual Representations with Caption Annotations (European Conference on Computer Vision, ECCV 2020 paper)

Recent Publications:

Computer Vision team:

Jon Almazan
Fabien Baradel
Juliette Bertrand (PhD candidate)
Pau De Jorge (PhD candidate)
Ginger Delmas
Diane Larlus
Thomas Lucas
Jerome Pouyadou
Gregory Rogez (Group lead)
Mert Bulent Sariyildiz
Anilkumar Swamy
Vadim Tschernezki (PhD candidate)
Riccardo Volpi

Related Content