COMPUTER VISION
The computer vision team conducts research in a wide range of areas, including visual search, scene parsing, human sensing, action recognition, pose estimation and lifelong learning.
Highlights
2022
2021
- The team has 2 papers at 3DV 2021
- Co-organizing the ‘ImageNet: past, present and future’ workshop at NeurIPS 2021
- Paper accepted at ICCV 2021, Concept Generalization in Visual Representation Learning
- The team has 3 papers accepted at CVPR 2021 including an oral, and a findings paper in the Continual Learning workshop.
- Paper accepted at ICLR 2021 on progressive skeletonization – network pruning at initialization (on OpenReview)
- Paper with IRI & Univ. Aalto on multi-finger grasping accepted at ICRA 2021 (arXiv preprint)
- Philippe Weinzaepfel & Grégory Rogez have a paper on understanding human action out-of-context and the Mimetics dataset published in IJCV. See Blog.
- Co-organizers of the PAISS Summer School 2021
- Diane Larlus and Yannis Kalantidis are serving as Area Chairs for CVPR 2021 and ICCV 2021.
- We have 2 papers at WACV 2021.
2020
- The team has a paper at 3DV 2020
- The team has 3 papers accepted at NeurIPS 2020
- Winner of the Koenderink test of time paper award at ECCV 2020
- The team has 4 papers accepted at ECCV 2020
- Grégory Rogez is co-editor of the IJCV Special Issue on Human Pose, Motion, Activities and Shape in 3D
- 1 paper (oral) accepted at CVPR 2020. See Blog.
- Diane Larlus is serving as Area Chair for CVPR 2020 and ECCV 2020
- The paper by César de Souza et al. on “Generating Human Action Videos by Coupling 3D Game Engines and Probabilistic Graphical Models” was accepted for publication in IJCV.
- Yannis Kalantidis is organizing the Computer Vision for Agriculture (CV4A) workshop at ICLR 2020.
- Diane Larlus is serving as Industrial Liaison Chair for ECCV 2020

Our work covers the spectrum from unsupervised to supervised approaches, and from very deep architectures to very compact ones. We’re excited about the promise of big data to bring big performance gains to our algorithms but also passionate about the challenge of working in data-scarce and low-power scenarios. Our driving goal is to use our research to deliver ambient visual intelligence to our users in autonomous driving and robotics, via phone cameras, and through any other visual means of reaching people wherever they may be.
Our research combines skills in machine learning, pattern recognition and computer vision, and we work on multi-disciplinary problems with teams specialised in natural language processing, user experience, ethnography, design and more. Our research efforts may be either long-term in focus, or may tackle problems with concrete and immediate relevance to NAVER products and services. We’re very active in the computer vision community and our research is often pursued in collaboration with external partners from government and academia.
Recent Publications:
- On the road to online adaptation for semantic image segmentation, Riccardo Volpi, Pau De Jorge, Gabriela Csurka Khedari, Diane Larlus, CVPR, New Orleans, Louisiana USA, 19-24 June, 2022
- Deep visual geo-localization benchmark (oral), Gabriele Berton, Riccardo Mereu, Gabriele Trivigno, Carlo Masone, Gabriela Csurka Khedari, Torsten Sattler, Barbara Caputo, CVPR, New Orleans, Louisiana USA, 19-24 June, 2022
- PUMP: pyramidal and uniqueness matching priors for unsupervised learning of local features, Jérome Revaud, Vincent Leroy, Philippe Weinzaepfel, Boris Chidlovskii, CVPR, New Orleans, Louisiana USA, 19-24 June, 2022
- An in-depth experimental study of sensor usage and visual reasoning of robots navigating in real environments, Assem Sadek, Guillaume Bono, Boris Chidlovskii, Christian Wolf, ICRA, Philadelphia, USA, 23-27 May, 2022
- ARTEMIS: attention-based retrieval with text-explicit matching implicit similarity, Ginger Delmas, Rafael Sampaio de Rezende, Gabriela Csurka, Diane Larlus, ICLR, virtual-only event, 25-29 April, 2022.
- Learning super-features for image retrieval, Philippe Weinzaepfel, Thomas Lucas, Diane Larlus, Yannis Kalantidis, ICLR, virtual-only event, 25-29 April, 2022.
- Learning with label noise for image retrieval by selecting interactions, Sarah Ibrahimi, Arnaud Sors, Rafael Sampaio de Rezende and Stéphane Clinchant, WACV, Waikoloa, Hawaii, 4-8 January, 2022
- Concept generalization in visual representation learning, Mert Bulent Sariyildiz, Yannis Kalantidis, Diane Larlus, Karteek Alahari, International Conference on Computer Vision (ICCV), virtual-only conference, 11-17 October, 2021
- Probabilistic embeddings for cross-modal retrieval, Sanghyuk Chun, Seong Joon Oh, Rafael Sampaio de Rezende, Yannis Kalantidis, Diane Larlus, Conference on Computer Vision and Pattern Recognition (CVPR), virtual-only conference, 19-25 June, 2021
- Continual adaptation of visual representations via domain randomization and meta-learning (oral), Riccardo Volpi, Diane Larlus, Gregory Rogez, Conference on Computer Vision and Pattern Recognition (CVPR), virtual-only conference, 19-25 June, 2021
- Large-scale localization datasets in crowded indoor spaces, Donghwan Lee, Soohyun Ryu, Suyong Yeon, Yonghan Lee, Deokhwa Kim, Cheolho Han, Yohann Cabon, Philippe Weinzaepfel, Nicolas Guérin, Gabriela Csurka Khedari, Martin Humenberger, Conference on Computer Vision and Pattern Recognition (CVPR), virtual-only conference, 19-25 June, 2021
- Multi-FinGAN: generative coarse-to-fine sampling of multi-finger grasps, Jens Lundell, Enric Corona, Tran Nguyen Le, Francesco Verdoja, Philippe Weinzaepfel, Gregory Rogez, Francesc Moreno-Noguer, Ville Kyrki, IEEE International Conference on Robotics and Automation (ICRA), hybrid conference, Xi’an, China, 30 May-5 June, 2021
- Progressive skeletonization: trimming more fat from a network at initialization, Pau de Jorge, Amartya Sanyal, Harkirat Behl, Philip Torr, Gregory Rogez, Puneet Dokania, Ninth International Conference on Learning Representations (ICLR), virtual-only conference, 3-7 May, 2021
- Mimetics: Towards understanding human action out-of-context, Philippe Weinzaepfel and Grégory Rogez, International Journal of Computer Vision, volume 129, pages 1675–1690, 2021.
- Hard negative mixing for contrastive learning, Yannis Kalantidis, Mert Bulent Sariyildiz, Noé Pion, Philippe Weinzaepfel, Diane Larlus, Thirty-fourth Conference on Neural Information Processing Systems (NeurIPS), virtual-only conference, 6-12 December, 2020
- SuperLoss: a generic loss for robust curriculum learning, Thibault Castells, Philippe Weinzaepfel, Jérome Revaud, Thirty-fourth Conference on Neural Information Processing Systems (NeurIPS), virtual-only conference, 6-12 December, 2020
- DOPE: distillation of part experts for whole-body 3D pose estimation in the wild, Philippe Weinzaepfel, Romain Brégier, Hadrien Combaluzier, Vincent Leroy, Gregory Rogez, European Conference on Computer Vision (ECCV), Glasgow, UK (virtual event), 23-28 August, 2020
- Learning visual representations with caption annotations, Mert Bulent Sariyildiz, Julien Perez, Diane Larlus, European Conference on Computer Vision (ECCV), Glasgow, UK (virtual event), 23-28 August, 2020
- Key protected classification for collaborative learning, Mert Bulent Sariyildiz, Ramazan Gokberk Cinbis, Erman Ayday, Pattern Recognition, Volume 104, August 2020, 107327
- GanHand: predicting human grasp affordances in multi-object scenes, Enric Corona, Albert Pumarola, Guillem Alenyà, Francesc Moreno-Noguer, Gregory Rogez, Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, Washington, USA, 16-18 June, 2020
- Progressive skeletonization: trimming more fat from a network at initialization, Pau De Jorge, Amartya Sanyal, Harkirat Behl, Philip Torr, Gregory Rogez, Puneet Dokania, published on arXiv.org
- Robust image retrieval-based visual localization using kapture, Martin Humenberger, Yohann Cabon, Nicolas Guerin, Julien Morat, Jérome Revaud, Philippe Rerole, Noé Pion, Cesar Roberto De Souza, Vincent Leroy, Gabriela Csurka Khedari, published on arXiv.org