3D VISION
In a connected world of people, robots and self-driving vehicles, we naturally need to have a good understanding of the 3D world we live in.
Highlights
2020
New Image Retrieval for Visual Localization Benchmark online.
New Kapture toolbox online.
4 outstanding reviewer awards: 2xCVPR (Martin Humenberger, Jérôme Revaud), 1xNeurIPS (Jérôme Revaud), 1x3DV (Martin Humenberger)
2 papers accepted at 3DV 2020:
- Benchmarking Image Retrieval for Visual Localization by Noé Pion, Martin Humenberger, Gabriela Csurka, Yohann Cabon, Torsten Sattler
- SMPLy Benchmarking 3D Human Pose Estimation in the Wild by Vincent Leroy, Philippe Weinzaepfel, Romain Brégier, Hadrien Combaluzier, Gregory Rogez
2 papers accepted at NeurIPS 2020:
- SuperLoss: A Generic Loss for Robust Curriculum Learning by Thibault Castells, Philippe Weinzaepfel, Jérôme Revaud
- Hard Negative Mixing for Contrastive Learning by Yannis Kalantidis, Mert Bulent Sariyildiz, Noé Pion, Philippe Weinzaepfel, Diane Larlus
Volume Sweeping: Learning Photoconsistency for Multi-View Shape Reconstruction by Vincent Leroy, Jean-Sebastien Franco, Edmond Boyer was accepted at IJCV
2nd place in the Long-Term Visual Localization under Changing Conditions challenge held at ECCV20 (paper).
Kapture: Release of a unified data format and processing pipeline for structure from motion and visual localization (paper).
Adversarial Transfer of Pose Estimation Regression by Boris Chidlovskii, Assem Sadek accepted at the ECCV 2020 TASK-CV Workshop
2 papers accepted for ECCV20:
- DOPE: Distillation Of Part Experts for whole-body 3D pose estimation in the wild by Philippe Weinzaepfel, Romain Brégier, Hadrien Combaluzier, Vincent Leroy, Gregory Rogez
- Measuring Generalisation to Unseen Viewpoints, Articulations, Shapes and Objects for 3D Hand Pose Estimation under Hand-Object Interaction (various authors, among them Philippe Weinzaepfel, Romain Brégier, Gregory Rogez).
Paper on Self-Supervised Attention Learning for Depth and Ego-motion Estimation by Assem Sadek and Boris Chidlovskii accepted for IROS20. (pdf)
The IPIN 2019 Indoor Localisation Competition – Description and Results by F. Potorti, B. Chidlovskii, L. Antsfeld, et al. accepted for publication in the IEEE Access journal
1st, 2nd and 4th places in the CVPR Long-Term Visual Localization Challenge.
Paper on Estimating Low-Rank Region Likelihood Maps by Gabriela Csurka, Zoltan Kato, Andor Juhasz, Martin Humenberger accepted for CVPR20. (pdf)
New version of the popular synthetic dataset Virtual KITTI available.
2019
- 1st Workshop on AI for Robotics at NAVER LABS Europe on November 28/29 (more info, videos of the talks, photos, etc. can be found here)
- Paper R2D2: Reliable and Repeatable Detector and Descriptor accepted as oral at NeurIPS 2019 (pdf, code).
- New paper on leveraging semantic segmentation for VSLAM accepted at the Deep Learning for VSLAM workshop at ICCV (pdf, code).
- 2 papers accepted for ICCV 2019 (NAVER LABS total: 4): Learning with Average Precision: Training Image Retrieval with a Listwise Loss (PDF) and Fine-Grained Action Retrieval Through Multiple Parts-of-Speech Embeddings
- New paper on WiFi-based localization to appear at IPIN 2019
- New feature detector R2D2 wins Visual Localization Challenge at CVPR 2019 (paper on arxiv and publication database)
- Paper on Visual Localization at CVPR19
- Tutorial on structure-from-motion
- arXiv pre-print paper: “From handcrafted to deep local invariant features”
- Embedded Vision Workshop at CVPR19
2018
- Paper: “Vision-based autonomous feeding robot” at OAGM18
Results on the tasks related to this understanding, such as 3D reconstruction, mapping and visual localization, have been improving steadily.
To reconstruct the geometry of the world as accurately as possible, it’s common practice to use sensors such as LIDAR, radar and, of course, cameras. This is because geometry is well understood and many applications need to ‘measure the world’. However, progress on purely geometric approaches to 3D vision tasks has slowed, and the methods that exist today are not sufficiently robust for everyday situations such as changing environments and weather conditions.
One reason for this lack of robustness is that not everything can be measured or described in a way that a computer can reliably detect. Furthermore, even if a scene were perfectly reconstructed, there’s no guarantee that a computer would understand, analyse and interpret it correctly.
A popular strategy in the computer vision community to overcome these problems is to use machine learning techniques rather than hand-crafted approaches, and their success has proven this to be a good choice: there have been outstanding results in tasks such as image categorization, image retrieval and object detection.
However, geometric properties constitute a significant part of the world and we believe they should not be neglected entirely in favour of learning.
Our strategy therefore combines both approaches: We want to learn what we cannot measure.
The research focus of the 3D Vision team lies in the design of methods that combine geometry and learning-based approaches to solve specific real-world challenges such as visual localization, camera pose estimation and 3D reconstruction. Examples of our target applications are robot navigation, indoor mapping, augmented reality (AR) and, more generally speaking, systems that enable ambient intelligence in day-to-day life.
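As a rough illustration of how learning and geometry can be combined for camera pose estimation, the sketch below estimates a camera pose from 2D–3D correspondences with PnP + RANSAC using OpenCV. The correspondences are assumed to come from a learned matcher (for instance, a learned local feature such as R2D2 matched against an SfM map); the `estimate_pose` helper and the synthetic example data are purely illustrative and not part of any NAVER LABS release.

```python
# Minimal sketch: combine learned 2D-3D matches with classical geometry
# (PnP + RANSAC) to estimate a camera pose. Assumes OpenCV and NumPy.
import cv2
import numpy as np


def estimate_pose(points_3d, points_2d, K):
    """Estimate a camera pose from 2D-3D correspondences.

    points_3d : (N, 3) map points, e.g. from an SfM reconstruction
    points_2d : (N, 2) image keypoints matched to those map points,
                e.g. produced by a learned local feature such as R2D2
    K         : (3, 3) camera intrinsics
    Returns (R, t) mapping world to camera coordinates, or None if the
    geometric verification fails.
    """
    if len(points_3d) < 4:  # PnP needs at least 4 correspondences
        return None
    success, rvec, tvec, inliers = cv2.solvePnPRansac(
        np.asarray(points_3d, dtype=np.float64),
        np.asarray(points_2d, dtype=np.float64),
        np.asarray(K, dtype=np.float64),
        distCoeffs=None,
        reprojectionError=3.0,   # inlier threshold in pixels
        iterationsCount=1000,
    )
    if not success or inliers is None:
        return None
    R, _ = cv2.Rodrigues(rvec)   # axis-angle -> rotation matrix
    return R, tvec


if __name__ == "__main__":
    # Synthetic example: points in front of a camera at the origin.
    K = np.array([[800.0, 0.0, 320.0],
                  [0.0, 800.0, 240.0],
                  [0.0, 0.0, 1.0]])
    pts3d = np.random.uniform([-1, -1, 4], [1, 1, 8], size=(50, 3))
    proj = (K @ pts3d.T).T
    pts2d = proj[:, :2] / proj[:, 2:3]
    pose = estimate_pose(pts3d, pts2d, K)
    if pose is not None:
        R, t = pose
        print("recovered rotation:\n", R)
        print("recovered translation:", t.ravel())
```

In this division of labour, the learned component supplies correspondences that remain reliable under difficult conditions, while the geometric solver turns them into an interpretable, verifiable pose estimate.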