Data, code and models released by NAVER LABS Europe


SMPLy benchmarking 3D human pose estimation in the wild.

Benchmark associated with the 3DV2020 paper of the same name.

Virtual KITTI 2

Datasets of synthetic images based on KITTI, for training and testing (Virtual KITTI versions 2 and 1.3.1).

An updated photo-realistic synthetic video dataset for training and evaluating computer vision models on several video understanding tasks: object detection and multi-object tracking, scene-level and instance-level semantic segmentation, optical flow, and depth estimation.


Motion-Augmented RGB Stream for Action Recognition.

A strategy to learn a stream that takes only RGB frames as input but leverages both appearance and motion information from them.

To Annotate or Not?

Domain shift prediction

A method to predict the drop in accuracy of a trained model.


Understanding human action recognition out of context.

713 video clips from YouTube of mimed actions for a subset of 50 classes from the Kinetics400 dataset.

Mallscape datasets

A system that correctly detects when places have changed to automatically update complex indoor maps. Datasets available for research.

The datasets address all possible POI change scenarios to automatically update complex indoor maps.


Reliable and Repeatable Detector and Descriptor.

Benchmarked on classic feature matching benchmarks (HPatches) and challenging visual localization datasets.


A Functional, Imperative and Logical programming language for data annotation and augmentation.

An open-source programming language to help create, annotate and augment corpora and data.

Virtual gallery dataset

Synthetic dataset simulating a realistic scenario: the training images are captured by a robot equipped with 6 cameras, and the test images are photos taken by visitors.

Targets challenges such as varying lighting conditions and different occlusion levels for tasks such as depth estimation, instance segmentation and visual localization.

Aspect Based Sentiment Analysis (ABSA) dataset

Manually annotated ABSA dataset from Foursquare comments.

585 samples (1006 sentences) randomly selected and annotated with the SemEval2016 annotation guidelines for the restaurant domain.


A lightweight library to deal with 3D rotations in PyTorch.

Theoretical and experimental findings to improve regression applications.

Deep image retrieval

End-to-end learning of deep visual representations for image retrieval.

The repository contains models and evaluation scripts for the papers ‘End-to-end Learning of Deep Visual Representations for Image Retrieval’ and ‘Learning with Average Precision: Training Image Retrieval with a Listwise Loss’.

PHAV: Procedural Human Action Videos

A diverse, realistic and physically plausible dataset of human action videos.

Contains 39,982 videos, with more than 1,000 examples for each of the 35 action categories.
