Data, code and models released by NAVER LABS Europe


Whole-body human mesh recovery of multiple persons from a single image.

A simple yet effective single-shot method to detect multiple people in an image and estimate their pose, body shape and expression. Training and demo code.


Benchmarking Object-agnostic Hand-Object 3D Reconstruction

The SHOWMe dataset comprises 96 videos with their associated high-quality textured meshes of a hand holding an object.


A multi-subject 4D dataset of human motion sequences in varying outfits exhibiting large displacements.

Collaboration with INRIA.


Correcting 3D human poses with natural language.

The PoseFix dataset consists of several thousand paired 3D poses and corresponding text feedback that describes how the source pose needs to be modified to obtain the target pose.


A novel, plug-and-play model for human 3D shape estimation in videos.

Model trained by mimicking the BERT algorithm from the natural language processing community.


Quantization-based 3D human motion generation and forecasting.

An auto-regressive transformer-based approach which internally compresses human motion into quantized latent sequences.


3D human poses from natural language.

A dataset pairing 3D human poses with both automatically generated and human-written descriptions.


Distillation of Part Experts for whole-body 3D pose estimation in the wild.

A novel, efficient model for whole-body 3D pose estimation (including bodies, hands and faces), trained by mimicking the output of hand-, body- and face-pose experts.

LCR-Net release V2.0

Localization Classification Regression for human pose.

Improved pose proposals integration for multi-person 2D and 3D pose detection in natural images.


SMPLy benchmarking 3D human pose estimation in the wild.

Benchmark associated with the 3DV 2020 paper of the same name.


Motion-Augmented RGB Stream for Action Recognition.

A strategy to learn a stream that takes only RGB frames as input but leverages both appearance and motion information from them.


Understanding human action recognition out of context.

713 YouTube video clips of mimed actions, covering a subset of 50 classes from the Kinetics400 dataset.

PHAV: Procedural Human Action Videos

A diverse, realistic and physically plausible dataset of human action videos.

Contains 39,982 videos, with more than 1,000 examples for each of 35 action categories.
