MARS: Motion-Augmented RGB Stream for Action Recognition
Code
The test code and models are released under the MIT license.
Both the code and the models are available on GitHub.
Publication and blog
The CVPR 2019 publication related to the code and models is MARS: Motion-Augmented RGB Stream for Action Recognition [PDF]
Authors: Nieves Crasto¹, Philippe Weinzaepfel¹, Karteek Alahari², Cordelia Schmid² [¹NAVER LABS Europe, ²Inria]
For a different perspective, there is also a quick-read blog article on MARS.
Information related to code and models
Most state-of-the-art methods for action recognition consist of a two-stream architecture with 3D convolutions: an appearance stream for RGB frames and a motion stream for optical flow frames. Although combining flow with RGB improves the performance, the cost of computing accurate optical flow is high and increases action recognition latency. This limits the usage of two-stream approaches in real-world applications requiring low latency. In this paper, we introduce two learning approaches to train a standard 3D CNN, operating on RGB frames, that mimics the motion stream, and as a result avoids flow computation at test time. First, by minimizing a feature-based loss with respect to the Flow stream, we show that the network reproduces the motion stream with high fidelity. Second, to leverage both appearance and motion information effectively, we train with a linear combination of the feature-based loss and the standard cross-entropy loss for action recognition. We denote the stream trained using this combined loss as Motion-Augmented RGB Stream (MARS). As a single stream, MARS performs better than RGB or Flow alone, for instance with 72.7% accuracy on Kinetics compared to 72.0% and 65.6% with RGB and Flow streams respectively.
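To make the two training objectives concrete, below is a minimal PyTorch-style sketch of the combined loss described above: a standard cross-entropy term on the RGB stream's predictions plus a feature-based (mean-squared-error) term that pushes the RGB stream's features toward those of a frozen, pre-trained Flow stream. All names (MARSLoss, mars_stream, flow_stream) and the value of the weight alpha are illustrative assumptions, not the released code's API; dropping the cross-entropy term recovers the first, pure feature-mimicry approach.

import torch
import torch.nn as nn

class MARSLoss(nn.Module):
    """Hypothetical sketch of the combined MARS training objective:
    cross-entropy on action labels + weighted MSE between the RGB
    stream's features and a frozen Flow stream's features."""

    def __init__(self, alpha: float = 1.0):
        super().__init__()
        self.alpha = alpha                 # weight of the feature-matching term (assumed value)
        self.ce = nn.CrossEntropyLoss()    # standard action-classification loss
        self.mse = nn.MSELoss()            # feature-based (mimicry) loss

    def forward(self, logits, rgb_features, flow_features, labels):
        # detach() keeps the pre-trained Flow stream frozen: no gradient flows into it
        feat_loss = self.mse(rgb_features, flow_features.detach())
        return self.ce(logits, labels) + self.alpha * feat_loss

# Sketch of one training step (names are illustrative):
#   logits, rgb_feats = mars_stream(rgb_clip)    # 3D CNN on RGB frames
#   with torch.no_grad():
#       flow_feats = flow_stream(flow_clip)      # frozen, pre-trained Flow stream
#   loss = MARSLoss(alpha=1.0)(logits, rgb_feats, flow_feats, labels)
# At test time only mars_stream is run, so no optical flow needs to be computed.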
Citation
@inproceedings{crasto2019mars,
  title={{MARS: Motion-Augmented RGB Stream for Action Recognition}},
  author={Crasto, Nieves and Weinzaepfel, Philippe and Alahari, Karteek and Schmid, Cordelia},
  booktitle={CVPR},
  year={2019}
}