MASt3R-SfM: a fully-integrated solution for unconstrained Structure-from-Motion

Published by Philippe Weinzaepfel on 25 March 2025

Bardienus Duisterhof, Lojze Zust, Philippe Weinzaepfel, Vincent Leroy, Yohann Cabon, Jérome Revaud

The 12th International Conference on 3D Vision (3DV), Singapore, 25-28 March, 2025

Structure-from-Motion (SfM), a task aiming at jointly recovering camera poses and 3D geometry of a scene given a set of images, remains a hard problem with many open challenges despite decades of significant progress. The traditional solution for SfM consists of a complex pipeline of minimal solvers that tends to propagate errors and fails when images do not sufficiently overlap, have too little motion, etc. Recent methods have attempted to revisit this paradigm, but we empirically show that they fall short of fixing these core issues. In this paper, we propose instead to build upon a recently released foundation model for 3D vision that can robustly produce local 3D reconstructions and accurate matches. We introduce a low-memory approach to accurately align these local reconstructions in a global coordinate system. We further show that such foundation models can serve as efficient image retrievers without any overhead, reducing the overall complexity from quadratic to linear. Overall, our novel SfM pipeline is simple, scalable, fast and truly unconstrained, i.e., it can handle any collection of images, ordered or not. Extensive experiments on multiple benchmarks show that our method provides steady performance across diverse settings, largely outperforming existing methods in small- and medium-scale settings*.

*This paper received the Best Student Paper Award at 3DV 2025
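
The abstract's point about going from quadratic to linear complexity can be illustrated with a small, generic sketch. It is not the paper's procedure: it assumes each image already has a global descriptor (here a plain list of floats) and simply keeps, for every image, its k most similar images. The names `all_pairs`, `knn_pairs`, `cosine` and the parameter `k` are hypothetical and chosen only for this illustration.

```python
import math
from itertools import combinations


def all_pairs(n):
    """Exhaustive pairing: every image is paired with every other image."""
    return set(combinations(range(n), 2))


def cosine(u, v):
    """Cosine similarity between two non-zero descriptor vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def knn_pairs(descriptors, k):
    """Hypothetical retrieval step: keep only each image's k most similar images.

    Returns a set of unordered pairs (i, j) with i < j, containing at most
    len(descriptors) * k pairs.
    """
    n = len(descriptors)
    pairs = set()
    for i in range(n):
        # Rank all other images by similarity to image i, most similar first.
        ranked = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: cosine(descriptors[i], descriptors[j]),
            reverse=True,
        )
        for j in ranked[:k]:
            pairs.add((min(i, j), max(i, j)))
    return pairs


if __name__ == "__main__":
    # Toy descriptors placed along an arc, standing in for real image features.
    descriptors = [[math.cos(0.1 * i), math.sin(0.1 * i)] for i in range(20)]
    print("exhaustive pairs:", len(all_pairs(len(descriptors))))
    print("retrieved pairs: ", len(knn_pairs(descriptors, k=3)))
```

The exhaustive set grows as n(n-1)/2, whereas the retrieved set is bounded by n times k, so the number of image pairs handed to the expensive pairwise stage grows linearly with the number of images for a fixed k. Note that this toy version still compares every pair of descriptors, which is cheap relative to a pairwise reconstruction; only the count of selected pairs is linear.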

@inproceedings{
duisterhof2025mastrsfm,
title={{MAS}t3R-SfM: a Fully-Integrated Solution for Unconstrained Structure-from-Motion},
author={Bardienus Pieter Duisterhof and Lojze Zust and Philippe Weinzaepfel and Vincent Leroy and Yohann Cabon and Jerome Revaud},
booktitle={International Conference on 3D Vision 2025},
year={2025},
url={https://openreview.net/forum?id=5uw1GRBFoT}
}
