Weatherproofing Retrieval for Localization with Generative AI & Geometric Consistency
Yannis Kalantidis*, Mert Bulent Sariyildiz*, Rafael S. Rezende, Philippe Weinzaepfel, Diane Larlus, Gabriela Csurka
ICLR 2024
* Equal contribution
Figure-1: Ret4Loc in three panels: (left) overview of our experimental validation, (middle) synthetic variants used to extend the training set, and (right) the geometric consistency check used to select those variants. More precisely: (Left) Relative gains in localization accuracy compared to the state of the art (black dot) on 7 outdoor and 1 indoor datasets, achieved by our best retrieval models trained with our method using only real images (Ret4Loc) or real and synthetic images (Ret4Loc+Synth). Axes are in log scale. (Middle) Original images and several of their synthetic variants obtained with different prompts. (Right) Estimated local correspondences between two matching images before and after alteration; synthetic variants that fail this geometric verification are discarded from the training set.
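For illustration, below is a minimal sketch of such a geometric-consistency filter built from off-the-shelf local features and RANSAC (SIFT + homography inlier counting with OpenCV). The exact matcher, features, and thresholds used in the paper may differ; the function name and the `min_inliers` value are placeholders.

```python
import cv2
import numpy as np

def passes_geometric_check(path_original, path_variant, min_inliers=20):
    """Return True if enough RANSAC-verified local correspondences survive
    between an original training image and its synthetic variant."""
    img_a = cv2.imread(path_original, cv2.IMREAD_GRAYSCALE)
    img_b = cv2.imread(path_variant, cv2.IMREAD_GRAYSCALE)

    sift = cv2.SIFT_create()
    kpts_a, desc_a = sift.detectAndCompute(img_a, None)
    kpts_b, desc_b = sift.detectAndCompute(img_b, None)
    if desc_a is None or desc_b is None:
        return False

    # Nearest-neighbour matching with Lowe's ratio test.
    matches = cv2.BFMatcher(cv2.NORM_L2).knnMatch(desc_a, desc_b, k=2)
    good = [m for m, n in (p for p in matches if len(p) == 2)
            if m.distance < 0.8 * n.distance]
    if len(good) < min_inliers:
        return False

    pts_a = np.float32([kpts_a[m.queryIdx].pt for m in good])
    pts_b = np.float32([kpts_b[m.trainIdx].pt for m in good])

    # Geometric verification: count inliers of a RANSAC-estimated homography.
    _, inlier_mask = cv2.findHomography(pts_a, pts_b, cv2.RANSAC, 5.0)
    return inlier_mask is not None and int(inlier_mask.sum()) >= min_inliers
```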
Summary
Visual Localization Results
Figure-2: Localization accuracy as a function of the number of top-k retrieved images for Ret4Loc models and the state of the art. Results are shown for two protocols: a pose approximation protocol (EWB) and a Structure-from-Motion (SfM) based protocol. Ret4Loc-HOW-Synth variants that use geometric consistency are denoted with a "+".
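As a rough illustration of the pose approximation protocol, the sketch below averages the camera positions of the top-k retrieved database images with equal weights. This is only an assumption of how an equal-weight barycenter is typically computed (orientation averaging is omitted); it is not the evaluation code behind the figure.

```python
import numpy as np

def ewb_position(db_positions_topk):
    """Equal-weight barycenter of the top-k retrieved camera positions
    (each a 3D vector); orientations would be averaged analogously,
    e.g. via quaternion averaging, which is omitted here."""
    return np.mean(np.asarray(db_positions_topk, dtype=np.float64), axis=0)

# Example with k=3 retrieved database cameras:
print(ewb_position([[1.0, 0.0, 2.0], [1.2, 0.1, 2.1], [0.9, -0.1, 1.8]]))
```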
Place Recognition Results
Figure-3: Visual place recognition results. We report the usual top-k recall metric, i.e., a query counts as correct if at least one correct image is retrieved among its top-k results. ∗ denotes results from GCL (Leyva-Vallina et al., 2023).
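For concreteness, a minimal sketch of the top-k recall computation is shown below; the dictionary layout and variable names are illustrative, not taken from our evaluation code.

```python
def recall_at_k(rankings, positives, k):
    """rankings: {query_id: list of db ids ordered by similarity},
    positives: {query_id: set of correct db ids}.
    A query is a hit if any of its top-k retrieved ids is a positive."""
    hits = sum(
        1 for q, ranked in rankings.items()
        if set(ranked[:k]) & positives.get(q, set())
    )
    return hits / max(len(rankings), 1)

# Example with two queries: recall@1 = 0.5
print(recall_at_k({"q1": ["a", "b"], "q2": ["c", "d"]},
                  {"q1": {"a"}, "q2": {"d"}}, k=1))
```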
Pretrained Models
Here you can find links to the Ret4Loc pretrained models. Our codebase is built on top of the HOW codebase, so you can use the code from HOW to load and evaluate the Ret4Loc models.
We provide two model weights (33MB each):
ret4loc_how.pth – the baseline Ret4Loc-HOW model
ret4loc_how_synth-pp.pth – our best Ret4Loc model, trained with synthetic data and geometric verification.
You can load our models exactly like a HOW model.
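As a hedged sketch (not the HOW codebase itself), the snippet below shows the generic PyTorch step of reading one of the checkpoints; constructing the HOW network that receives the weights is done with the HOW codebase and is omitted here.

```python
import torch

# Read the checkpoint file; map_location keeps the tensors on CPU.
state = torch.load("ret4loc_how_synth-pp.pth", map_location="cpu")
# Inspect what the file contains (e.g. parameter names if it is a state_dict).
print(type(state), list(state)[:5] if hasattr(state, "keys") else state)

# With a HOW-style model instance built via the HOW codebase, loading is the
# standard PyTorch call (model construction omitted here):
# model.load_state_dict(state)
# model.eval()
```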