A unified data format for visual localization, structure from motion and more.
If you work on visual localization, you’re often faced with the fact that public datasets come in many different formats. This means adapting data importers and exporters, and almost always transforming coordinate systems or camera parameters. Furthermore, combining multiple methods in a single pipeline requires a lot of data conversion. And although great tools such as OpenMVG and COLMAP are available, their data formats often don’t include everything you may need, such as Wi-Fi or other sensor data.
To address these issues and make it easier to use public datasets, we created the kapture data format and toolbox. We hope kapture will facilitate future research and development in visual localization, structure-from-motion, VSLAM, and sensor fusion.
First, kapture is a data format that describes data acquired for structure from motion (SFM) and visual localization.
It can be used to store:
- sensor parameters such as intrinsic and extrinsic camera parameters,
- raw sensor data such as camera images or lidar data,
- other sensor data such as GPS or Wi-Fi signals,
- computed data such as:
  - 2D local features (keypoints and descriptors),
  - 2D-2D matches between local features,
  - global features (e.g. for image retrieval),
  - 3D reconstructions consisting of 3D points and keypoint observations.
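To make the list above more concrete, here is a minimal Python sketch of how these categories of data might relate to one another. It is not kapture's actual API or file layout: the `Camera` and `ToyDataset` classes, their fields, and the `dangling_references` helper are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Camera:
    """Hypothetical sensor record: intrinsics plus a rigid pose (extrinsics)."""
    model: str                       # illustrative label, e.g. "PINHOLE"
    params: List[float]              # e.g. [width, height, fx, fy, cx, cy]
    rotation_wxyz: Tuple[float, float, float, float] = (1.0, 0.0, 0.0, 0.0)
    translation_xyz: Tuple[float, float, float] = (0.0, 0.0, 0.0)


@dataclass
class ToyDataset:
    """Hypothetical container mirroring the kinds of data listed above."""
    cameras: Dict[str, Camera] = field(default_factory=dict)
    # (timestamp, camera id) -> image file name
    images: Dict[Tuple[int, str], str] = field(default_factory=dict)
    # image file name -> list of (x, y) keypoints
    keypoints: Dict[str, List[Tuple[float, float]]] = field(default_factory=dict)
    # (image a, image b) -> list of (keypoint index in a, keypoint index in b)
    matches: Dict[Tuple[str, str], List[Tuple[int, int]]] = field(default_factory=dict)
    # 3D points, and for each point index the (image, keypoint index) observing it
    points3d: List[Tuple[float, float, float]] = field(default_factory=list)
    observations: Dict[int, List[Tuple[str, int]]] = field(default_factory=dict)

    def dangling_references(self) -> List[str]:
        """Return a message for every record that points at missing data."""
        problems = []
        for (_, cam_id), name in self.images.items():
            if cam_id not in self.cameras:
                problems.append(f"image {name}: unknown camera {cam_id}")
        for (img_a, img_b), pairs in self.matches.items():
            n_a = len(self.keypoints.get(img_a, []))
            n_b = len(self.keypoints.get(img_b, []))
            for idx_a, idx_b in pairs:
                if not (0 <= idx_a < n_a and 0 <= idx_b < n_b):
                    problems.append(f"match {img_a}/{img_b}: bad pair ({idx_a}, {idx_b})")
        for point_idx, obs in self.observations.items():
            if not 0 <= point_idx < len(self.points3d):
                problems.append(f"observation: unknown 3D point {point_idx}")
            for img, kp_idx in obs:
                if not 0 <= kp_idx < len(self.keypoints.get(img, [])):
                    problems.append(f"observation of point {point_idx}: bad keypoint {img}[{kp_idx}]")
        return problems


toy = ToyDataset(
    cameras={"cam0": Camera("PINHOLE", [640.0, 480.0, 500.0, 500.0, 320.0, 240.0])},
    images={(0, "cam0"): "a.jpg", (1, "cam0"): "b.jpg"},
    keypoints={"a.jpg": [(10.0, 20.0), (30.0, 40.0)], "b.jpg": [(12.0, 21.0)]},
    matches={("a.jpg", "b.jpg"): [(0, 0)]},
    points3d=[(1.0, 2.0, 3.0)],
    observations={0: [("a.jpg", 0), ("b.jpg", 0)]},
)
print(toy.dangling_references())  # prints []
```

The point of the sketch is that computed data (matches, observations) refers back to keypoints, keypoints to images, and images to sensors, so a single consistent container makes such cross-references easy to check.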
Secondly, kapture is a set of Python tools to load, save, and convert datasets to and from kapture.
Thirdly, we provide a set of public datasets pre-converted to kapture. If you already have your SFM or visual localization processing tools up and running, you only need to integrate kapture support once; after that, you can use all the datasets without further conversion or glue code.
Finally, kapture is a visual localization toolbox that consists of various tools to create maps and to re-localize images. The provided structure-based methods define the current state of the art on many public datasets (see Online Benchmark below).
- The main purpose of kapture is to provide a unified data format for your SFM and visual localization datasets. This will facilitate processing different datasets as well as sharing processed data (e.g. features).
- To convert datasets to and from kapture, we provide a set of converters for popular formats (e.g. COLMAP, Bundler, nvm, OpenMVG, OpenSfM, and more).
- We provide various visual localization pipelines based on COLMAP, pycolmap, and other tools (see kapture-localization).
- The feature set of kapture is constantly growing. Please follow the source code to stay up to date.
Version 1.1.0 (May 2021)
kapture now supports multiple types of local and global features in a single kapture representation, as well as .tar archives for more efficient handling of features and matches.
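As a rough illustration of why a .tar archive can help with many small feature files, the following sketch packs per-image feature blobs into one archive and reads a single entry back without unpacking the rest. It uses only the Python standard library and is not kapture's own implementation: the function names, the `.kpt` member names, and the float32 payload layout are assumptions made for this example.

```python
import io
import struct
import tarfile
import tempfile
from pathlib import Path


def pack_features(tar_path, features):
    """Write each image's feature bytes as one member of a single .tar file."""
    with tarfile.open(tar_path, "w") as tar:
        for name, payload in features.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(payload)          # size must be set before addfile
            tar.addfile(info, io.BytesIO(payload))


def read_feature(tar_path, name):
    """Read one member's bytes directly from the archive, without extracting it to disk."""
    with tarfile.open(tar_path, "r") as tar:
        member = tar.extractfile(name)        # raises KeyError if name is missing
        if member is None:                    # not a regular file
            raise ValueError(f"{name} is not a regular file")
        return member.read()


with tempfile.TemporaryDirectory() as tmp:
    archive = Path(tmp) / "keypoints.tar"
    # Two hypothetical keypoint files, each holding two (x, y) pairs as float32.
    pack_features(archive, {
        "img_0001.kpt": struct.pack("<4f", 1.0, 2.0, 3.0, 4.0),
        "img_0002.kpt": struct.pack("<4f", 5.0, 6.0, 7.0, 8.0),
    })
    with tarfile.open(archive) as tar:
        names = tar.getnames()
    restored = struct.unpack("<4f", read_feature(archive, "img_0001.kpt"))

print(names)     # prints ['img_0001.kpt', 'img_0002.kpt']
print(restored)  # prints (1.0, 2.0, 3.0, 4.0)
```

A single archive keeps the number of files on disk small, which tends to make copying and sharing datasets with thousands of per-image feature files faster.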
- Robust image retrieval-based visual localization using kapture. arXiv
- Large-scale localization datasets in crowded indoor spaces. CVPR21, arXiv, blog, dataset
- Benchmarking image retrieval for visual localization. 3DV20, arXiv
- Investigating the role of image retrieval for visual localization – an exhaustive benchmark. IJCV22, arXiv
- On the limits of pseudo ground truth in visual camera re-localization. ICCV21, arXiv, code
Contribute to kapture:
If you find kapture useful, we invite you to contribute!
For example, you can:
- Provide your own dataset in kapture format (we’re happy to help).
- Write new data converters.
- Report bugs and suggest improvements.
- Provide processed data (e.g. extracted features or matches) in kapture format.
- Add support for other kinds of data not currently supported.
- Tell us what you think!
- NAVER LABS localization datasets*
- Aachen Day Night v1.1*
- Extended CMU-Seasons
- RobotCar Seasons v2
- InLoc (without images)
- SILDa Weather and Time of Day
- Virtual Gallery dataset
- Baidu-mall (without images)
- Symphony Lake
- 7scenes (incl. RGBD)*
- 12scenes (incl. RGBD)
- Cambridge Landmarks*
* including one-script pipeline examples