kapture is a file format and a set of tools for manipulating datasets, in particular Visual Localization and Structure from Motion data.
A collection of multimodal datasets and visual features for VQA and captioning in PyTorch. To install, run "pip install multimodal".