Retrieval-augmented generation in multilingual settings
This work studies retrieval-augmented generation (RAG) in the multilingual setting (mRAG). Our findings highlight that, despite the availability of high-quality off-the-shelf multilingual retrievers and generators, task-specific prompt engineering is needed to enable generation in user languages. Moreover, current evaluation metrics need adjustment for the multilingual setting to account for variations in the spelling of named entities.
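To illustrate why spelling variation matters for evaluation, here is a minimal sketch comparing word-level recall with character n-gram recall. Both functions, their names, and the choice of n = 3 are assumptions made for illustration; this is one possible adjustment, not necessarily the metric used in the work.

```python
def char_ngrams(text, n=3):
    """Return the set of character n-grams of a lowercased, whitespace-normalized string."""
    normalized = " ".join(text.lower().split())
    if len(normalized) < n:
        # Strings shorter than n are treated as a single gram (illustrative choice).
        return {normalized} if normalized else set()
    return {normalized[i:i + n] for i in range(len(normalized) - n + 1)}


def char_ngram_recall(reference, prediction, n=3):
    """Fraction of the reference's character n-grams that also occur in the prediction."""
    ref_grams = char_ngrams(reference, n)
    if not ref_grams:
        return 0.0
    return len(ref_grams & char_ngrams(prediction, n)) / len(ref_grams)


def word_recall(reference, prediction):
    """Fraction of the reference's whitespace-separated words found in the prediction."""
    ref_words = set(reference.lower().split())
    if not ref_words:
        return 0.0
    return len(ref_words & set(prediction.lower().split())) / len(ref_words)


# Hypothetical example: the same named entity under two transliterations.
reference = "Tchaikovsky"
prediction = "The composer was Chaikovsky"

word_score = word_recall(reference, prediction)        # exact word match fails
char_score = char_ngram_recall(reference, prediction)  # most 3-grams still match
```

In this example the word-level score is zero because the two spellings differ by one letter, while the character-level score stays high (8 of the 9 reference 3-grams match), which is the kind of tolerance a multilingual metric needs.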
BERGEN: benchmarking RAG
BERGEN is designed to ease reproducibility and the integration of new datasets and models, and to identify strong baselines.
Several releases: SPLADE V-2, SPLADE V-3, CoSPLADE, etc.
SPLADE is a sparse, BERT-based bi-encoder model for effective and efficient first-stage ranking.
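As a rough illustration of why sparse bi-encoder representations suit first-stage ranking, the sketch below scores documents with a dot product over an inverted index. It assumes queries and documents have already been encoded independently into sparse term-weight vectors; the weights, document ids, and function names are made up for the example, and this is not the SPLADE implementation itself.

```python
from collections import defaultdict


def build_inverted_index(doc_vectors):
    """Map each term to the (doc_id, weight) pairs of documents with a non-zero weight."""
    index = defaultdict(list)
    for doc_id, vector in doc_vectors.items():
        for term, weight in vector.items():
            index[term].append((doc_id, weight))
    return index


def search(index, query_vector, top_k=10):
    """Score documents by sparse dot product, visiting only postings of query terms."""
    scores = defaultdict(float)
    for term, query_weight in query_vector.items():
        for doc_id, doc_weight in index.get(term, ()):
            scores[doc_id] += query_weight * doc_weight
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top_k]


# Hypothetical sparse vectors, standing in for encoder output.
docs = {
    "d1": {"sparse": 1.2, "retrieval": 0.9, "index": 0.4},
    "d2": {"dense": 1.1, "retrieval": 0.7},
    "d3": {"cooking": 1.5, "pasta": 1.3},
}
query = {"sparse": 1.0, "retrieval": 0.5, "lexical": 0.3}

index = build_inverted_index(docs)
results = search(index, query)
```

Only documents that share at least one term with the query are ever touched, so "d3" is never scored. This is what lets sparse models reuse standard inverted-index infrastructure for efficient first-stage retrieval.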