Spire
Spire transcribes English speech input and translates it into 10 other languages, and also translates text input in both language directions. Spire is an output of the EU project UTTER (Unified Transcription and Translation for Extended Reality).
LPOSS
A training-free method for open-vocabulary semantic segmentation using Vision-and-Language Models (VLMs).
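The sketch below is not LPOSS itself but a minimal illustration of the training-free recipe such methods build on: score dense patch features from a VLM against class-name text embeddings and upsample the resulting patch-level labels to pixel resolution. The feature tensors, class names, and image size here are placeholders; in practice the features would come from a model such as CLIP, and LPOSS further refines these raw assignments.

```python
import torch
import torch.nn.functional as F

# Placeholder VLM outputs (assumed shapes): patch features from the image encoder,
# class-name embeddings from the text encoder of an open-vocabulary VLM.
num_patches_h, num_patches_w, dim = 14, 14, 512
class_names = ["cat", "dog", "grass", "sky"]
patch_feats = torch.randn(num_patches_h * num_patches_w, dim)
text_feats = torch.randn(len(class_names), dim)

# Score every patch against every class name with cosine similarity -- no training involved.
patch_feats = F.normalize(patch_feats, dim=-1)
text_feats = F.normalize(text_feats, dim=-1)
similarity = patch_feats @ text_feats.T            # (num_patches, num_classes)
patch_labels = similarity.argmax(dim=-1)           # one class index per patch

# Upsample the coarse patch-level label map to pixel resolution.
coarse = patch_labels.float().view(1, 1, num_patches_h, num_patches_w)
pixel_map = F.interpolate(coarse, size=(224, 224), mode="nearest").long()
print(pixel_map.shape)  # torch.Size([1, 1, 224, 224])
```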
LLM-as-a-qualitative-judge
LLM-as-a-qualitative-judge correctly recognizes instance-specific issues in 2/3 of cases and can produce error type reports resembling those composed by human annotators.
GUARD
A principled approach to enforcing strict guarantees for LLMs without compromising their generative capabilities, combining an autoregressive proposal distribution with rejection sampling.
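A minimal sketch of the sample-and-filter loop that rejection sampling with an autoregressive proposal implies: draw a candidate from a proposal language model and keep it only if it satisfies the constraint, so every accepted output is guaranteed to comply. The model name, constraint, and budget below are illustrative placeholders, not the GUARD implementation.

```python
from typing import Optional

from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")  # stand-in proposal model

def satisfies_constraint(text: str) -> bool:
    # Example of a strict, checkable constraint: the output must contain a keyword.
    return "Paris" in text

def guarded_sample(prompt: str, max_tries: int = 32) -> Optional[str]:
    """Sample from the proposal and keep only outputs satisfying the constraint.

    Accepted samples respect the constraint by construction; the closer the
    proposal is to the constrained target distribution, the fewer rejections.
    """
    for _ in range(max_tries):
        out = generator(prompt, max_new_tokens=30, do_sample=True)[0]["generated_text"]
        if satisfies_constraint(out):
            return out
    return None  # constraint not met within the sampling budget

print(guarded_sample("The capital of France is"))
```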
OSCAR
A novel query-dependent online soft compression method for RAG that reduces computational overhead while preserving performance. Unlike traditional hard compression methods, which shorten retrieved texts, or soft compression approaches, which map documents to continuous embeddings offline, OSCAR dynamically compresses retrieved information at inference time, eliminating storage overhead and enabling higher compression rates.
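To make "online soft compression" concrete, here is a toy sketch of the general idea: at inference time, a small module attends over the query and the retrieved passage and squeezes the passage into a handful of continuous vectors, which would then stand in for the full passage tokens on the generator side. The architecture, dimensions, and slot count are assumptions for illustration, not the actual OSCAR model.

```python
import torch
import torch.nn as nn

class QueryConditionedCompressor(nn.Module):
    """Compress a retrieved passage into k continuous vectors, conditioned on the query."""

    def __init__(self, d_model: int = 256, k: int = 4, n_heads: int = 4):
        super().__init__()
        self.memory = nn.Parameter(torch.randn(k, d_model))   # learned compression "slots"
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, query_emb: torch.Tensor, passage_emb: torch.Tensor) -> torch.Tensor:
        # query_emb: (B, Lq, D), passage_emb: (B, Lp, D)
        context = torch.cat([query_emb, passage_emb], dim=1)  # condition on the query
        slots = self.memory.unsqueeze(0).expand(context.size(0), -1, -1)
        compressed, _ = self.attn(slots, context, context)    # (B, k, D)
        return compressed  # fed to the generator in place of the full passage tokens

compressor = QueryConditionedCompressor()
q = torch.randn(1, 8, 256)    # query token embeddings
p = torch.randn(1, 200, 256)  # retrieved passage token embeddings
print(compressor(q, p).shape)  # torch.Size([1, 4, 256]) -> ~50x fewer vectors than tokens
```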
Speech-MASSIVE
Covers 12 languages from different families and inherits from the original MASSIVE dataset the annotations for the intent prediction and slot filling tasks. See also the Interspeech 2024 paper.
Can be used for machine translation, speech translation, language modeling and dialogue, and supports a number of popular pre-trained models.
A toolkit for controlling language models and other generative models.
Prompted datasets for benchmarking the ability of a model to perform completely unseen tasks specified in natural language.
A general framework for imposing constraints on samples of pretrained language models.



