Speech-MASSIVE: a multilingual speech dataset for SLU and beyond

Published by Laurent Besacier on 1 September 2024

Beomseok Lee, Marco Gaido, Matteo Negri, Ioan Calapodescu, Laurent Besacier

The 25th Interspeech Conference, Kos Island, Greece, 1-5 September, 2024

This paper presents Speech-MASSIVE, a multilingual Spoken Language Understanding (SLU) dataset comprising the speech counterpart for a portion of the MASSIVE textual corpus. Speech-MASSIVE covers 12 languages from different families and inherits from the original MASSIVE dataset the annotations for the intent prediction and slot filling tasks. Our extension is prompted by the scarcity of massively multilingual SLU datasets and the growing need for versatile speech datasets to assess foundation models (LLMs, speech encoders) across diverse languages and tasks. To fill this gap, in addition to releasing a multimodal, multi-task, and multilingual dataset, we report SLU baselines obtained with cascade and end-to-end SLU architectures trained in different scenarios (zero-shot, few-shot, and full training). Furthermore, we demonstrate the suitability of Speech-MASSIVE for other tasks such as speech transcription, language identification, and speech translation.

@misc{lee2024speechmassivemultilingualspeechdataset,
      title={Speech-MASSIVE: A Multilingual Speech Dataset for SLU and Beyond}, 
      author={Beomseok Lee and Ioan Calapodescu and Marco Gaido and Matteo Negri and Laurent Besacier},
      year={2024},
      eprint={2408.03900},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2408.03900}, 
}

