|Jama Hussein Mohamud, Aissatou Ndoye, Loyd Thomson, Laurent Besacier|
|AfricaNLP Workshop at the European Chapter of the Association for Computational Linguistics conference (EACL), virtual event, 19 April, 2021|
This paper describes the results of an informal collaboration launched during the African Master of Machine Intelligence (AMMI) in June 2020. After a series of lectures and labs on speech data collection using mobile applications and on self-supervised representation learning from speech, a small group of students and the lecturer continued working on an automatic speech recognition (ASR) project for three languages: Wolof, Ga, and Somali. The paper describes how data was collected and how ASR systems were developed with a very small amount (1h) of transcribed speech as training data. In these low-resource conditions, pre-training a model on large amounts of raw speech was fundamental to the efficiency of the ASR systems developed.