Bangor University has just developed new training scripts and models that bring together the various features of DeepSpeech with CommonVoice data and provide a complete solution for producing models and scorers for Welsh language speech recognition. They may be of interest to other DeepSpeech users who are working with a similarly lesser-resourced language.

The scripts:

  • are based on DeepSpeech 0.7.4
  • make use of DeepSpeech’s Dockerfiles (so setup and installation are easier)
  • train with CommonVoice data
  • utilize transfer learning
  • with some additional test sets and corpora, produce optimized scorers/language models for various applications
  • export models with metadata

The initial README describes how to get started.

We’d also like to share the models produced by these scripts, which can be found at https://github.com/techiaith/docker-deepspeech-cy/releases/tag/20.06

At the moment these models are used in two prototype applications that the Welsh-speaking community can install and try: a Windows/C#-based transcriber and an Android/iOS voice assistant app called Macsen. Source code for these applications, which use DeepSpeech, can also be found on GitHub.

We are immensely grateful to Mozilla for creating the Common Voice and DeepSpeech projects.