The GALLU project: Further Speech Recognition Work
Thanks to the support of the Welsh Government and S4C, the Language Technologies Unit now has a new project to research Welsh language speech recognition.
The aim of the GALLU project is to develop further speech recognition resources for Welsh. Building on the Basic Speech Recognition project of 2008-09, this project aims to:
- build a new Welsh speech corpus through crowdsourcing, with a large and diverse group of people reading the recording scripts (improving speech recognition requires many different voices and accents)
- create a plugin to check and confirm the browser’s default locale for the crowdsourcing website
- create a language register typology with appropriate metadata on a trained corpus, tagged according to features of different registers
- develop a simple script to control the movements of a robotic toy for the Raspberry Pi
- develop a large-vocabulary continuous speech recognition (LVCSR) system for Welsh, creating an open-source Welsh speech recognition system within Julius
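As a rough illustration of the locale check mentioned in the list above, the sketch below parses an HTTP Accept-Language header and reports whether Welsh (`cy`) is the browser's top-ranked language. This is a server-side approximation under stated assumptions, not the project's plugin: the function names `parse_accept_language` and `prefers_welsh` are hypothetical, and the parsing handles only the simple `tag;q=value` form.

```python
from typing import List, Tuple


def parse_accept_language(header: str) -> List[Tuple[str, float]]:
    """Return (language_tag, quality) pairs, highest quality first."""
    entries = []
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        tag, _, params = part.partition(";")
        quality = 1.0  # a tag without an explicit q-value defaults to 1.0
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0  # treat a malformed q-value as lowest priority
        entries.append((tag.strip().lower(), quality))
    # sorted() is stable, so tags with equal quality keep their header order
    return sorted(entries, key=lambda entry: entry[1], reverse=True)


def prefers_welsh(header: str) -> bool:
    """True if the top-ranked language is Welsh ('cy' or a 'cy-*' variant)."""
    ranked = parse_accept_language(header)
    if not ranked:
        return False
    top_tag = ranked[0][0]
    return top_tag == "cy" or top_tag.startswith("cy-")
```

A site could use a check like this to decide whether to confirm the interface language with the visitor, for example by calling `prefers_welsh("cy-GB,cy;q=0.9,en;q=0.8")`.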
Would you like to help?
We are looking for volunteers to record their voices for the corpus. You will not need to come to a studio; you can simply use your smartphone, tablet, or PC with a microphone.
We are looking for all kinds of voices: old, middle-aged, young, strong, weak, from south, mid and north Wales, fluent speakers, learners, men, women, children…
We will be beta testing our new crowdsourcing app at the Hacio’r Iaith event in Bangor on February 15. If you are there, come and see us and give us the UDID for your iPad (go to http://whatsmyudid.com/ for help on how to find it). There will also be an Android version, which does not need a UDID.