The first Maltese speech recognition software has been launched online in primitive form, in the hope its widespread use will allow for speedier development of the technology.
Malta is lagging behind when it comes to speech-to-text technology, with Maltese absent from the 125 languages and variants offered by Google’s speech recognition services. It is also missing from similar services offered by Amazon and Microsoft.
“In the past there have been several disjointed attempts to build this kind of software but, unfortunately, there were no concrete results,” the founder of bilmalti.com, Edward Apap, told Times of Malta.
“This is why I decided to make my software public right from the start. Through user data and feedback, improvements to the software will be made in the areas that matter most to the users,” he explained.
Apap, a trained engineer working in computer software, said the importance of this technology could not be overstated.
Speech-to-text systems are integral to applications aiding the visually- and hearing-impaired, voice-operated telephone menu systems and virtual assistants, among others.
At the University of Malta, academics have also been working on a Maltese speech-recognition system, in a project called Masri.
Senior lecturer of language technology Albert Gatt announced they would be releasing the data sets and a demo of the system in the coming weeks.
“We are not claiming that we have solved the problem. We have a system which is reasonable but still error-prone.
“The important thing is we have laid the groundwork for other people to come along and continue,” he said.
He pointed out that once the data is released, it will benefit projects like bilmalti.com, which could build on the work Masri has already done.
The accuracy of machine learning-based systems generally correlates with the size of the dataset they are trained on.
“Hopefully, as more people start using bilmalti.com, accuracy reaches a threshold where the software becomes useful for general purpose applications,” Apap said.
At present, the system does a much better job of transcribing news items than spontaneous speech or casual conversations.
“Although it can handle thousands of English words frequently used in everyday Maltese speech, it struggles with the intonation and choice of words in colloquial speech,” he added.
Apap pointed out that, since there is very little data available for the Maltese language, building a speech recognition system is harder than it is for a typical language.
“This is one of the reasons why big tech haven’t stepped up for the job – datasets to feed their existing machine learning systems are not readily available,” he said.