An MCAST student has developed an artificial intelligence-assisted project that uses a digital avatar to help translate spoken English into Maltese Sign Language (LSM) in near real time. 

Mac Patrick Gauci believes the early-stage technology could help bridge the communication gap between deaf and hearing people. 

Through the system, a deaf person watches the avatar on a screen or device; the avatar converts the spoken word into sign language through the gestures it performs. 

Mac Patrick Gauci edited the avatar to create realistic Maltese sign language gestures.

The result is a pioneering prototype that “brings conversations to life” through an animated figure, Gauci says.

While the current prototype shows promise, Gauci admits it struggles to follow some conversations and needs further refinement before it is ready for daily use.

“While it effectively translates basic sentences into LSM, it struggles with complex expressions, facial nuances and lip-reading accuracy,” he admitted, adding that future iterations addressing these limitations could make it more reliable.

The prototype programme runs on a computer or mobile device, such as a laptop, tablet or smartphone. It handles both translation and visualisation, requiring no special equipment or gear for the deaf person other than a screen and a microphone or audio input to capture the spoken words of the hearing person, Gauci explained. 

Any required input is handled by a microphone and a processing device, allowing the deaf person to interact naturally, he said.

At the moment, however, it is best suited for controlled scenarios like demonstrations or specific-use cases.

At this stage, the focus is on translating spoken English into Maltese sign language, he said. Existing research already interprets sign language, which is why Gauci prioritised the reverse – “to ensure a stronger contribution to an underexplored domain”.


In simple terms, the invention works in three steps. First comes speech recognition, with a microphone capturing spoken words from a hearing person. Next is translation, with AI processing the speech and converting it into LSM. Finally, the 3D digital avatar performs the gestures on a screen, enabling the deaf person to follow along.
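The three stages described above can be sketched in a few lines of code. This is purely an illustrative outline, not Gauci's actual system: the function names, the toy gloss lexicon and the word-by-word mapping are all hypothetical assumptions (a real pipeline would run ASR on live audio and use NLP to restructure sentences into LSM grammar rather than substituting words).

```python
from dataclasses import dataclass

# Hypothetical toy lexicon: English words mapped to illustrative LSM gloss labels.
LSM_GLOSSES = {
    "hello": "HELLO",
    "thank": "THANK-YOU",
    "you": "THANK-YOU",  # collapsed into one sign in this toy example
    "help": "HELP",
}

@dataclass
class AvatarCommand:
    gloss: str          # which pre-recorded gesture animation to play
    duration_ms: int    # nominal playback time for the clip

def recognise_speech(audio_transcript: str) -> list[str]:
    """Stage 1 (stand-in): a real system would run ASR on microphone audio.
    Here we accept a transcript directly and tokenise it."""
    return audio_transcript.lower().split()

def translate_to_glosses(words: list[str]) -> list[str]:
    """Stage 2 (stand-in): map words to gloss tokens, skipping unknown words
    and collapsing consecutive duplicates. Real NLP would reorder and
    restructure the sentence, not translate word by word."""
    glosses: list[str] = []
    for word in words:
        gloss = LSM_GLOSSES.get(word.strip(".,!?"))
        if gloss and (not glosses or glosses[-1] != gloss):
            glosses.append(gloss)
    return glosses

def render_on_avatar(glosses: list[str]) -> list[AvatarCommand]:
    """Stage 3 (stand-in): turn glosses into playback commands for the avatar."""
    return [AvatarCommand(g, duration_ms=800) for g in glosses]

if __name__ == "__main__":
    commands = render_on_avatar(translate_to_glosses(recognise_speech("Hello, thank you!")))
    print([c.gloss for c in commands])  # ['HELLO', 'THANK-YOU']
```

The separation into three small functions mirrors the hand-off the article describes: audio capture, AI translation, then animation playback on the screen.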

In more technical terms, the system integrates cutting-edge motion capture, automatic speech recognition (ASR) and natural language processing (NLP) technologies. 

Gauci’s journey into AI-powered assistive technology began with meticulous planning.

The avatar was animated to reflect the nuances of LSM gestures accurately. At the heart of the system, continuous audio recording and transcription convert spoken English into text, which is then processed using sophisticated techniques to ensure precise and contextually appropriate translations. 

A significant milestone was establishing “real-time data exchange for fluid animations”, while motion capture technology at MCAST’s Applied Research and Innovation Centre played a pivotal role in recording LSM gestures. 

These were meticulously edited and re-targeted to the digital avatar, ensuring accurate and life-like sign language representation, Gauci explained.

Mac Patrick Gauci (centre) with his thesis mentor (left) and the interpreter from the Inclusive Education Unit at MCAST.

Communication is truly universal 

The true test of the prototype came during structured user engagement sessions. Members of the Maltese deaf community and professional interpreters interacted with the technology, providing invaluable feedback.

Their insights highlighted the system’s strengths and areas that needed improvement. 

Gauci said the feedback was “overwhelmingly positive, confirming the prototype’s potential in enhancing communication accessibility”. 

Users appreciated the system’s ability to perform basic translations and its use of semi-realistic 3D animations for LSM gestures, he said.

However, they also emphasised the need for improvements to make sign language interpretation more natural and comprehensible. 

“The ultimate aim is to achieve greater realism and naturalness in the avatar’s movements, ensuring the technology meets the diverse needs of the deaf community and brings people closer together,” Gauci said.

But the implications of his research extend far beyond that: “It sets the stage for global advancements in assistive technologies, fostering inclusivity and accessibility in various domains such as education, healthcare and public services,” he continued.

“By pushing the boundaries of innovation and creativity, this project aims to contribute to a world where communication is truly universal.”
