A new AI-powered headphone system identifies individual voices in a group, translates them in real time, and mimics each speaker's tone, offering a more natural and fluid experience across languages.
It may soon be possible to communicate with people who speak other languages without learning them. That's the goal of a new AI-powered headset system.
Called Spatial Speech Translation, it translates speech from multiple people in real time, based on the direction of each voice and the unique characteristics of each speaker.
Technology to break down language barriers
The project was developed by researchers at the University of Washington, in the United States.
The idea came from personal experience, as Professor Shyam Gollakota explains. "My mom has amazing ideas when she speaks in Telugu, but it's hard for her to communicate with people in the US when she visits us," he says. "We believe this system can transform the lives of people like her."
Unlike other solutions that focus on just one speaker, the new system recognizes and translates multiple voices at the same time.
It also avoids the artificial sound common in other machine translations. It works with noise-canceling headphones and regular microphones, connected to a laptop with Apple’s M2 chip, the same one used in Vision Pro.
The project was presented this month at the ACM CHI Conference on Human Factors in Computing Systems in Yokohama, Japan.
How the system works
Spatial Speech Translation uses two artificial intelligence models. The first divides the space around the user into small regions and locates sound sources using neural networks.
The second model translates speech from languages such as French, German and Spanish into English, while also simulating the tone and voice style of each speaker.
This allows the translated audio to appear to come from the same direction as the original speaker, and in a voice very similar to the speaker's own rather than a generic machine voice. The technology uses public datasets to perform the translations and voice simulations.
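The two-stage flow described above can be sketched very loosely in code. This is an illustrative toy, not the actual system: the real pipeline uses neural networks for both stages, while here a simple energy threshold stands in for the localization model and a phrasebook lookup stands in for the translation model. All function names, the threshold, and the data are assumptions for illustration.

```python
# Toy sketch of the two-stage pipeline the article describes.
# Stage 1: locate active speakers by direction. Stage 2: translate each
# speaker's speech into English and tag it with its source direction,
# so playback can be spatialized.

def locate_speakers(energy_by_region, threshold=0.5):
    """Stand-in for the localization model: the space around the user is
    divided into angular regions (keyed by center angle in degrees); any
    region whose signal energy exceeds the threshold counts as a speaker."""
    return [angle for angle, energy in energy_by_region.items()
            if energy > threshold]

def translate_with_direction(text, speaker_angle, phrasebook):
    """Stand-in for the translation model: map the utterance to English
    (toy lookup in place of a neural translator) and keep the direction
    so the translated voice can be rendered from the same position."""
    english = phrasebook.get(text, text)
    return {"text": english, "direction_deg": speaker_angle}

# Example: four regions scanned, two contain active voices (45° and 270°).
regions = {0: 0.1, 45: 0.9, 90: 0.2, 270: 0.8}
phrasebook = {"bonjour": "hello", "danke": "thank you"}

speakers = locate_speakers(regions)                      # [45, 270]
outputs = [translate_with_direction("bonjour", a, phrasebook)
           for a in speakers]
```

The key design point the sketch captures is that direction information is carried through the whole pipeline, so each translated voice can be played back from where its speaker actually stands.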
Samuele Cornell, a researcher at Carnegie Mellon University, highlights the complexity of the task. "Separating human voices is already difficult for AI systems. Doing it in real time and with low latency is impressive," he says. Although he did not participate in the project, he considers the first results quite promising.
Challenges still persist
Even with the advances, the system still faces challenges. The main one is the response time between speech and translation. Currently, there is a slight delay, and Gollakota’s team wants to reduce that time to less than a second.
"The goal is to maintain the fluidity of conversation between people speaking different languages.”, explains the researcher. However, this reduction in time can affect the accuracy of the translation, according to experts.
This is because the more context the AI has, the better the translation. Less time can mean lower quality.
The speed also varies depending on the language. Translation from French to English is the fastest. Spanish comes next, and German is the slowest of the three. This is due to the structure of the sentences. In German, for example, the verb usually comes at the end, which slows down the interpretation of the message.
A promising application
For Alina Karakanta, a professor at Leiden University in the Netherlands and an expert in computational linguistics, the system has great potential. She was not involved in the study, but believes it could have a positive impact. "It is a useful application. It can help people," she says.
Real-time translation is still an evolving field. More advanced language models have improved the results significantly in recent years.
In applications like Google Translate or tools like ChatGPT, languages with a lot of available data are already translated with excellent quality. However, it is still not something completely instantaneous.
The system presented now goes one step further. It combines spatial localization, voice identification and simultaneous translation. All this with a more natural and personalized sound.
The future of barrier-free communication
The project shows a promising path for the use of artificial intelligence in human interactions. The ability to understand multiple people speaking different languages at the same time could transform international meetings, family gatherings and everyday situations into multilingual environments.
But, as researcher Claudio Fantinuoli of Johannes Gutenberg University in Germany points out, there are still technical limitations to overcome. "You need to balance speed and accuracy. Waiting longer provides more context but reduces fluidity," he explains.
The team continues to work on improving the system. If they can reduce response time while maintaining translation quality, Spatial Speech Translation could become an essential tool for breaking down language barriers around the world.