Meet Spatial Speech Translation, an Innovative System That Lets Users Understand Multiple People Speaking Different Languages at the Same Time, with Accuracy and Sonic Realism
The new technology, presented in Japan, transforms conversations between speakers of different languages. MIT Technology Review revealed details about a model that combines artificial intelligence with spatial sound capture.
A development recently presented at the ACM CHI conference in Yokohama, Japan, promises to radically transform how people interact in multilingual environments. Following its unveiling, MIT Technology Review shared further details.
It is called Spatial Speech Translation: a simultaneous-translation system based on artificial intelligence that lets headphone users identify and understand what multiple people are saying at the same time, even if they speak different languages.
Designed to work with conventional noise-canceling headphones, the system not only translates but also reproduces the translated voice with a pitch and spatial direction that mimic those of the original speaker, creating a more natural and contextualized conversation experience.
A System Against the Language Barrier in Groups
The goal of Spatial Speech Translation is to tackle one of the most complex challenges for automatic translation systems: the overlap of voices in group conversations.
With this technology, artificial intelligence is used to track both the spatial origin of the sound and the individual characteristics of each voice, allowing the user to accurately identify who is speaking and what is being said.
The proposal goes beyond simple simultaneous translation. According to the technical description, the model divides the user's acoustic environment into small regions and analyzes each one to detect potential speakers.
This recognition allows for generating a translated version of each voice that preserves essential elements such as sound direction, emotional tone, and original timbre—resulting in a more realistic auditory experience.
The Personal Dimension Behind the Project
The initiative has a deeply personal motivation for one of its creators, Professor Shyam Gollakota, a researcher at the University of Washington. In statements shared with MIT Technology Review, Gollakota explained: “We believe this system can be transformative.”
From a humanistic perspective, it is argued that technology should not only facilitate communication but also promote greater social inclusion for people facing language barriers.
More than solving specific cases, the project aims to reduce the anxiety and isolation that many people feel when they cannot fully participate in a conversation because they do not speak the language.

Artificial Intelligence at Two Levels: How It Works
The system consists of two interdependent models. The first analyzes the sound space with a neural network that divides the environment into small zones; from this segmentation, it pinpoints the direction each voice comes from.
The second model processes the detected voices, translates them into English from three languages—French, German, and Spanish—and reconstructs a version of the original voice, replicating elements such as tone, volume, and emotional cadence.
The innovative aspect is that this “cloned voice” maintains a high degree of naturalness. Instead of a robotic translation, the person using the headphones hears a synthesized version that emulates the original speaker’s voice, with a latency of just a few seconds. This feature enables a more fluid conversational dynamic than conventional systems.
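The two-stage design described above can be illustrated with a minimal toy sketch. Everything here is a placeholder of my own invention, not the actual research code: the function names, the `LocalizedVoice` structure, and the tiny phrase table merely stand in for the neural networks that perform localization, separation, and speech-to-speech translation in the real system.

```python
# Toy illustration of the two-stage pipeline (all names are hypothetical).
from dataclasses import dataclass

@dataclass
class LocalizedVoice:
    direction_deg: float  # estimated direction of arrival
    text: str             # recognized speech (stands in for audio here)

def stage1_locate(acoustic_scene):
    """Stage 1 (placeholder): scan small angular zones of the scene and
    emit one LocalizedVoice per detected speaker."""
    return [LocalizedVoice(direction_deg=d, text=t)
            for d, t in acoustic_scene.items()]

# Tiny phrase table standing in for the French/German/Spanish-to-English model.
PHRASEBOOK = {"bonjour": "hello", "hallo": "hello", "hola": "hello"}

def stage2_translate(voice):
    """Stage 2 (placeholder): translate the speech while preserving the
    spatial direction, so playback can seem to come from the speaker."""
    translated = " ".join(PHRASEBOOK.get(w, w) for w in voice.text.split())
    return LocalizedVoice(direction_deg=voice.direction_deg, text=translated)

# Two speakers at different angles, speaking different languages.
scene = {-30.0: "bonjour", 45.0: "hola"}
output = [stage2_translate(v) for v in stage1_locate(scene)]
for v in output:
    print(f"{v.direction_deg:+.0f} deg: {v.text}")
```

The key design point the sketch preserves is that the spatial metadata from stage 1 travels with each voice through stage 2, which is what lets the final playback render each translation from its speaker's original direction.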
Differences from Existing Technologies
Unlike other devices with automatic translation—such as Meta’s smart glasses—Spatial Speech Translation was developed to process multiple voices simultaneously. While most current systems focus on a single speaker, this proposal seeks to address the real issue of group conversations, with overlapping voices and languages.
Additionally, the technology uses accessible hardware: headphones with built-in microphones and laptops with Apple M2 chips, which enable the execution of the neural network models. This compatibility with commercially available technologies favors potential large-scale adoption.
Challenges and Next Steps
One of the main challenges faced by the team is to reduce the latency between speech and its translation. Currently, the delay is a few seconds, which affects the fluidity of the conversation. “We want to significantly reduce this latency to under one second, in order to keep the pace of the conversation,” Gollakota explained.
This goal presents complex technical challenges, as the syntactic structure of each language influences the speed of translation. For example, the system is faster at translating from French to English, followed by Spanish and then German.
According to researcher Claudio Fantinuoli from Johannes Gutenberg University Mainz, this is because German tends to place verbs—and, therefore, much of the meaning—at the end of sentences.
Several experts who did not participate in the development praised the advance. Samuele Cornell, a researcher at Carnegie Mellon University's Language Technologies Institute, highlighted that the project is technically impressive but warned that mass application will require further training with real data and recordings in noisy environments.

