DeepL expands beyond text, introduces real-time voice translation
DeepL is expanding its translation technology to voice, aiming to deliver real-time speech translation with high accuracy and natural language quality.
DeepL, widely recognised for its text translation tools, has introduced a new voice-to-voice translation suite that expands its capabilities beyond written language. The newly launched system is designed to handle real-time conversations across multiple environments, including business meetings, mobile and web interactions, and group discussions for frontline workers through customised applications.
Alongside the new voice translation tools, DeepL is also rolling out an API that allows developers and enterprises to build tailored solutions on top of its technology. This could enable use cases such as multilingual customer support systems in call centres or specialised communication tools for global teams.
Speaking in an interview with TechCrunch, DeepL CEO Jarek Kutylowski explained that expanding into voice translation was a natural progression for the company. “After spending so many years in text translation, voice was a natural step for us,” he said. “We have come a long way when it comes to text translation and document translation. But we thought there wasn’t a great product for real-time voice translation.”
One of the primary challenges in developing real-time voice translation, according to Kutylowski, lies in balancing speed and accuracy. The system must minimise latency (the delay between spoken input and translated output) while still delivering reliable translations.
DeepL is integrating its new capabilities into widely used collaboration platforms such as Zoom and Microsoft Teams. Within these environments, users can either listen to translated audio in real time while participants speak in their native languages or follow along with translated text displayed on screen. These integrations are currently available in early access, with organisations invited to join a waitlist.
The company is also offering tools for both mobile and web-based conversations, supporting interactions that can take place either in person or remotely. For group scenarios such as training sessions or workshops, participants can join shared conversations using a QR code, enabling real-time multilingual communication across multiple users.
DeepL’s voice system is designed to adapt to specialised vocabulary, including industry-specific terminology, company names, and personal names. This customisation is intended to improve translation accuracy in professional settings.
Kutylowski highlighted the growing role of AI in reshaping customer service, noting that translation tools can help organisations deliver support in languages where skilled staff may be limited or costly to hire. By adding a translation layer, companies can extend their reach without needing to build large multilingual teams.
Currently, DeepL’s system processes speech in three stages: it converts speech into text, translates that text, and then converts the translation back into spoken audio. While this approach builds on the company’s strength in text translation, DeepL aims eventually to develop an end-to-end model that can translate speech directly, without relying on an intermediate text step.
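The three-stage flow described above can be illustrated with a minimal sketch. It is not DeepL's implementation or API: the function translate_speech, the CascadeResult type, and the toy stages are all hypothetical stand-ins, chosen only to show how recognition, translation, and synthesis chain together and how each stage adds to the total delay.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class CascadeResult:
    """Output of one pass through the hypothetical cascade."""
    source_text: str
    translated_text: str
    audio: bytes
    stage_seconds: Dict[str, float] = field(default_factory=dict)


def translate_speech(
    audio: bytes,
    recognize: Callable[[bytes], str],
    translate: Callable[[str], str],
    synthesize: Callable[[str], bytes],
) -> CascadeResult:
    """Run speech -> text -> translated text -> speech, timing each stage."""
    timings: Dict[str, float] = {}

    def timed(name, stage, payload):
        # Record how long this stage takes; the stages run one after
        # another, so their delays add up.
        start = time.perf_counter()
        output = stage(payload)
        timings[name] = time.perf_counter() - start
        return output

    source_text = timed("recognize", recognize, audio)
    translated_text = timed("translate", translate, source_text)
    translated_audio = timed("synthesize", synthesize, translated_text)
    return CascadeResult(source_text, translated_text, translated_audio, timings)


# Toy stages (assumptions for illustration only): the "audio" is just
# UTF-8 bytes, and translation is a word-for-word dictionary lookup.
TOY_GLOSSARY = {"hello": "hola", "team": "equipo"}


def toy_recognize(audio: bytes) -> str:
    return audio.decode("utf-8")


def toy_translate(text: str) -> str:
    return " ".join(TOY_GLOSSARY.get(word, word) for word in text.split())


def toy_synthesize(text: str) -> bytes:
    return text.encode("utf-8")


result = translate_speech(b"hello team", toy_recognize, toy_translate, toy_synthesize)
print(result.translated_text)
print(f"total delay: {sum(result.stage_seconds.values()):.6f} s")
```

Because the stages run in sequence, the total delay is the sum of the three stage delays, which is one reason a direct speech-to-speech model could reduce latency.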
The company is entering a competitive landscape with several well-funded players working on similar technologies. Sanas focuses on modifying accents in real time, particularly for call centre use cases. Camb.AI specialises in speech synthesis and translation for the media and entertainment industry, enabling large-scale dubbing and localisation. Meanwhile, Palabra, backed by Alexis Ohanian’s firm Seven Seven Six, is developing a real-time translation engine that aims to preserve both meaning and the speaker’s original voice.
With this launch, DeepL is positioning itself to compete more directly in the evolving space of real-time voice communication, building on its established reputation in text-based translation while pushing toward more advanced speech technologies.