Text-to-speech (TTS)

Synthesising natural-sounding spoken audio from written text.

Text-to-speech (TTS) is the final stage of a spoken translation pipeline: once the translated text is ready, TTS voices it in the target language.
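
As a rough illustration of where TTS sits, the pipeline can be sketched as a chain of stages with synthesis last. This is a minimal sketch, not a real implementation: it assumes the pipeline starts from recorded speech and includes a transcription stage, which the text above does not state, and every name in it (Audio, translate_speech, the fake_* stand-ins, PHRASES) is hypothetical. The stand-in stages only shuffle bytes and strings around so the example runs without any speech library.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Audio:
    """Hypothetical container: raw audio bytes plus the spoken language."""
    samples: bytes
    language: str


def translate_speech(
    source: Audio,
    target_language: str,
    transcribe: Callable[[Audio], str],
    translate: Callable[[str, str, str], str],
    synthesize: Callable[[str, str], Audio],
) -> Audio:
    """Run the stages in order; TTS (synthesize) is the final one."""
    text = transcribe(source)                                # speech -> text (assumed stage)
    translated = translate(text, source.language, target_language)
    return synthesize(translated, target_language)           # text -> speech (TTS)


# Stand-in stages so the sketch runs end to end. A real system would
# plug in actual recognition, translation, and synthesis models here.
PHRASES = {("en", "es", "hello"): "hola"}


def fake_transcribe(audio: Audio) -> str:
    # Pretend the audio bytes are already the spoken words.
    return audio.samples.decode("utf-8")


def fake_translate(text: str, src: str, dst: str) -> str:
    # Look the phrase up; fall back to the untranslated text.
    return PHRASES.get((src, dst, text), text)


def fake_synthesize(text: str, language: str) -> Audio:
    # Pretend the text bytes are the synthesised waveform.
    return Audio(samples=text.encode("utf-8"), language=language)


result = translate_speech(
    Audio(samples=b"hello", language="en"),
    "es",
    fake_transcribe,
    fake_translate,
    fake_synthesize,
)
print(result.language, result.samples)
```

Passing the stages in as callables keeps the synthesis step swappable, so a generic voice could be replaced with a cloned one without touching the rest of the chain.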

High-quality TTS, especially when combined with voice cloning, lets the translation be delivered in a voice that resembles the original speaker's rather than a generic, robotic-sounding one.