But during the presentation on Responsible AI, James Manyika, Senior VP of Technology and Society, introduced the latest silver bullet for dubbing: the Universal Translator.
Manyika described Universal Translator as “an experimental AI video dubbing service that helps experts translate a speaker’s voice while also matching their lip movements,” though he stopped short of naming which experts might participate, and in what capacity.
“We use next generation translation models to translate what the speaker’s saying, models to replicate the style and the tone, and then match the speaker’s lip movements,” Manyika explained. “Then we bring it all together.”
Manyika demonstrated the tool by playing a clip of an online college course in the original English, followed by the same clip dubbed with Spanish audio, with the speaker’s lips moving in concert with the translated words.
To prevent possible misuse by bad actors (i.e., in creating deepfakes), Manyika said, the service was built with “guardrails,” will soon incorporate watermarks against misinformation, and is currently accessible only to “authorized partners.”
“I can understand why Google doesn’t want that product public, but like… that tech is going to be available everywhere anyway,” one tweet pointed out. “You can’t deny the accessibility boost though.”
Neither the First Nor the Last
Observers have already begun to speculate on obvious use cases, such as YouTube. (Time will tell whether popular YouTuber MrBeast’s own newly launched dubbing service will serve as a stopgap unless and until the Universal Translator is released to the public or to competitors.)
AI dubbing moves fast. The reveal comes just over six months after Google Research’s October 31, 2022, paper on “Textless Speech-to-Speech Translation.”
Beyond breathless headlines about the Universal Translator, specialized companies, such as AppTek, and Google’s competitors, including Amazon, have also been working on this challenge, known variously as automatic dubbing, machine dubbing, or AI dubbing. For now, the Universal Translator’s most novel contribution seems to be its advance in “lip matching.”