Given the amount of research Google conducts into machine translation (MT), speech-to-speech translation, and dubbing, the Silicon Valley tech giant was bound to productize a dubbing solution sooner or later, not least because of the obvious application of machine dubbing to YouTube, which Google acquired in 2006.
Enter Aloud, the Google-grown dubbing solution that made its early-access debut in March 2022. Aloud’s main aim is to “let more people discover your videos” because, as it points out, “80% of the world does not speak English.”
With Aloud, content creators can dub their videos in multiple languages — currently Hindi, Indonesian, Portuguese and Spanish, with more languages to come soon. The product relies on audio separation, MT, and speech synthesis technologies in a process that Aloud claims generates “a quality dub in just a few minutes.”
To make use of Aloud, content creators need to supply their own video and original-language subtitles. Aloud then (machine) translates the subtitles and adds dubs to the video. If creators are unable to provide the subtitles, Aloud can auto-generate an editable text transcript.
Google-Grown, Future Known?
Spearheading the Aloud project are Buddhika Kottahachchi (Product Lead) and Sasakthi Abeysinghe (General Manager). Both are long-time Googlers: Kottahachchi has worked for Google since 2014, and Abeysinghe for more than a decade.
Kottahachchi’s past Google projects include a customer support telephony platform based on machine learning (ML), while Abeysinghe previously led a team that provided internal operator tooling for the creation and curation of the Places dataset for Google Maps.
Development of Aloud began in April 2021 within Area 120, Google’s in-house incubator and experimental lab. Area 120 provides a safe space to ‘fail fast’ (as the Silicon Valley mantra goes), although the hope is that projects do not fail at all. Ideally, an incubator project will fly the Area 120 coop and land effortlessly on one of Google’s existing platforms.
One prior example of an Area 120 success is Dialogflow, which became part of Google Cloud in May 2019 and is a comprehensive development platform for chatbots and voicebots. A more recent graduate is Rivet, which uses speech technology to enable children to practice reading independently and is set to become part of the Google Assistant family experience.
While Aloud’s exact future is far from certain, it has every chance of being integrated into a fully fledged Google product such as YouTube, which sees hundreds of hours of video uploaded every minute in different languages.
The sign-up form for Aloud’s early-access waitlist asks people to specify their YouTube channel URL (“to see which creators to work with, including looking at a channel’s performance”) as well as the email address linked to the YouTube account (“to verify that you own the channel and can upload videos to it”). It also requests your country of residence (“to prioritize where to launch next and to see if you’re eligible for our early access”).
Although global demand for dubbed content (and hence for dubbing) is on the rise, dubbing remains a complex and costly undertaking. From Netflix, which released some eye-watering statistics around its dubbing volumes in January 2022, to Amazon, which is conducting its own research into machine dubbing, the giants of the streaming landscape are showing a keen interest in streamlining dubbing operations and improving the dubbed experience.
Machine dubbing and similar solutions are also making waves in the startup world. In the past 18 months alone, machine dubbing company Deepdub announced a USD 20m raise (February 2022), multilingual video and avatar creation company Synthesia raised USD 50m (December 2021), and (partially) automated dubbing tool Papercup raised USD 11m (December 2020).