Google Translate is top of mind at language service providers. Yes, there’s Microsoft Translator or China’s Baidu. But even in Asia, few translation clients would complain about a translation by saying, “Hey, this looks like it has been translated by Baidu.” The reference is almost always Google.
The stats are staggering. According to Google, the software now supports over 100 languages, has over half a billion monthly users, and translates 100 billion words every day.
When the product went live 10 years ago, the buzz centered around the shift from rules-based to statistical machine translation. Google’s research scientist Franz Och (now Chief Data Scientist at a biotech company) said back then that rules-based models “require a lot of work by linguists to define vocabularies and grammars,” while Google had achieved “very good results in research evaluations” by using a statistical model.
Ten years on, rules-based models are dead. Statistical models rule the roost. Google, meanwhile, has moved on to the next big thing, promising “better translations compared to current statistical machine translation engines” by deploying deep learning.
To fine-tune the engine, Google is using good old-fashioned human brains: volunteers in its Translate Community help improve translations. According to a Google blog post from 2015, “almost 50% of the most common phrases typed in Google Translate come from translations provided by the Translate Community,” which is maintained by Program Manager Niu Mengmeng.
Other key people at Google Translate are Barak Turovsky, who’s Google Translate Head of Product and Design, Svetlana Kelman, a Senior Program Manager, Pendar Yousefi, who is the UX Lead, and Baris Yuksel, a Senior Software Engineer and Tech Lead.
Google dominates but is not alone. According to Seth Grimes, an expert on natural language processing (NLP), text and sentiment analysis, NLP, and by extension machine translation, is “at the core of just about everything Google and Baidu do and much of Facebook’s, IBM’s, Amazon’s, and Microsoft’s businesses.”
At Google, however, translation is at the core of the company’s mission. In a 2,000-word annual founder’s letter on April 28, 2016, Google CEO Sundar Pichai mentioned translation several times: “A key driver behind all of this work has been our long-term investment in machine learning and AI. It’s what allows you…to translate the web from one language to another.”
And now for the obvious and much debated question. Will Google Translate kill the traditional language services industry? Probably not in the next 10 years. Google will not enter the services business (scaling translation services is cumbersome). And it is unlikely to launch an enterprise edition of its offline version, which might make it an option for companies that do not trust Google with their data. (Try accessing Google Translate at a major investment bank or accounting firm.)
What is really happening is that Google Translate has capped what would have been explosive growth in the language services sector. A meager 1% of those 100 billion words Google translates every day would be the equivalent of over USD 70bn worth of business per year at current rates.
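The arithmetic behind that figure can be sketched roughly as follows. The 100 billion words per day and the 1% share come from the text above; the per-word rate is not stated, so the USD 0.20 used here is an assumption chosen for illustration, as are the function names.

```python
# Back-of-the-envelope check of the "over USD 70bn per year" figure.
# Figures taken from the article: 100 billion words per day, a 1% share.
# ASSUMED_RATE_USD_PER_WORD is hypothetical; the article does not state
# the rate it used.

WORDS_PER_DAY = 100_000_000_000      # words Google translates daily (article)
SHARE_PERCENT = 1                    # the "meager 1%" (article)
DAYS_PER_YEAR = 365
ASSUMED_RATE_USD_PER_WORD = 0.20     # assumption, for illustration only


def annual_words(words_per_day: int, share_percent: int,
                 days: int = DAYS_PER_YEAR) -> int:
    """Words per year represented by the given percentage share."""
    # Integer arithmetic avoids floating-point rounding on the word count.
    return words_per_day * share_percent // 100 * days


def annual_value_usd(words: int, rate_usd_per_word: float) -> float:
    """Value of that volume at a flat per-word rate."""
    return words * rate_usd_per_word


def implied_rate_usd(target_value_usd: float, words: int) -> float:
    """Per-word rate needed for the volume to be worth the target value."""
    return target_value_usd / words


if __name__ == "__main__":
    words = annual_words(WORDS_PER_DAY, SHARE_PERCENT)
    value = annual_value_usd(words, ASSUMED_RATE_USD_PER_WORD)
    rate = implied_rate_usd(70e9, words)
    print(f"Words per year at 1%: {words:,}")
    print(f"Value at assumed rate: USD {value / 1e9:.1f}bn")
    print(f"Rate implied by USD 70bn: USD {rate:.3f} per word")
```

At 365 billion words a year, the assumed USD 0.20 per word gives about USD 73bn. Working backwards, USD 70bn over the same volume implies a rate of roughly USD 0.19 per word.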
In the enterprise market, the Google, Microsoft, and Baidu APIs and their more bespoke MT brethren will further eat into the pie that would have otherwise gone to service companies. But LSPs should not worry. The pie itself is growing fast, and demand for translation is very elastic. Who wouldn’t want their website in 100 languages if it were cheap enough to do so?
And for those linguists and providers who worry about the next 10 years of the Googlebot crunching the language problem, here’s something to ponder. If machine translations become so good in all fields as to be indistinguishable from an expert human translation, true AI has arrived, and we will have bigger things to worry about than language translation.