How Big Tech Is Using Machine Translation as an AI Proxy

To demonstrate its latest technological advances, Microsoft could have presented any of several use cases—from image recognition to audio—but guess what? They chose machine translation.

On September 26, 2016, at the company’s annual Ignite conference, Microsoft engineer Doug Burger demonstrated how fast their technology could translate Leo Tolstoy’s 1,440-page masterpiece, War and Peace, from Russian into English. Burger’s machine translation demo clocked in at 2.5 seconds.

Burger used only four FPGA computers. (A field-programmable gate array, or FPGA, is an integrated circuit that the user can configure after manufacturing.) Such a task would normally need 24 high-end CPU cores; of course, Burger demonstrated this “less efficient” process as well.

And then he showed the audience what would happen if they threw most of Microsoft’s existing global deployment at another translation problem: all of Wikipedia’s English pages into another language—all five million articles and three billion words of it. This time, Burger clocked the demo at under a tenth of a second.
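To put the Wikipedia figure in perspective, the quoted numbers can be turned into a minimum throughput. The short sketch below only does the arithmetic on the figures reported above (three billion words, five million articles, under a tenth of a second); the variable and function names are our own, and because the elapsed time is an upper bound, the result is a lower bound on the rate.

```python
# Figures as quoted in the article for the Wikipedia demo.
WIKIPEDIA_WORDS = 3_000_000_000    # "three billion words"
WIKIPEDIA_ARTICLES = 5_000_000     # "five million articles"
DEMO_SECONDS_UPPER_BOUND = 0.1     # "under a tenth of a second"


def min_rate(units: int, seconds_upper_bound: float) -> float:
    """Lower bound on units processed per second, given an upper bound on elapsed time."""
    return units / seconds_upper_bound


words_per_second = min_rate(WIKIPEDIA_WORDS, DEMO_SECONDS_UPPER_BOUND)
articles_per_second = min_rate(WIKIPEDIA_ARTICLES, DEMO_SECONDS_UPPER_BOUND)

print(f"at least {words_per_second:.1e} words per second")
print(f"at least {articles_per_second:.1e} articles per second")
```

In other words, the claim implies a rate of at least 30 billion words per second across the deployment Burger described.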

Where will Microsoft use this high-speed MT capability, which uses only a fifth of the power traditional CPUs do? At the same venue, Microsoft CEO Satya Nadella also announced plans to use their “high-powered hardware” in all of Microsoft’s servers going forward.

“We can actually translate those five billion words (sic) if we threw our deployment at it in less time than it takes to blink once,” said Microsoft engineer Doug Burger.

The very same day Microsoft translated Wikipedia at Ignite—and Google made bold claims regarding its new neural-network-powered translation model—Baidu published an open source tool for measuring processor speeds in deep learning tasks, homing in on MT as a use case.

Called DeepBench, the deep learning yardstick can help the designer of a processor optimize it specifically for deep learning applications, such as machine translation.
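The idea behind a yardstick of this kind is to time the basic operations that deep learning models lean on, such as dense matrix multiplication, and compare the results across hardware. The sketch below is not DeepBench's code and does not use its interface; it is a minimal pure-Python illustration of the measurement idea, with hypothetical function names (`matmul`, `time_gemm`) and matrix sizes chosen only for the example.

```python
import time


def matmul(a, b):
    """Naive dense matrix multiply on lists of lists (illustration only)."""
    n, k, m = len(a), len(b), len(b[0])
    return [[sum(a[i][p] * b[p][j] for p in range(k)) for j in range(m)]
            for i in range(n)]


def time_gemm(n: int, repeats: int = 3) -> float:
    """Best wall-clock seconds over `repeats` runs of an n x n multiply."""
    a = [[1.0] * n for _ in range(n)]
    b = [[2.0] * n for _ in range(n)]
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        matmul(a, b)
        best = min(best, time.perf_counter() - start)
    return best


for n in (16, 32, 64):
    seconds = time_gemm(n)
    flops = 2 * n ** 3  # one multiply and one add per inner-loop step
    print(f"n={n}: {seconds:.6f} s, {flops / seconds:.3e} FLOP/s")
```

Running the same script on two machines gives a crude like-for-like comparison; a real benchmark would call tuned vendor kernels rather than interpreted loops.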

By tracking performance across various hardware platforms, DeepBench can reduce errors in automatic language translation processors by 40%, according to Greg Diamos, a senior researcher at the Chinese Web giant’s Silicon Valley AI Lab.

Baidu also announced in September that it had released an open source software platform for the deep learning community called PaddlePaddle. First developed exclusively by and for the company’s own engineers, it has been applied to Baidu’s development efforts in, among other areas, optical character recognition and machine translation.

Perhaps it is no surprise that Microsoft, Google, and Baidu have zeroed in on MT as a proxy for their tech prowess. Machine translation is a common application of natural language processing (NLP), which industry analyst Seth Grimes describes as being “at the core of just about everything Google and Baidu do and much of Facebook’s, IBM’s, Amazon’s, and Microsoft’s businesses,” and of their funding.

Even chipmaker Nvidia’s CEO Jen-Hsun Huang acknowledged in a recent keynote how his company was transitioning from a manufacturer of graphics cards to an AI company and that, in the next 10 years, “there is very little about our lives that machine learning won’t touch. Mostly it will be invisible and gradual: recommendation engines that get better over time; better machine translation…”

In Jump Samsung, Amazon

On October 4, 2016, the US Patent Office granted Samsung a patent for a language translation method and a partner device to implement it. The Samsung patent claims a framework of commands, to be implemented as software, for identifying a language from an audio signal and then translating it into another language, delivered as either audio or a text message.

The resulting software will then be used to operate an electronic partner device, which, according to the patent, could be anything: a smartwatch, phone, refrigerator, washing machine, or any other Samsung product or appliance. The electronic device itself will also have memory that stores a language translation program.

The rationale behind the patent is apparently to enable real-time translation features on the company’s electronic devices. The patent stops short of naming the translation program Samsung intends to use.

Meanwhile, Amazon published a continuation of a patent application for its own translation methods. The updates were published on September 20, 2016, in the US Patent Office database. It is a “continuation” because the original application was filed back in September 2010 and, according to a disclaimer, “the claims of the present application are different and possibly, at least in some aspects, broader in scope than the claims pursued in the parent application.”

Simply put, the patent presents a method for translating content from one language to another to facilitate an e-commerce transaction. The method also ranks translations online based on the ratings history of users who submit translations and the users who rank them.
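The filing's exact scoring method is not described here, so the following is only a hypothetical sketch of how a ranking of that kind could work. It assumes each rating is weighted by the rater's track record and that the submitter's reputation acts as a starting score; the `Translation` class, the 1-to-5 rating scale, the 0-to-1 weights, and the `prior_weight` parameter are all our own assumptions, not details from the patent.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Translation:
    text: str
    submitter_reputation: float  # assumed: 0..1, from the submitter's past ratings
    # Each rating is (score on an assumed 1..5 scale, rater weight 0..1).
    ratings: List[Tuple[float, float]] = field(default_factory=list)


def score(t: Translation, prior_weight: float = 1.0) -> float:
    """Weighted mean of ratings, with the submitter's reputation as a prior."""
    prior_score = 1.0 + 4.0 * t.submitter_reputation  # map 0..1 onto the 1..5 scale
    total = prior_weight * prior_score + sum(s * w for s, w in t.ratings)
    weight = prior_weight + sum(w for _, w in t.ratings)
    return total / weight


def rank(translations: List[Translation]) -> List[Translation]:
    """Highest-scoring translation first."""
    return sorted(translations, key=score, reverse=True)


candidates = [
    Translation("Shipping free for orders above $25", 0.2, [(5, 0.1)]),
    Translation("Free shipping on orders over $25", 0.9, [(5, 0.8), (4, 0.5)]),
]
for t in rank(candidates):
    print(f"{score(t):.2f}  {t.text}")
```

In this toy setup, a single five-star rating from a rater with little history counts for less than several ratings from established raters, which is the general effect the patent summary describes.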

It is unclear what exactly the changes were to the existing patent application. Slator reached out to the inventors, but has yet to receive a reply.

Hazel Mae Pan contributed to this story.

Image: Nvidia headquarters / Shutterstock