Google continues to aggressively push the adoption of TensorFlow, a software framework that Google and third-party developers use to build advanced machine learning models (including those behind Google Translate).
In a post on Google’s Research Blog, the company calls machine translation (MT) “one of the most active research areas in the machine learning community” but points to “a lack of material that teaches people both the knowledge and the skills to easily build high-quality translation systems.”
And so, Google announced it wants to help developers “build a competitive translation model from scratch” and posted a new neural machine translation (NMT) tutorial for TensorFlow on GitHub.
According to the post’s authors, “the tutorial provides details on how to replicate key features in Google’s NMT (GNMT) system to train on multiple GPUs.” GPUs are graphics processing units, the fast, highly parallel chips used to train and run NMT models.
Google hopes the tutorial will “spur the creation of, and experimentation with, many new NMT models by the research community,” i.e., faster adoption of TensorFlow for NMT development.
It Will Take More Than a Simple Tutorial
John Tinsley, CEO of Dublin-based machine translation provider Iconic Translation Machines, told Slator that the tutorial’s “biggest impact is that it will guide people to play around with training engines more easily with TensorFlow, and lower the bar to getting started with TensorFlow itself.”
Tinsley continued, “I see this as a natural progression in the R&D landscape, similar to Moses for SMT where there is an abundance of information and tutorials on how to build engines, with tips on things to try and points to best practice.”
Andrew Rufener, CEO of Omniscien Technologies, a machine translation provider based in Bangkok and Singapore, agreed with Tinsley that the tutorial will likely help broaden the NMT developer community. “This is an important announcement and having it as part of the TensorFlow framework makes machine translation more accessible from a technology perspective to many people in combination with machine learning, which is a good thing,” Rufener said.
However, Tinsley and Rufener both pointed out that having access to the framework is just the start. Tinsley took issue with Google’s claim that the tutorial provides “both the knowledge and the skills to easily build high-quality translation systems.” “It will take more than a simple tutorial to gain that knowledge!” he said.
Furthermore, Rufener stressed that “there are a range of different NMT decoders and frameworks on the public domain that you can use; access to the software as such is no longer an issue.”
What both Tinsley and Rufener see as the real challenge in NMT is not the framework but the preparation and curation of suitable data. “The problem always has been and remains data and pre-processing,” Rufener said. Tinsley concurred: “Most importantly with neural machine translation, particularly for industry, [the challenge is] how to handle, prepare and deal with training data of different shapes and sizes.”
Rufener also cautioned that “unless you have the pre- and post-processing and the training data in good quality and with the right formatting, this framework doesn’t help with a total solution.”
TensorFlow as Key to Cloud Success
Google’s announcement is part of the company’s broader effort to accelerate the adoption of TensorFlow. According to an article published in the MIT Technology Review on June 27, 2017, TensorFlow’s eventual success underpins Google’s ambitions in the cloud business.
A few months after Google launched TensorFlow in early 2015, the search giant decided to open-source the software. The decision paid off and, according to the MIT article, TensorFlow “is becoming the clear leader among programmers building new things with machine learning.”
Since TensorFlow works best on Google’s cloud platform, its adoption helps the company play catch-up in the cloud infrastructure market, where, according to the MIT article, Google “lies a distant third behind Amazon and Microsoft.”
For the broader machine translation industry, the key question is, ultimately, who, other than Google Translate of course, is going to develop and launch the first large-scale commercial MT deployments on TensorFlow.
Image: Google I/O developer conference, Mountain View (May 17–19, 2017)