Google’s Neural Model Translates New Languages Without Prior Training

As Google deploys neural machine translation (NMT) to more languages in Google Translate (GT), its engineers face a challenge familiar to the EU’s interpreter unit. Ideally, the EU would have interpreters available to translate directly from any of its 24 official languages into any other. But the resulting 552 directed combinations (24 × 23) would, of course, be overwhelming. So the EU chose to use English as the relay language.

At Google, the numbers are, as expected, a little bigger. GT’s 100-plus languages mean the tech giant would have to build and train more than 10,000 individual models if each model supported only a single language pair. With NMT, that may no longer be necessary. In a paper published on November 14, 2016, a group of Google engineers including Mike Schuster and Quoc Le presented a method to translate between multiple languages using a single model.
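The arithmetic behind these figures can be sketched in a few lines. The snippet below counts directed language pairs (source and target differ) and compares that with the number of pairs needed when everything is routed through one relay language. The function names are illustrative and not taken from Google or the EU. Note that 24 × 24 = 576 only if same-language pairs are counted, and that the 10,000 mark is passed from 101 languages upward.

```python
def directed_pairs(n_languages: int) -> int:
    """Ordered (source, target) pairs where source and target differ."""
    return n_languages * (n_languages - 1)


def relay_pairs(n_languages: int) -> int:
    """Ordered pairs needed if every translation goes through one relay language."""
    return 2 * (n_languages - 1)


# 24 = EU official languages; 100 and 101 bracket the 10,000-model threshold.
for n in (24, 100, 101):
    print(n, directed_pairs(n), relay_pairs(n))
```

For 24 languages this gives 552 direct pairs against 46 relay pairs, which is why a relay language (or a single multilingual model) is attractive.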


Other researchers have already demonstrated that NMT models can be language-agnostic. The Google team said the new solution “requires no change in the model architecture from our base system,” is “significantly simpler than previous proposals,” and “improves the translation quality of all involved language pairs.”

The new model was based on the deep-learning framework TensorFlow and trained on the same NMT pipeline Google introduced in its original NMT paper.

Zero Shot

According to the paper, there are three main benefits to the new solution. The first is simplicity. The approach dramatically reduces the number of models required from the theoretical 10,000-plus, a number that would be “problematic in a production environment.”

Second, the model apparently improves translation quality for low-resource languages; that is, languages where little reference data is available. Third, and most interestingly, Google said the new solution allows translation between language pairs the model had never seen. The paper calls this “zero shot translations.”

The researchers give the example of an NMT model trained on Portuguese into English and English into Spanish that generates “reasonable” translations for Portuguese into Spanish “although it has not seen any data for that language pair.” The model’s ability to perform zero-shot translation actually came as a surprise to the researchers.
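To make the Portuguese-to-Spanish example concrete, here is a minimal sketch of the bookkeeping involved. It relies on the mechanism described in the paper, where an artificial token such as <2es> is prepended to the source sentence to name the target language. The helper names (tag_source, zero_shot_directions) and the sample sentence are hypothetical, and no actual model is involved.

```python
from itertools import product


def tag_source(sentence: str, target_lang: str) -> str:
    """Prepend the target-language token, e.g. '<2es>', to a source sentence."""
    return f"<2{target_lang}> {sentence}"


def zero_shot_directions(trained):
    """Directions whose source and target were each seen in training,
    but never together as a pair."""
    sources = {src for src, _ in trained}
    targets = {tgt for _, tgt in trained}
    candidates = {(s, t) for s, t in product(sources, targets) if s != t}
    return sorted(candidates - set(trained))


# Training directions from the article's example.
trained = [("pt", "en"), ("en", "es")]

print(zero_shot_directions(trained))
print(tag_source("Olá, como vai?", "es"))
```

With only Portuguese into English and English into Spanish in the training data, the one unseen direction this identifies is Portuguese into Spanish, requested simply by tagging a Portuguese sentence with <2es>.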

Google also claimed a world first, saying, “to our knowledge this is the first demonstration of true multilingual zero-shot translation.”

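Apparently, there is also a speed advantage to zero shots: “Besides the pleasant fact that zero-shot translation works at all it has also the advantage of halving decoding speed.” In other words, decoding time is halved, because the sentence no longer has to be translated into a third language first.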
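The halving follows from counting decode passes. Bridging through a third language means running the decoder twice (source into pivot, then pivot into target), while a direct zero-shot translation needs one pass. A rough sketch, assuming each pass costs about the same; the function name is made up for illustration:

```python
from typing import Optional


def decoding_passes(source: str, target: str, pivot: Optional[str] = None) -> int:
    """Decode passes for one sentence: 2 when bridging through a distinct pivot, else 1."""
    if pivot is None or pivot in (source, target):
        return 1
    return 2


bridged = decoding_passes("pt", "es", pivot="en")  # pt -> en -> es
direct = decoding_passes("pt", "es")               # pt -> es (zero-shot)
print(bridged, direct, direct / bridged)
```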

Code Switch

Another benefit of the new solution is that it allows intra-sentence code-switching: producing a translation into English from a sentence written partially in, for example, Korean and Japanese. One potential application of this may be the e-discovery process, where large data sets containing multiple languages need to be translated.

For all its quality improvement claims, Google used the BLEU yardstick we recently analyzed.

Naturally, there is still a way to go. It seems Google has rolled out NMT for other languages now, such as English into German, although Slator has not verified this. In random testing of GT’s performance, there appears to be an increase in fluency (one of NMT’s much touted benefits), but also an inclination toward omissions (a missing verb) and crass mistranslations (to “whiff” becomes to “piss,” perhaps because the actual colloquialism is “to take a whizz,” meaning to urinate).

Update: In a blog post published hours after this article was posted, Google confirmed it has rolled out GNMT for a total of eight languages (including the above-mentioned English into German).

Florian Faes

Co-Founder of Slator. Linguist, business developer, and mountain runner. Based in the Shire, aka Zurich, Switzerland.