Neural Machine Translation Improving Fast, Study Finds

A study published on August 16, 2016 claims that Neural Machine Translation (NMT) outperforms phrase-based MT (PBMT) and produces better translations for English-German, a language pair the study describes as “particularly hard” to translate.

In the past, the researchers say, NMT was considered “too computationally costly and resource demanding” to compete with PBMT. Put plainly, NMT needed a lot of electricity. That has apparently changed since 2015, and NMT is now becoming more competitive.

The researchers (Luisa Bentivogli, Mauro Cettolo, and Marcello Federico of Fondazione Bruno Kessler, Trento, Italy, and Arianna Bisazza of the University of Amsterdam) found that, architecturally speaking, NMT is simpler than traditional statistical MT systems. Interestingly enough, however, they also add that the process is “less transparent” with NMT, saying that “the translation process is totally opaque to the analysis.” How NMT does what it does still seems to be a bit of a black box.

For the study, the researchers built on evaluation data from the English-German MT task of IWSLT 2015 (International Workshop on Spoken Language Translation) and compared results from what they call the “first four top-ranking systems”; that is, one NMT system and three phrase-based MT approaches.

Translate TED

The researchers sourced translation material from TED talks (transcripts translated from English into German), reasoning that the language used is structurally less complex, more conversational than formal, and requires “a lower amount of rephrasing and reordering.”

As to why English and German, the researchers said using the two languages would be interesting because, despite belonging to the same language family, “they have marked differences in levels of inflection, morphological variation, and word order, especially long-range reordering of verbs.”

“The outcomes of the analysis confirm that NMT has significantly pushed ahead the state of the art”—Bentivogli, Cettolo, Federico, Bisazza

And it is in this aspect of better word reordering, particularly in the case of proper verb placement, that NMT shines. To quote, “one of the major strengths of the NMT approach is its ability to place German words in the right position even when this requires considerable reordering.”

Those Misplaced German Verbs

In contrast, the study indicated that “verbs are by far the most often misplaced word category in all PBMT systems,” which the researchers described as a common problem affecting standard phrase-based statistical MT.

In summary, the study’s analysis confirmed that NMT reduced overall post-editing effort by 26% compared with PBMT output. In addition, NMT produced 70% fewer verb placement errors, 50% fewer word order errors, 19% fewer morphological errors, and 17% fewer lexical errors.
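
Figures like these read as relative reductions: the drop in errors is expressed as a share of the PBMT baseline. As a rough illustration of that arithmetic only, here is a minimal Python sketch. The function name `relative_reduction` and the error counts are hypothetical, chosen to make the calculation easy to follow; they are not taken from the study.

```python
def relative_reduction(baseline: float, new: float) -> float:
    """Return the percentage by which `new` is lower than `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return 100.0 * (baseline - new) / baseline


# Hypothetical error counts, used only to illustrate the arithmetic.
# They are NOT figures reported by the researchers.
pbmt_verb_errors = 200
nmt_verb_errors = 60

reduction = relative_reduction(pbmt_verb_errors, nmt_verb_errors)
print(f"{reduction:.0f}% fewer errors than the baseline")
```

Read this way, a 70% reduction means the newer system makes fewer than a third as many of those errors as the baseline, not that it makes none.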

“Machine translation is definitely not a solved problem”—Bentivogli, Cettolo, Federico, Bisazza

However, although NMT outperformed the PBMT systems on all sentence lengths, its performance degraded faster than that of its competitors as input sentences grew longer. The researchers singled this out as an area for future work on improving NMT.

The researchers’ sense of excitement is palpable when they write “machine translation is definitely not a solved problem, but the time is finally ripe to tackle its most intricate aspects.”