In Research, Neural Steamrolls Statistical Machine Translation

Neural machine translation (NMT) is the language industry’s 2016 buzzword No. 1. Some say it is groundbreaking; others say it is overrated. But one thing is certain: research into NMT has taken off over the last 12 months.

Since the first paper with the words “neural machine translation” in the title appeared on the Cornell University site arXiv (pronounced “archive”) in 2014, a total of 81 have been published to date. Papers with “statistical machine translation” (SMT) in the title numbered a meager 33 over the same period.


More telling still, 63 of those NMT-titled papers were filed in 2016 alone, compared to just 11 for SMT. So, at least as far as research output is concerned, NMT has won the battle, if not the war.

The Cornell repository is an automated online distribution system for research papers (so-called e-prints). It offers an alternative to traditional peer-reviewed journals or online platforms like Frontiers by being a “pure dissemination system,” as Paul Ginsparg describes it. arXiv is the brainchild of Ginsparg, a Harvard and Cornell physicist.

In 2015, arXiv received an average of 8,773 submissions per month, totalling 105,280 research papers at year-end. The site topped 10,000 submissions for the month of October 2016, a first since it launched in 1991.

According to peer-reviewed journal publisher Frontiers, 2.5 million studies are published in one of the 30,000 scholarly journals each year. As far as microcosms go, arXiv may be a better gauge, with over 1.2 million papers published, compared to Frontiers, which has published only 30,000 studies since it launched in 2007.

The most published NMT author on arXiv is trailblazer Kyunghyun Cho (14 citations for NMT-titled papers), followed by Yoshua Bengio (9), a Université de Montréal professor who recently launched a deep learning incubator. We featured Cho, an assistant professor at New York University, in our story on simultaneous machine translation. At least one other scientist calls Cho an NMT pioneer and his 2014 paper a milestone in NMT research.

The most cited authors for SMT-titled papers are Krzysztof Wołk (4) and Krzysztof Marasek (3), who have often worked together. Their most recent joint study was published in Polish back in March 2016 and has to do with parallel data extraction from comparable corpora to enhance multi-domain machine translation on, say, Wikipedia and

arXiv provides instant pre-review dissemination…a breadth far beyond the capacity of any one journal—Paul Ginsparg, Cornell University

A recent submission on machine translation, published December 12, 2016, is interesting because it is, to borrow the words of the authors, “a novel scheme to combine neural machine translation (NMT) with traditional statistical machine translation (SMT).” The study is attributed to scientists from the University of Cambridge Department of Engineering and SDL Research.

Another paper, published in October 2016, was the Oxford master’s thesis of Pinterest software engineer Paul Baltescu. (According to his LinkedIn profile, Baltescu’s master’s at Oxford coincided with his internships at Twitter and Quora.) In it, Baltescu investigates “alternatives for the two components which prevent standard translation systems from working on mobile devices due to high memory usage.”

Baltescu explains that when he replaced the components with proposed alternatives, he was able to come up with “a scalable translation system that can work on a device with limited memory.”

Yet another study, published in November 2016, bears the names of Huawei and Tsinghua University, Beijing. Its authors point out that NMT “suffers from a major drawback”: frequently inadequate translations. Their proposed framework “alleviates” NMT’s tendency to repeat the translation of some source words while wrongly ignoring others.

The authors say experiments show their approach “significantly improves the adequacy of NMT output and achieves superior translation result over state-of-the-art NMT and statistical MT systems.”

Authors Zhaopeng Tu and Lifeng Shang are researchers at Noah’s Ark Lab, and Yang Liu is the Chinese tech giant’s Supply Chain Planner. Microsoft vet Xiaohua Liu is from Tsinghua, and fellow alumnus Hang Li has worked at Hulu, rising from intern to Senior Software Developer.

Among those who have authored papers published on the Cornell website are recognizable names from tech (such as Google’s Mike Schuster), the language industry (Systran CTO Jean Senellart), and academics Slator has featured before, such as Marcin Junczys-Dowmunt, Graham Neubig, Jason Lee, and Rico Sennrich (whose latest NMT paper was published on arXiv on December 14, 2016).

Marion Marking

Communications specialist, veteran journalist, and online editor at Slator who dreams of driving a Veyron on the Autobahn