2018 was the busiest year yet for neural machine translation (NMT) research. With NMT becoming productized and integrated across the language industry and beyond, it has become the norm for big tech companies to publish their latest research findings.
One way to keep tabs on NMT’s relentless progress is to monitor research submitted to the Cornell University portal arXiv.org. Since Slator began tracking NMT publications on arXiv in late 2016, output has accelerated dramatically.
Submissions hit a record high in April 2018, only to climb again in May 2018 and again in August 2018, the busiest month so far. Based on Slator’s running count of NMT-focused or related research papers on arXiv, last updated March 15, 2019, research activity more than doubled in 2018 compared to 2017.
In 2017, there were a total of 199 papers submitted to arXiv. In 2018, there were 430. Research output has consistently increased since 2014, growing at close to a 180% compound annual growth rate over five years.
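For readers who want to check the arithmetic, the growth figures can be reproduced with a few lines of Python. This is a minimal sketch: the function name growth_rate is our own, and only the 2017 and 2018 paper counts come from the data above. The count for 2014 is not stated here, so the multi-year rate is not recomputed.

```python
def growth_rate(start: float, end: float, periods: int = 1) -> float:
    """Compound growth rate per period that takes `start` to `end`.

    With periods=1 this is plain year-over-year growth; with more
    periods it is the compound annual growth rate (CAGR).
    """
    if start <= 0 or periods <= 0:
        raise ValueError("start and periods must be positive")
    return (end / start) ** (1 / periods) - 1


# Paper counts quoted in the article.
papers_2017 = 199
papers_2018 = 430

yoy = growth_rate(papers_2017, papers_2018)
print(f"2017 to 2018 growth: {yoy:.0%}")  # prints "2017 to 2018 growth: 116%"
```

Growth of about 116% is consistent with output having "more than doubled." To compute a CAGR, pass the first and last yearly counts and the number of annual steps between them as periods.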
Direction of Research Trends
A number of research directions have been a staple since NMT began to appear on arXiv. Research on improving NMT output and the processes used by NMT systems, for example, is almost always present. Some research directions only recently gained steam, such as low-resource languages, that is, languages for which little training data is available “in the wild.”
During the last quarter of 2018, the top research directions had to do with figuring out the inner workings of NMT models, new or improved models, and low-resource languages.
Some researchers looked into identifying important neurons within NMT models and how to control them, while others sought to find out how current methodologies like tokenization affect NMT output.
As for new or improved NMT models, there have been experiments to add so-called twin-gated recurrent networks, replace modules in Google’s popular Transformer architecture, and develop new high-quality personalized models (research from Lilt).
Chinese tech giant Tencent researched two new approaches to NMT during the last quarter of 2018: a system that incorporates adequacy-oriented learning and a model called DTMT (Deep Transition Machine Translation).
Now that NMT has become better at tackling low-resource languages, more researchers have turned their attention to this challenge. NAIST Japan, for instance, experimented with augmenting incomplete training data with multiple language sources.
Chinese e-commerce giant Alibaba, on the other hand, approached low-resource NMT differently. They trained an NMT system with image descriptions in multiple languages, hypothesizing that “[descriptions] of the same visual content by different languages should be approximately similar.” Microsoft, meanwhile, examined how much transfer learning can help alleviate low-resource situations.
Meanwhile, research activity slowed noticeably toward December and has yet to recover. The same pattern was observed in previous years. In December 2018, research output was around a fifth of what it was in November 2018, and activity remained subdued through January and February 2019.
This slowdown is also likely influenced by the schedules of major conferences at which researchers present their papers. These include the Conference on Empirical Methods in Natural Language Processing (EMNLP) 2018, which took place from October 31 to November 4, 2018, and Applied Machine Learning Days 2019, which was held from January 26 to 29, 2019.
Corporates Going All in on NMT
2018 also saw a spike in corporate (as opposed to academic) involvement in NMT research, as companies continue not only to engage in R&D but also to push into areas like customizable translation and even to serve clientele whose use cases would usually fit LSPs.
During the record month of August 2018 (when nearly 60 research papers were submitted to arXiv), Google, Microsoft, and Baidu, among other notable companies, were obviously very busy.
From January 1, 2018 through March 15, 2019, the companies below were involved in research papers submitted to arXiv. Note that not all papers these companies were involved in had to do strictly with NMT, although all were related to the technology.
As a caveat, arXiv is an open platform and Slator’s data set is limited to a certain category and certain keywords. This means some false positives and re-submissions will affect the statistics, and the research papers accessible on arXiv will depend on the date the platform was accessed.
Finally, one of the biggest companies on our list made news just a few days ago. On March 19, 2019, Google published a new research paper on their latest progress in zero-shot translation. “Zero-shot” translation refers to an NMT model translating between language pairs it was not explicitly trained on.
According to the paper, entitled “The Missing Ingredient in Zero-Shot Neural Machine Translation,” researchers at Google AI have made the output of zero-shot NMT as good as that of pivot-based translation. In pivoting, an NMT engine translates through an intermediate (pivot) language for which training data is available, to make up for sparse training data between the source and target languages.