Google, Microsoft, Baidu Step Up NMT Research in Record August as BLEU Takes Heat

Neural machine translation (NMT) research seemed to slow down marginally in July 2018 after ballooning in the first half of the year, based on research papers submitted to Arxiv. August 2018, however, picked up the slack and surpassed May as the busiest month on record so far.

Research published on the Arxiv platform that mentioned NMT in the title or abstract reached a record 57 papers last month, slightly up compared to May’s 55. Of course, there is the caveat that some of the search results on Arxiv are false positives and others are updated resubmissions of papers that already appeared earlier. Taking these into account, 33 of all the search results were strictly about NMT and were new submissions.

Notably, many of the papers submitted to Arxiv in the past couple of months will also be presented at the Third Conference on Machine Translation (WMT 2018), to be held from October 31 to November 1, 2018 in Brussels, Belgium.

BLEU Under Fire Again

Bilingual evaluation understudy (BLEU) is the current method of evaluating NMT output, but that may soon change given how many researchers are advocating for a newer, better standard. In a paper Slator covered recently, Samuel Läubli and renowned researchers Dr. Rico Sennrich and Dr. Martin Volk found that the BLEU method is inadvertently becoming part of a bigger problem.

NMT output has become so fluent that BLEU along with current research community standards are no longer enough. It is time for document-level as opposed to sentence-level evaluation, they argued.
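To make the sentence-level limitation concrete, here is a minimal Python sketch of a simplified BLEU score for a single sentence pair. It is an illustration under stated assumptions, not the official metric: it uses one reference, whitespace tokenization, and no smoothing, and the function names are made up for this example. In practice BLEU is usually reported at the corpus level by pooling n-gram counts over many sentence pairs.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams of length n in the token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(hypothesis, reference, max_n=4):
    """Simplified single-reference BLEU for one sentence pair (no smoothing)."""
    hyp, ref = hypothesis.split(), reference.split()
    if not hyp:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        hyp_counts, ref_counts = ngrams(hyp, n), ngrams(ref, n)
        total = sum(hyp_counts.values())
        # Clip each hypothesis n-gram count by its count in the reference.
        matched = sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
        if total == 0 or matched == 0:
            return 0.0
        log_precisions.append(math.log(matched / total))
    # Brevity penalty: penalize hypotheses shorter than the reference.
    bp = 1.0 if len(hyp) > len(ref) else math.exp(1 - len(ref) / len(hyp))
    # Geometric mean of the n-gram precisions, scaled by the brevity penalty.
    return bp * math.exp(sum(log_precisions) / max_n)


if __name__ == "__main__":
    reference = "the cat sat on the mat"
    print(sentence_bleu("the cat sat on the mat", reference))
    print(sentence_bleu("the cat sat on the mat today", reference))
```

Because the score only counts n-gram overlap between one output sentence and its reference, problems that span several sentences, such as inconsistent terminology or broken references between sentences, do not affect it.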

The same conclusion was reached in another paper by Dr. Antonio Toral, Dr. Sheila Castilho, Ke Hu, and Dr. Andy Way, which Toral and Way graciously provided to Slator directly. Like the paper by Läubli, Sennrich, and Volk, this one identified flaws in the current standard for measuring output fluency that call for a change in how NMT research in particular is evaluated. The research community’s current evaluation standards can no longer accurately reflect the progress of NMT.

The limitations of BLEU for NMT research have been a sticking point since NMT research started to ramp up in the past couple of years, and many experts Slator talked to, including for the NMT Report 2018, are actively advocating, looking for, and suggesting alternatives.

Google, Microsoft, Baidu Very Active this August

While big names are known to contribute to research from time to time, last August, Google, Microsoft, and Baidu were noticeably active, at least in terms of new research papers submitted.

Google submitted six research papers in August 2018, most of them intent on digging deeper into how NMT processes or output can be improved. Google researchers introduced SentencePiece, a tool that tokenizes (and de-tokenizes) raw sentence input for NMT into subwords, which are much easier to handle for NMT engines. They also introduced what they called SwitchOut, a data augmentation algorithm that ultimately improves the NMT process while retaining quality.
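As a rough illustration of what reversible subword tokenization means, here is a toy Python sketch. It is not SentencePiece's actual algorithm or API: SentencePiece learns its vocabulary from data, whereas this example uses a tiny hand-written vocabulary and a greedy longest-match rule, both invented for illustration. The one detail borrowed from SentencePiece is the "▁" marker that stands in for whitespace, which is what makes de-tokenization lossless.

```python
SPACE = "\u2581"  # the "▁" character used to mark whitespace

# Hypothetical subword vocabulary, invented for this illustration.
VOCAB = {SPACE + "trans", "lat", "ion", SPACE + "is", SPACE + "fun"}


def tokenize(text, vocab=VOCAB):
    """Greedy longest-match segmentation; unknown characters become single pieces."""
    s = SPACE + text.replace(" ", SPACE)
    pieces, i = [], 0
    while i < len(s):
        # Try the longest candidate first and fall back to one character.
        for j in range(len(s), i, -1):
            if s[i:j] in vocab or j == i + 1:
                pieces.append(s[i:j])
                i = j
                break
    return pieces


def detokenize(pieces):
    """Concatenate pieces and turn whitespace markers back into spaces."""
    return "".join(pieces).replace(SPACE, " ")[1:]


if __name__ == "__main__":
    pieces = tokenize("translation is fun")
    print(pieces)
    print(detokenize(pieces))
```

The round trip works because the whitespace information is stored inside the pieces themselves, so no language-specific rules are needed to put the sentence back together.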

Google researchers also revisited character-based NMT and examined how the order of tokens generated by an NMT system affects its output. Other papers covered new tree-based decoders that add syntactic information to NMT models, and back-translation for low-resource languages (in a research paper in which Facebook was also involved).

Microsoft also went after deeper issues across the four papers it submitted. Its researchers used optimizers to prevent fine-tuning issues within the NMT model, resulting in a boost in processing speed, and studied the potential of reinforcement learning when applied to NMT. They also worked on style transfer and on improving NMT output by improving bi-directional translation.

Meanwhile, Chinese tech giant Baidu looked at the limitations of beam search, one of NMT’s components, as well as at adding multiple references during neural network training and how to generate pseudo-references to leverage this method. Baidu researchers submitted three papers to Arxiv in August 2018.
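For readers unfamiliar with beam search, the following Python sketch shows the basic idea on a toy example. It is a generic illustration, not the method from the Baidu papers: the "model" is a hypothetical hand-written table of next-token probabilities, and the function name and parameters are made up for this example. At each step the decoder keeps only the highest-scoring partial outputs, and the number it keeps is the beam size.

```python
import math

# Hypothetical toy "model": next-token probabilities given the prefix so far.
TOY_MODEL = {
    (): {"a": 0.6, "b": 0.4},
    ("a",): {"x": 0.5, "y": 0.5},
    ("b",): {"x": 0.9, "y": 0.1},
}


def beam_search(model, beam_size, length):
    """Keep the beam_size highest-scoring prefixes at every decoding step."""
    beams = [((), 0.0)]  # (prefix, cumulative log-probability)
    for _ in range(length):
        candidates = []
        for prefix, score in beams:
            # Extend every surviving prefix with every possible next token.
            for token, prob in model[prefix].items():
                candidates.append((prefix + (token,), score + math.log(prob)))
        # Prune: keep only the best beam_size candidates.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_size]
    return beams


if __name__ == "__main__":
    for size in (1, 2):
        prefix, score = beam_search(TOY_MODEL, size, 2)[0]
        print(size, prefix, round(math.exp(score), 2))
```

With a beam size of 1 (greedy decoding) the search commits to "a" at the first step and ends with a sequence of probability 0.3, while a beam size of 2 keeps "b" alive and finds the sequence ("b", "x") with probability 0.36. The search is approximate: what it finds depends on how many candidates it is allowed to keep.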

Notably, another Chinese giant was present that month: Alibaba, whose researchers used an improved model called the semi-autoregressive transformer (SAT) to attain a nearly sixfold boost in processing speed while retaining around 90% of the output quality.

Interestingly, the research directions of the papers submitted by these big tech brands reflected the topics of most of the papers on Arxiv last August. Researchers appeared keen on figuring out the inner workings of NMT models to improve output and processing speed, while others looked at low-resource languages and, as mentioned earlier, the need for document-level context in evaluation.
