Neural Machine Translation Research Output Ballooned in the First Half of 2018

Academia · by Gino Diño · August 8, 2018

2017 will be remembered as the year neural machine translation (NMT) went mainstream. That does not mean, however, that it is a “solved problem.” Far from it, as anyone fluent in two languages who has used even the most advanced online machine translation portals can attest.

However, hundreds if not thousands of researchers are working on the problem. Halfway through 2018, NMT research output had jumped 115% year on year by one measure. From January to June 2017, Slator found 91 research papers relating to NMT (with the keyword “neural machine translation” in either the title or the abstract) on Cornell University’s automated online research distribution system, arXiv.org. In the same period in 2018, that number skyrocketed to 196.
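The counting methodology described above can be sketched in a few lines: filter paper metadata for the keyword in the title or abstract. The sample records below are invented for illustration; a real count would fetch metadata from arXiv.

```python
# Sketch of the keyword-count methodology. The sample records are
# invented; a real run would query arXiv's metadata instead.
KEYWORD = "neural machine translation"

def count_nmt_papers(papers):
    """Count papers with the keyword in the title or abstract."""
    return sum(
        1 for p in papers
        if KEYWORD in p["title"].lower() or KEYWORD in p["abstract"].lower()
    )

sample = [
    {"title": "Advances in Neural Machine Translation",
     "abstract": "We improve transformer decoding."},
    {"title": "A Convolutional Model for Image Segmentation",
     "abstract": "No translation involved."},
    {"title": "Low-Resource NLP Benchmarks",
     "abstract": "We include neural machine translation baselines."},
]

print(count_nmt_papers(sample))  # 2
```

Note that the third sample paper matches on the abstract alone, which illustrates the false-positive problem the next paragraph describes.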

As we cautioned previously, there are some false positives and instances where NMT is just mentioned as an active research area or used for experimentation to test a hypothesis relevant to a bigger domain, such as natural language processing or even machine learning and deep learning in general.

There is also the issue of resubmissions, where the first version of a research paper published earlier is updated with new information or corrections. While not unique papers in their own right, these still count towards research activity in the domain.

Slight Slowdown

After a frantic spring 2018 with dozens of papers published by some of the world’s largest technology companies, submission activity has slowed somewhat in July 2018 compared to previous months.

Only 26 research papers were submitted in July, and only nine of them are directly related to NMT and are not updated versions of previous submissions.

More and more research papers are mentioning NMT as a benchmark for state-of-the-art neural network technology.

This is a good sign for NMT researchers, but it also means there is an increasing number of false positives when searching the arXiv database. In addition, as researchers update their papers, the number of resubmitted, updated versions of previous publications is also increasing.

Evolving Research Directions

Research topics have evolved since NMT hit the mainstream. Papers published on arXiv between November 1, 2017 and February 14, 2018 focused on a couple of main topics: improving output quality and addressing training data constraints (e.g., low-resource languages).

Looking at which companies were involved in which papers between February 15, 2018 and the end of April 2018, it appears the major players were pursuing separate research directions of their own.

The Facebook AI Research (FAIR) team, for instance, was busy tackling the problem of low-resource languages, a practical challenge for Facebook, which reached the two billion user mark in 2017 and requires 4.5 billion translations a day.

Amazon, meanwhile, was pursuing better operational efficiency, which makes sense as their offering is geared towards enterprise users of their cloud platform as well as LSPs, who could benefit from improved NMT processes and speed.

One paper Amazon worked on covered “constrained decoding,” a method that allows NMT to consistently translate specific words or terminology. The trade-off is that every term the NMT engine must translate in a specific way slows the entire system down slightly.
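To illustrate the idea, here is a toy version of constrained decoding. All names and the tiny score tables are invented, and real constrained decoding operates over beam search rather than greedy decoding; the point is only that the decoder checks its pending terminology constraints at every step, which is the per-word overhead described above.

```python
def constrained_greedy_decode(scores_per_step, constraints):
    """Toy constrained decoder. scores_per_step is a list of
    {token: score} dicts, one per decoding step; constraints is an
    ordered list of tokens that must appear in the output."""
    output, pending = [], list(constraints)
    for scores in scores_per_step:
        # Extra bookkeeping at every step: is the next required
        # term available now? This check is the slowdown.
        if pending and pending[0] in scores:
            output.append(pending.pop(0))  # force the required term
        else:
            output.append(max(scores, key=scores.get))  # greedy pick

    return output

steps = [
    {"Hallo": 0.9, "Hi": 0.1},
    {"Welt": 0.6, "World": 0.4},  # the model prefers "Welt" ...
]
# ... but the terminology constraint forces "World" instead.
print(constrained_greedy_decode(steps, ["World"]))  # ['Hallo', 'World']
```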

Google, on the other hand, seemed focused on improving NMT output, though the search giant has a finger in pretty much every pie, as usual. Google Brain researchers co-authored publications with Microsoft on low-resource languages, machine reading and question-answering, and unsupervised learning.

Google even came up with improved models that were essentially hybrids of existing NMT engines. According to Google, these hybrids outperformed state-of-the-art models, including its own in-production Google Translate Transformer model.

Still Emerging, Already Impacting the Industry

NMT is still emerging and research, spearheaded by academia and helped along by the corporate side, is proceeding at a healthy pace. Indeed, the first half of 2018 showed just how active the research community is, with May 2018 dethroning April as the busiest month for NMT.

The race for NMT also spills over into open source. Jean Senellart, Global CTO of Systran, remarked during SlatorCon London 2018 that over “the last two years there has been, every month, about two new open source projects for NMT.”

There is definitely a type of snowball effect in play. The technology offers such breadth and depth that even competing companies sometimes work on the same research together. “No company in the world can reproduce 250 papers just to check if they’re right or wrong,” Senellart said. “It is one of the reasons of the necessity of open source today.”

More and More Familiar Names

Come May, June, and July 2018, more familiar names found their way into arXiv research papers. The usual players like Google, Microsoft, and Amazon were of course present, as well as language industry names like Systran, Ubiqus, and SDL.

China was well represented with e-commerce giant Alibaba and internet company Tencent both publishing papers—even Sogou published a paper, though not specifically about NMT.

More recently, in July 2018, Tencent went straight to production with an experimental approach to detecting issues in NMT output without relying on reference translations. The BLEU (bilingual evaluation understudy) metric scores MT output by its similarity to reference translations, but it has recently come under fire as inadequate for evaluating NMT.
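For reference, BLEU itself can be implemented in a few lines. The sketch below is a minimal single-reference variant with crude zero-count smoothing; production work uses tools such as sacreBLEU, which handle tokenization and smoothing more carefully.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bleu(candidate, reference, max_n=4):
    """Minimal single-reference BLEU: geometric mean of clipped
    n-gram precisions, multiplied by a brevity penalty."""
    cand, ref = candidate.split(), reference.split()
    precisions = []
    for n in range(1, max_n + 1):
        cand_counts = Counter(ngrams(cand, n))
        ref_counts = Counter(ngrams(ref, n))
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        total = max(sum(cand_counts.values()), 1)
        precisions.append(max(overlap, 1e-9) / total)  # crude smoothing
    geo_mean = math.exp(sum(math.log(p) for p in precisions) / max_n)
    # Brevity penalty: punish candidates shorter than the reference.
    bp = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / max(len(cand), 1))
    return bp * geo_mean

ref = "the cat sat on the mat"
print(round(bleu("the cat sat on the mat", ref), 3))  # 1.0
```

The criticism mentioned above follows directly from the formula: BLEU only rewards surface n-gram overlap with the references, so a fluent NMT output phrased differently from the reference can score poorly.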

“Our experimental results show that our new approach could achieve high effectiveness on real-world datasets,” Tencent’s abstract read. “Our successful experience on deploying the proposed algorithms in both the development and production environments of WeChat, a messenger app with over one billion of monthly active users, helps eliminate numerous defects of our NMT model, monitor the effectiveness on real-world translation tasks, and collect in-house test cases, producing high industry impact.”

The race for better NMT output also accelerated, with research on better accuracy and adequacy, improved operational efficiency, and document-level context. Low-resource languages also emerged as a priority for many researchers, with teams in Japan (most prominently NICT and NAIST) and China picking up the pace.

In the business world, the impact of much higher-quality machine translation is increasingly being felt throughout the supply chain, and it is already affecting unit price expectations.

Download the Slator 2019 Neural Machine Translation Report for the latest insights on the state of the art in neural machine translation and its deployment.


By Gino Diño

Content strategy expert and Online Editor for Slator; father, husband, gamer, writer―not necessarily in that order.
