Is Google’s New Lingvo Framework a Big Deal for Machine Translation?

March 8, 2019

by Gino Diño

Researchers in neural machine translation (NMT) and natural language processing (NLP) may want to keep an eye on a new framework from Google. The Google AI team recently open-sourced Lingvo, “a framework for building neural networks in TensorFlow,” according to its GitHub page.

Lingvo is specifically tailored toward sequence models and NLP, which includes speech recognition, language understanding, MT, and speech translation. The Google AI team claims there are already “dozens” of research papers in these areas based on Lingvo. In fact, they said this was one reason they decided to open-source the project: to support the research community and encourage reproducible results.

Lingvo supports multiple neural network architectures, from recurrent neural nets to Transformer models, and comes with extensive documentation on common implementations across different tasks (e.g., NLP, NMT, speech synthesis).

The framework also provides a centralized location for configurations used by anyone who wants to try it out, which is meant to make documentation easier and results more readily reproducible.

So, just how big a deal is it?

“A good tool” and “very welcome,” but…

Diego Bartolome, Director of Artificial Intelligence at TransPerfect, said, “Any effort in the community to share results and allow researchers and developers to easily train models and compare results with other state-of-the-art techniques is very welcome.”

Jean Senellart, CEO of Systran and one of the earliest proponents of NMT technology, said Lingvo “is an impressive compilation of high-quality code,” noting the more than 50,000 lines of code.

Guillaume Klein, Research Engineer at Systran and architect of the open source OpenNMT project started by Harvard and Systran in 2016, commented that it also included “production-oriented components, which is rare in this type of open source project.”

Rohit Gupta, Senior MT Scientist at Iconic Translation Machines, said Lingvo is simply “a good tool.” He noted that because it is built on TensorFlow, anyone already using the library will likely find Lingvo useful.

That seems all well and good, but is Lingvo actually more conducive to research, and what makes it better or worse than other frameworks?

Gupta said TensorFlow is certainly very popular, “but there are other deep learning libraries that are used extensively such as PyTorch (from Facebook), which runs on the Fairseq framework, and MXNet, which is used by AWS.” He said researchers will typically work and build upon existing implementations that are then used as a reference, meaning “if Lingvo catches on, it could be helpful given the way certain elements are bundled.”

Senellart pointed out that most of the papers in Lingvo’s publication list are from Google researchers themselves. He compared that list to over 500 papers with state-of-the-art results using or quoting OpenNMT, the current and most popular implementations of which are managed by Systran and Ubiqus, who joined the project in 2017. Looking at the number of issues or pull requests on Lingvo’s GitHub page, Senellart said there is no major adoption yet.

“However, it is impressive that some are production papers or projects, such as GNMT — the Google Translate engines — which also demonstrate once more that, for NMT, there are no more secret implementations,” he noted.

Senellart listed some of the pros of Lingvo: the strong team (Google AI team) behind it, its “production-ready code,” its stability, and its flexibility in terms of the tasks it can handle. As for its cons, he said the “code might be a bit overwhelming for the research community [and] being generic makes it less simple to enter into for any specific task.”

He added that “by nature of the team, it is very unlikely that the authors will actively support [the] community if it reaches very large adoption. Successful open source projects need to spend a lot of energy in community support.”

Meanwhile, Klein said that “packing many reusable components and recipes in a single repository is great for research productivity.” He added that Lingvo is “very Google-centric by design (leverages tensor processing units or TPUs, clusters, etc.)” and that “most outsiders don’t have access to such infrastructure.” He noted that in contrast, OpenNMT is agnostic to technology providers.

Bartolome echoed the same sentiment, noting that “integrating new code or technology (even if open source) requires effort, and companies should evaluate their return on the investment before doing so.”

Just another horse?

So is Lingvo, ultimately, just another horse in the race for Google? Bartolome thinks it is par for the course for tech companies: “It seems that making it open source is a way to check if the community adopts it, and this is what all tech giants are doing now. They realize that having the winning framework is important, so I often see these announcements as marketing for researchers.”

At this time, Bartolome said it is uncertain whether Lingvo actually provides researchers with an advantage. “I haven’t seen a lot of mentions in my community, so it is starting slow indeed, differently to [for example] BERT (Bidirectional Encoder Representations from Transformers). It is a way for Google to try to add more frameworks on top of TensorFlow to make it more valuable,” he said.

John Tinsley, CEO and Co-founder of Iconic and Gupta’s colleague, said it would be natural for Google to want one of their platforms to become the standard. “Such open sourcing is a trend across the big tech companies and there’s a lot of discussion as to the motivations,” Tinsley said, adding, “Suffice to say, if people are innovating on top of their platforms, it serves a broader goal of staying at the forefront of technology, and drives more users to their products.”

Gupta said some factors “might preclude uptake in the near term,” such as Lingvo not incorporating implementations of state-of-the-art approaches like BERT and XLM (Cross-lingual Language Model).

For Systran’s Senellart, the key point is whether Lingvo can reach critical mass in terms of adoption. “I don’t believe it will become de facto standard just by the fact that other frameworks, such as PyTorch, are as popular as TensorFlow and more simple to enter into,” he said. Additionally, he pointed out two other reasons why Google will not benefit much from Lingvo becoming an industry standard.

“As a production-oriented framework, I don’t think Google will open [Lingvo] massively to third-party contributors; also, as for all the deep learning code today, the code itself is not very important — all the good ideas are described in papers and are re-implemented as needed,” Senellart said.

Klein added that a similar framework already exists as part of the TensorFlow ecosystem: Tensor2Tensor. He said Lingvo is a “great approach to standardize training practices in a project or team,” but that, in itself, will not be much help in making it an industry standard. He also noted that an upcoming challenge for Lingvo would be to migrate its codebase to TensorFlow 2.0.

“Don’t expect Lingvo to have a great impact”

What sort of impact will Lingvo have on the industry?

According to Gupta, “That’s difficult to say. Generally, researchers will build upon whatever they’re already comfortable in. They’d also rather not be bound by the limitations or restrictions of a particular platform. So that lack of flexibility in higher-level libraries like Lingvo might slow uptake.”

For Tinsley, at least Lingvo does provide the advantage of easier replication of experiments. “There are implementations packaged with set parameters […] always a challenge when researchers are using different tools,” he said.

Tinsley concluded, “We’re not sure it’s going to have a massively transformative impact, but it could be very useful. It remains to be seen. We’re looking forward to getting our hands dirty with it!”

Klein said researchers look for one of three things: a small, easy-to-transform codebase; an easy-to-use and easy-to-configure toolkit; or a ready-to-use, state-of-the-art model. Lingvo falls under the last category.

Senellart added that there is no unique feature within Lingvo that would directly encourage researchers to switch from a framework in which they have already invested. So unless a very good reason comes along — such as TPUs becoming far simpler to access — “it is unlikely that existing researchers or users of a given framework will switch to another.”

Klein pointed out, “The impact would have been bigger if this project was released, say, in early 2017. However, releasing code of a previously published paper is always welcomed by the community and there are certainly things to learn from it.”

For his part, Bartolome simply said, “I don’t expect Lingvo to have a great impact on the industry, at least [not] in the short term.”
