Facebook Open Sources AI Framework That Now Powers 6 Billion Translations a Day

May 7, 2018

Technology ·

by Eden Estopace

Facebook is open sourcing a tool chain of machine learning (ML) and artificial intelligence (AI) tools that it uses to power many of its own products, including Translate, its open source project based on the company’s machine translation systems.

On the second day of F8, Facebook’s annual developer conference being held this week in San Jose, California, Chief Technology Officer Mike Schroepfer took the stage to explain the evolution of its AI tool chain and why it is crucial to share it with developers.

Facebook calls the tool chain “PyTorch 1.0.” It includes PyTorch, the open source deep learning framework Facebook pioneered around 15 months ago, the deep learning framework called Caffe2 launched two years ago, and finally, the Open Neural Network Exchange (ONNX).

Facebook’s PyTorch 1.0 Tool Chain

During F8, three presentations regarding NMT at Facebook were delivered by Engineering Managers Necip Fazil Ayan and Ves Stoyanov, and Research Scientist Juan Pino.

Pino explained that two years ago, Facebook experimented on ideas with PyTorch, and manually re-implemented any successful tests into Caffe2 for production. However, according to him, this process took a lot of time and did not scale.

So they developed ONNX, “an industry-wide effort led by Facebook, Amazon, and Microsoft,” Pino said.

“In our particular use-case,” he explained, “we leverage ONNX to essentially export a model from one framework to the other.” Pino said ONNX improved model deployment by becoming a middleman that made the process automatic instead of manual, speeding up re-implementation of PyTorch ideas into Caffe2 production environments.
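
The export-then-import pattern Pino describes can be illustrated with a toy, framework-neutral model description. This is only a sketch of the "middleman" idea: it does not use the ONNX API or file format, and every name in it (`export_model`, `run_exported`, the JSON graph layout) is hypothetical. A "research" model is serialized once into a neutral list of operations, and a separate "production" runtime executes that list without knowing how the model was originally written.

```python
import json

# "Research" side: a tiny model held as plain Python data
# (a hypothetical stand-in for a model prototyped in one framework).
research_model = {"weights": [0.5, -1.0, 2.0], "bias": 0.25}


def research_forward(model, x):
    """Reference implementation on the research side: relu(w . x + b)."""
    z = sum(a * w for a, w in zip(x, model["weights"])) + model["bias"]
    return max(0.0, z)


def export_model(model):
    """Serialize the model as a neutral graph of named operations."""
    graph = {
        "inputs": ["x"],
        "nodes": [
            {"op": "Dot", "input": "x", "output": "h",
             "attrs": {"weights": model["weights"]}},
            {"op": "Add", "input": "h", "output": "z",
             "attrs": {"value": model["bias"]}},
            {"op": "Relu", "input": "z", "output": "y", "attrs": {}},
        ],
        "outputs": ["y"],
    }
    return json.dumps(graph)


# "Production" side: a separate runtime that only understands the neutral format.
OPS = {
    "Dot": lambda v, attrs: sum(a * w for a, w in zip(v, attrs["weights"])),
    "Add": lambda v, attrs: v + attrs["value"],
    "Relu": lambda v, attrs: max(0.0, v),
}


def run_exported(serialized, x):
    """Execute an exported graph node by node."""
    graph = json.loads(serialized)
    values = {graph["inputs"][0]: x}
    for node in graph["nodes"]:
        values[node["output"]] = OPS[node["op"]](values[node["input"]], node["attrs"])
    return values[graph["outputs"][0]]


serialized = export_model(research_model)
sample = [1.0, 2.0, 3.0]
print(research_forward(research_model, sample), run_exported(serialized, sample))
```

Both sides produce the same result for the same input, which is the property an automatic export step is meant to guarantee and a manual re-implementation has to re-verify each time.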

This same tool chain that Facebook will be open-sourcing—PyTorch, ONNX, and Caffe2—is currently in use across the company’s products, such as Facebook and Messenger, Instagram, and Workplace.

A post on Facebook’s developer blog reveals that PyTorch 1.0 will be available in beta within the next few months, and “will include a family of tools, libraries, pre-trained models, and datasets for each stage of development, enabling the community to quickly create and deploy new AI innovations at scale.”

NMT at Facebook

During the NMT keynote presentation, before Pino explained the relevance of the PyTorch 1.0 package, Ayan opened by updating the audience on Facebook’s multilingual challenges and the tools the company had previously developed to address them incrementally. These included Graph Search in multiple languages, the multilingual composer, and one of its most recent deployments: M Suggestions on Facebook Messenger, which automatically translates messages.

Ayan touched upon why NMT and natural language processing (NLP) are a paired effort at Facebook. He explained that the company’s AI work needs not only to translate between multiple languages but also to understand context, so it can perform anything from trivial functions like multilingual event suggestions to more serious ones like flagging posts for suicide prevention.

Pino then talked about NMT specifically. At the 2017 F8 conference, Facebook was running only 50% of its translations through NMT and doing two billion translations a day. Pino provided an update: the company is now 100% neural and handles 5.95 billion translations a day. He also discussed Facebook’s NMT model and shared the two major hurdles in deployment: low-resource languages and scale.
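
To put those volumes in perspective, here is a back-of-the-envelope calculation. It assumes the two billion figure for 2017 was the total daily volume (the presentation does not spell this out) and that traffic is spread evenly over the day, which real traffic is not.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Figures quoted at F8 2018.
translations_per_day_2018 = 5.95e9   # all neural
translations_per_day_2017 = 2.0e9    # assumption: total daily volume in 2017
nmt_share_2017 = 0.5                 # half of translations went through NMT

# Average load if traffic were perfectly uniform (a simplification).
per_second_2018 = translations_per_day_2018 / SECONDS_PER_DAY

# Daily NMT volume in 2017 under the assumption above, and the growth since.
nmt_per_day_2017 = nmt_share_2017 * translations_per_day_2017
nmt_growth = translations_per_day_2018 / nmt_per_day_2017

print(f"average translations per second: {per_second_2018:,.0f}")
print(f"growth in daily NMT volume: {nmt_growth:.2f}x")
```

Under these assumptions the average works out to roughly 69,000 translations per second, and the daily NMT volume has grown about sixfold in a year.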

MUSE for Low-Resource Languages

Finally, after Ayan and Pino, Stoyanov took to the stage to discuss Facebook’s work on multilingual understanding, or what he called “how we build natural language processing applications in a multilingual world.”

Stoyanov briefly explained their multilingual understanding projects, including research on word embeddings that then allows their NMT engines to undergo unsupervised learning, or training with only monolingual data.

“Now, amazingly, this whole process works,” Stoyanov said. “We can learn the correspondence between words without having any supervision; without even knowing a single word that’s in common between the two languages.”

Stoyanov also touched upon expanding their research from word embeddings to sentence embeddings, to capture universal similarities across languages in the way they structure sentences and use that learning for unsupervised training of NLP and MT models.

Facebook’s research on Multilingual Unsupervised and Supervised Embeddings (MUSE) is likewise being open-sourced. Stoyanov added: “we are also open sourcing our datasets for multilingual understanding.”
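
The core idea behind aligning word embeddings across languages can be sketched in a few lines. The toy below is a simplified illustration, not Facebook's implementation: it uses made-up 2-D vectors, a small seed dictionary (the supervised case, whereas the unsupervised setting Stoyanov describes has to induce such a dictionary without any known pairs), and a closed-form rotation fit in place of a general orthogonal mapping. All words, vectors, and function names are hypothetical.

```python
import math


def rotate(vec, theta):
    """Rotate a 2-D vector counter-clockwise by theta radians."""
    x, y = vec
    return (x * math.cos(theta) - y * math.sin(theta),
            x * math.sin(theta) + y * math.cos(theta))


def fit_rotation(pairs):
    """Least-squares rotation angle mapping source vectors onto target vectors."""
    dot = sum(sx * tx + sy * ty for (sx, sy), (tx, ty) in pairs)
    cross = sum(sx * ty - sy * tx for (sx, sy), (tx, ty) in pairs)
    return math.atan2(cross, dot)


def cosine(a, b):
    """Cosine similarity between two 2-D vectors."""
    return (a[0] * b[0] + a[1] * b[1]) / (math.hypot(*a) * math.hypot(*b))


def translate(word, src, tgt, theta):
    """Map a source word into the target space and return its nearest neighbour."""
    mapped = rotate(src[word], theta)
    return max(tgt, key=lambda w: cosine(mapped, tgt[w]))


# Made-up source-language embeddings.
src = {"cat": (1.0, 0.0), "dog": (0.8, 0.6), "house": (0.0, 1.0), "tree": (-0.6, 0.8)}

# Made-up target-language embeddings: the same geometry, rotated by 40 degrees.
names = {"cat": "gato", "dog": "perro", "house": "casa", "tree": "arbol"}
tgt = {names[w]: rotate(v, math.radians(40)) for w, v in src.items()}

# Seed dictionary of three known pairs; "tree" is held out.
seed = [(src[w], tgt[names[w]]) for w in ("cat", "dog", "house")]
theta = fit_rotation(seed)

print(round(math.degrees(theta), 3), translate("tree", src, tgt, theta))
```

The held-out word is translated correctly even though it never appeared in the seed dictionary, because the two embedding spaces share the same shape. Real embedding spaces are only approximately alike and have hundreds of dimensions, so the mapping is an approximation rather than an exact fit.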

Limited training data is a major hurdle for so-called low-resource languages, that is, language pairs with very little parallel text between them. This makes it difficult to build fluent NMT systems for such pairs, since neural engines typically require large parallel corpora to be effective.

Facebook CTO Schroepfer said their early work in MUSE among many areas of research is “promising work to bring all the tools and technologies they have to 6,000 languages all around the world.”

Since last year, the social media giant has been besieged by a mountain of problems concerning data privacy, hate speech, bullying, propaganda, and fake news on the platform, which led to CEO Mark Zuckerberg’s testimony on Capitol Hill before the US Senate.

Facebook executives at F8 explained the importance of NLP in keeping the social network clean and in automating many processes at scale.

With additional reports from Gino R. Diño

By Eden Estopace

IT journalist and Online Editor at Slator. Loves books, movies, and gadgets; writes for a living, but codes for fun.
