Facebook Open Sources AI Framework That Now Powers 6 Billion Translations a Day

May 7, 2018

Technology ·

by Eden Estopace

Facebook is open sourcing a tool chain of machine learning (ML) and artificial intelligence (AI) tools that it uses to power many of its own products, including Translate, its open source project based on the company’s machine translation systems.

On the second day of F8, Facebook’s annual developer conference being held this week in San Jose, California, Chief Technology Officer Mike Schroepfer took the stage to explain the evolution of the company’s AI tool chain and why it is crucial to share it with developers.

Facebook calls the tool chain “PyTorch 1.0.” It includes PyTorch, the open source deep learning framework Facebook pioneered around 15 months ago, the deep learning framework called Caffe2 launched two years ago, and finally, the Open Neural Network Exchange (ONNX).

Facebook’s PyTorch 1.0 Tool Chain

During F8, three presentations regarding NMT at Facebook were delivered by Engineering Managers Necip Fazil Ayan and Ves Stoyanov, and Research Scientist Juan Pino.

Pino explained that two years ago, Facebook prototyped ideas in PyTorch and then manually re-implemented successful experiments in Caffe2 for production. According to him, this process was time-consuming and did not scale.

So they developed ONNX, “an industry-wide effort led by Facebook, Amazon, and Microsoft,” Pino said.

“In our particular use-case,” he explained, “we leverage ONNX to essentially export a model from one framework to the other.” Pino said ONNX improved model deployment by acting as an intermediary between the two frameworks: the hand-off became automatic instead of manual, which sped up the move of ideas prototyped in PyTorch into Caffe2 production environments.
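The export-and-reload idea Pino describes can be illustrated with a toy sketch. This is not ONNX's actual format or API: the graph layout, the op names, and the functions `export_model` and `run_model` are hypothetical, chosen only to show how a framework-neutral description lets one side write a model and another side execute it without manual re-implementation.

```python
import json

def export_model(weights, bias):
    """'Research' side (hypothetical): serialize a tiny linear + ReLU model
    into a framework-neutral graph description."""
    graph = {
        "nodes": [
            {"op": "MatMul", "inputs": ["x", "W"], "output": "h"},
            {"op": "Add", "inputs": ["h", "b"], "output": "z"},
            {"op": "Relu", "inputs": ["z"], "output": "y"},
        ],
        "initializers": {"W": weights, "b": bias},
        "outputs": ["y"],
    }
    return json.dumps(graph)

def _matmul(x, w):
    # x is a vector of length n, w is an n-by-m matrix (list of rows).
    return [sum(xi * row[j] for xi, row in zip(x, w)) for j in range(len(w[0]))]

# 'Production' side (hypothetical): its own implementation of each op.
OPS = {
    "MatMul": _matmul,
    "Add": lambda a, b: [ai + bi for ai, bi in zip(a, b)],
    "Relu": lambda a: [max(0.0, ai) for ai in a],
}

def run_model(serialized, x):
    """Execute the exported graph node by node, knowing nothing about
    the code that produced it."""
    graph = json.loads(serialized)
    values = dict(graph["initializers"])
    values["x"] = x
    for node in graph["nodes"]:
        args = [values[name] for name in node["inputs"]]
        values[node["output"]] = OPS[node["op"]](*args)
    return values[graph["outputs"][0]]

serialized = export_model(weights=[[1.0, -1.0], [0.5, 2.0]], bias=[0.0, -1.0])
result = run_model(serialized, [2.0, 1.0])
print(result)
```

The only contract between the two sides is the serialized graph, which is the role the article attributes to ONNX between PyTorch and Caffe2.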

This same tool chain that Facebook will be open-sourcing (PyTorch, ONNX, and Caffe2) is currently in use across the company’s products, such as Facebook, Messenger, Instagram, and Workplace.

A post on Facebook’s developer blog reveals that PyTorch 1.0 will be available in beta within the next few months, and “will include a family of tools, libraries, pre-trained models, and datasets for each stage of development, enabling the community to quickly create and deploy new AI innovations at scale.”

NMT at Facebook

During the NMT keynote presentation, before Pino explained the relevance of the PyTorch 1.0 package, Ayan opened by updating the audience on Facebook’s multilingual challenges and the tools it had previously developed to address them incrementally. These included Graph Search in multiple languages, a multilingual composer, and one of the most recent deployments: M Suggestions on Facebook Messenger, which automatically translates messages.

Ayan touched upon why NMT and natural language processing (NLP) are a paired effort at Facebook. He explained that the company’s AI work needs not only to translate between multiple languages but also to understand context, so that it can handle anything from trivial functions like multilingual event suggestions to more serious ones like flagging posts for suicide prevention.

Pino then talked about NMT specifically. At 2017’s F8 conference, Facebook was running only 50% of its translations through NMT and doing two billion translations a day. Pino provided an update: translation is now 100% neural, and the systems handle 5.95 billion translations a day. He also discussed Facebook’s NMT model and shared the two major hurdles when it comes to deployment: low-resource languages and scale.
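To put those figures in perspective, here is a back-of-the-envelope calculation. It uses only the numbers quoted above and assumes, purely for illustration, that the load is spread evenly across the day; real traffic would have peaks well above this average.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Figures quoted at F8 (translations per day and share handled by NMT).
total_2017 = 2.0e9
nmt_share_2017 = 0.5
total_2018 = 5.95e9
nmt_share_2018 = 1.0

# Neural translations per day in each year.
nmt_2017 = total_2017 * nmt_share_2017
nmt_2018 = total_2018 * nmt_share_2018

# Average throughput, assuming an even spread over 24 hours.
avg_per_second = total_2018 / SECONDS_PER_DAY

total_growth = total_2018 / total_2017
nmt_growth = nmt_2018 / nmt_2017

print(f"average translations per second: {avg_per_second:,.0f}")
print(f"growth in total daily translations: {total_growth:.3f}x")
print(f"growth in neural daily translations: {nmt_growth:.2f}x")
```

On these assumptions, total volume roughly tripled in a year, while the volume served by neural models grew nearly sixfold, because the neural share doubled at the same time.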

MUSE for Low-Resource Languages

Finally, after Ayan and Pino, Stoyanov took to the stage to discuss Facebook’s work on multilingual understanding, or what he called “how we build natural language processing applications in a multilingual world.”

Stoyanov briefly explained their multilingual understanding projects, including research on word embeddings that then allows their NMT engines to undergo unsupervised learning, or training with only monolingual data.

“Now, amazingly, this whole process works,” Stoyanov said. “We can learn the correspondence between words without having any supervision; without even knowing a single word that’s in common between the two languages.”

Stoyanov also touched upon expanding their research from word embeddings to sentence embeddings, to capture universal similarities across languages in the way they structure sentences and use that learning for unsupervised training of NLP and MT models.

Facebook’s research on Multilingual Unsupervised and Supervised Embeddings (MUSE) is likewise being open-sourced. Stoyanov added: “we are also open sourcing our datasets for multilingual understanding.”
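The idea of mapping one language's word embeddings onto another's can be sketched in a few lines. The example below is a toy, supervised illustration only: it assumes 2-D embeddings, made-up vectors, and a two-word seed dictionary, and it fits a single rotation (the 2-D case of orthogonal Procrustes alignment). The unsupervised setting Stoyanov describes would have to induce that dictionary automatically, which this sketch does not attempt. All names and values are hypothetical.

```python
import math

def rotate(vec, theta):
    """Rotate a 2-D vector counter-clockwise by theta radians."""
    x, y = vec
    c, s = math.cos(theta), math.sin(theta)
    return (x * c - y * s, x * s + y * c)

def fit_rotation(pairs):
    """Closed-form 2-D Procrustes: the rotation angle that best maps each
    source vector onto its paired target vector in the least-squares sense."""
    dot = sum(sx * tx + sy * ty for (sx, sy), (tx, ty) in pairs)
    cross = sum(sx * ty - sy * tx for (sx, sy), (tx, ty) in pairs)
    return math.atan2(cross, dot)

def cosine(a, b):
    """Cosine similarity between two 2-D vectors."""
    return (a[0] * b[0] + a[1] * b[1]) / (math.hypot(*a) * math.hypot(*b))

# Made-up English embeddings.
english = {
    "cat": (1.0, 0.2),
    "dog": (0.9, 0.5),
    "house": (-0.3, 1.0),
    "car": (-0.8, -0.4),
}

# Toy assumption: the Spanish space is the English space rotated by 40 degrees.
true_theta = math.radians(40)
pairs_gold = {"cat": "gato", "dog": "perro", "house": "casa", "car": "coche"}
spanish = {es: rotate(english[en], true_theta) for en, es in pairs_gold.items()}

# Learn the mapping from a small seed dictionary only.
seed = [("cat", "gato"), ("house", "casa")]
theta = fit_rotation([(english[en], spanish[es]) for en, es in seed])

def translate(word):
    """Map an English word into the Spanish space and return its nearest neighbour."""
    mapped = rotate(english[word], theta)
    return max(spanish, key=lambda es: cosine(mapped, spanish[es]))

print(translate("dog"), translate("car"))
```

Words outside the seed dictionary ("dog", "car") are translated correctly here only because the toy spaces differ by an exact rotation; real embedding spaces are only approximately related in this way.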

Training data limitations are a major hurdle for so-called low-resource languages, that is, language pairs with very little parallel data between them. This makes it difficult to build fluent NMT systems for such pairs, since neural engines typically require large amounts of parallel text to be effective.

Facebook CTO Schroepfer said their early work in MUSE among many areas of research is “promising work to bring all the tools and technologies they have to 6,000 languages all around the world.”

Since last year, the social media giant has been besieged by problems concerning data privacy, hate speech, bullying, propaganda, and fake news on the platform, which led to CEO Mark Zuckerberg’s testimony on Capitol Hill before the US Senate.

Facebook executives at F8 explained the importance of NLP in keeping the social network clean and in automating many processes at scale.

With additional reports from Gino R. Diño
