How Netflix Researchers Simplify Subtitles for Translation


As original productions of media entertainment content have come to a halt amid coronavirus lockdowns, streaming services have turned their attention to localizing back-catalog content into more languages.

With high levels of localization demand, even in times of lockdown, streaming providers such as Amazon Prime Video are increasingly active participants in the machine translation (MT) research space.

Streaming giant Netflix confirmed back in April 2019 that they had not yet rolled out MT for their subtitle operations, but said they were investigating the use of the technology. Investigating they are: In May 2020, a paper published by a group of computer scientists at Netflix explored how to improve MT quality for low-resource languages, with the intended use likely to be in subtitles and meta-descriptions.

The paper, entitled “Simplify-then-Translate: Automatic Preprocessing for Black-Box Translation,” was published on the preprint platform arXiv on May 22, 2020. The study is a collaboration between former Netflix Research Intern Sneha Mehta, former Engineering Manager Ballav Bihani, and current Netflix employees Bahareh Azarnoush (Data Science Manager), Boris Chen (Machine Learning Engineer), Vinith Misra (Artwork and Video Data Science Manager), Avneesh Saluja (Research Scientist), and Ritwik Kumar (Machine Learning Director).

Kumar’s LinkedIn profile provides a glimpse into wider MT-related research areas at Netflix, and lists a number of the team’s projects: deep learning for high-quality machine translations, predicting per-title language demand, and deep learning for text understanding such as customer complaint mining.

Azarnoush’s LinkedIn profile also outlines her mandate to “partner with localization experts to unleash the power of data to transcend language barriers and ensure the best local user experience at scale.” Her focus includes, among other things, “experimentation and causal inference to support localization decisions.”


Netflix’s Simplify-Then-Translate paper brings together two natural language processing (NLP) disciplines: sentence simplification and machine translation.

Sentence simplification is nothing new. As the paper points out, it was originally explored in the 1990s as a way to improve machine translation. The idea was that simpler source sentences lead to more fluent translations and “reduce technical post-editing effort.”

Netflix’s method relies on this premise and also leverages the notion that translated content is fundamentally simpler than original source content. By extension, they argued, back-translations are simpler than the original source sentences and can be used to build a simplification model. This is what is novel about Netflix’s approach.

First, Netflix took content previously translated by humans (reference translations) and back-translated it into the original source language using MT; in this case, English. From there, the researchers used the simpler, back-translated sentences to build a simplification model for English sentences.

The simplification model — called an automatic pre-processing model or APP — would then be applied to any English source content prior to the machine translation step, to improve the resulting output.
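The two stages described above can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the function names, the `Translator` and `Simplifier` interfaces, and the toy sentences are all assumptions made for the example. In particular, a simple lookup table stands in for the trained simplification model, and a small dictionary stands in for the black-box MT system.

```python
from typing import Callable, Dict, Iterable, List, Tuple

# Hypothetical interface for any black-box MT system:
# translate(text, source_lang, target_lang) -> translated text.
Translator = Callable[[str, str, str], str]
# Hypothetical interface for a trained simplification model (the APP).
Simplifier = Callable[[str], str]


def build_simplification_pairs(
    aligned: Iterable[Tuple[str, str]],
    reference_lang: str,
    translate: Translator,
    source_lang: str = "en",
) -> List[Tuple[str, str]]:
    """Pair each source sentence with the back-translation of its human reference."""
    pairs = []
    for source_sentence, reference_translation in aligned:
        # Back-translate the human reference into the source language.
        back_translation = translate(reference_translation, reference_lang, source_lang)
        pairs.append((source_sentence, back_translation))
    return pairs


def simplify_then_translate(
    sentence: str,
    target_lang: str,
    simplify: Simplifier,
    translate: Translator,
    source_lang: str = "en",
) -> str:
    """Apply the simplifier first, then send the result to the MT system."""
    return translate(simplify(sentence), source_lang, target_lang)


# Toy stand-ins (illustrative data only) so the sketch runs without an MT service.
_TOY_MT: Dict[Tuple[str, str, str], str] = {
    ("Il pleut beaucoup.", "fr", "en"): "It is raining a lot.",
    ("It is raining a lot.", "en", "hu"): "Nagyon esik az eső.",
}


def toy_translate(text: str, source_lang: str, target_lang: str) -> str:
    return _TOY_MT[(text, source_lang, target_lang)]


pairs = build_simplification_pairs(
    [("It's raining cats and dogs.", "Il pleut beaucoup.")], "fr", toy_translate
)
_lookup = dict(pairs)


def toy_simplify(sentence: str) -> str:
    # A lookup table stands in for a trained model; unknown input passes through.
    return _lookup.get(sentence, sentence)


print(simplify_then_translate("It's raining cats and dogs.", "hu", toy_simplify, toy_translate))
```

In a real setting, the pairs returned by `build_simplification_pairs` would serve as training data for a sequence-to-sequence simplifier, and `translate` would wrap whichever MT service is in use.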

Netflix’s flagship APP for English, the figsAPP, is built specifically to tackle tricky content such as idioms by replacing such expressions with a simplified alternative. Given that they focus on “conversational language as used in dialogues of TV shows, [which] tends to be colloquial and idiomatic,” Netflix judged that it was important to use reference translations from this domain.

Suitably, Netflix used entertainment content in high-resource languages to build the figsAPP, employing French, Italian, German, and Spanish (FIGS) reference translations for a number of titles including “How to Get Away with Murder,” “Star Trek: Deep Space Nine,” and “Full Metal Alchemist.”

Testing Low-Resource Languages

To conduct their experiments, Netflix used a “black-box” machine translation system, Google Translate. To compare the figsAPP against an APP built from an out-of-domain simplification dataset, Netflix machine-translated the simplified content into seven low-resource languages: Hungarian, Ukrainian, Czech, Romanian, Bulgarian, Hindi, and Malay.

Source content that had been simplified with the figsAPP resulted in better quality translations in all seven languages, compared to translations resulting from non-simplified, original source content. Source content pre-processed with the out-of-domain APP performed significantly worse than the original, confirming Netflix’s hypothesis that using domain-specific content improves the performance of the APP.

Netflix also looked at the Translation Edit Rate (TER), and found that using figsAPP-treated source content improved edit distance by between 1.3% and 7.3% for the seven languages tested. This is “intuitive,” Netflix said, “because the APP simplification brings the sentences closer to their literal human translation.”
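To give a sense of what an edit-rate metric measures, here is a simplified sketch. It counts word-level insertions, deletions, and substitutions between an MT output and a human reference, then divides by the reference length. Full TER also counts shifts of word blocks as single edits, which this sketch deliberately omits, so the function name `simplified_ter` and its scores are illustrative assumptions rather than the metric as computed in the paper.

```python
from typing import List


def word_edit_distance(hyp: List[str], ref: List[str]) -> int:
    """Word-level Levenshtein distance (insertions, deletions, substitutions)."""
    prev = list(range(len(ref) + 1))
    for i, hyp_word in enumerate(hyp, start=1):
        curr = [i]
        for j, ref_word in enumerate(ref, start=1):
            cost = 0 if hyp_word == ref_word else 1
            curr.append(min(
                prev[j] + 1,         # drop a word from the hypothesis
                curr[j - 1] + 1,     # add a missing reference word
                prev[j - 1] + cost,  # keep or substitute a word
            ))
        prev = curr
    return prev[-1]


def simplified_ter(hypothesis: str, reference: str) -> float:
    """Edits needed to turn the hypothesis into the reference, per reference word.

    Unlike full TER, block shifts are not modeled here.
    """
    ref_tokens = reference.split()
    if not ref_tokens:
        raise ValueError("reference must contain at least one token")
    return word_edit_distance(hypothesis.split(), ref_tokens) / len(ref_tokens)


# One missing word against a six-word reference: 1 edit / 6 words.
print(simplified_ter("the cat sat on mat", "the cat sat on the mat"))
```

A lower score means fewer edits are needed to reach the human reference, which is why an MT output closer to a literal human translation scores better.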

The researchers also had human evaluators assess the quality of a sample of the translations resulting from figsAPP-treated source content for five of the seven low-resource languages. Here, too, Netflix found that, at least for three languages, figsAPP-treatment resulted in improved translation output.

Although English source content is Netflix’s primary focus for the purposes of the research, APPs can also be built for any language for which enough corresponding reference translations exist.