Machine Translates Literature and About 25% Was Flawless, Research Claims

January 19, 2018

Technology

by Gino Diño

A neural machine translation (NMT) system was trained on literary text, and between one sixth and one third of its translations were judged to be of equivalent quality to professional human translation, at least according to the native speakers whom the researchers asked to evaluate the machine’s output.

During a panel discussion at SlatorCon Zürich in December 2017, NMT expert Samuel Läubli was asked when he thought NMT systems would be fluent enough to handle stylistic problems like irony. Läubli, who that day presented three reasons why NMT was a breakthrough, declined to forecast a timeline.

Yet it seems that, by switching to neural networks, machine translation is getting closer to that level of fluency. Dr. Antonio Toral, Assistant Professor at the University of Groningen, and Prof. Andy Way, Professor in Computing and Deputy Director of the EU’s ADAPT Centre for Digital Content Technology, posted a research paper on arXiv.org.

Their research is titled “What Level of Quality can Neural Machine Translation Attain on Literary Text?”

The answer: NMT was “significantly better” than phrase-based statistical MT (PBSMT), and more importantly, “human evaluation… shows that between 17% and 34% of the translations… are perceived by native speakers of the target language to be of equivalent quality to translations produced by a professional human translator.”

Doing It over Again

Toral and Way did this before, in 2015, using only PBSMT and on a smaller scale in both training data and analysis of results. Now they have done it again with neural MT. They said the availability of online ebooks and their translations as training data encouraged them to pursue the research.

They trained PBSMT and NMT systems with over 100 million words of in-domain training data (they used parallel and monolingual data, as well as out-of-domain datasets). They then set the systems to the task of translating 12 well-known novels published from the 1920s to the present day. Specifically:

  1. Auster’s Sunset Park (2010)
  2. Collins’ Hunger Games #3 (2010)
  3. Golding’s Lord of the Flies (1954)
  4. Hemingway’s The Old Man and the Sea (1952)
  5. Highsmith’s Ripley Under Water (1991)
  6. Hosseini’s A Thousand Splendid Suns (2007)
  7. Joyce’s Ulysses (1922)
  8. Kerouac’s On the Road (1957)
  9. Orwell’s 1984 (1949)
  10. Rowling’s Harry Potter #7 (2007)
  11. Salinger’s The Catcher in the Rye (1951)
  12. Tolkien’s The Lord of the Rings #3 (1955)

They chose English to Catalan translation for two main reasons. First, the language pair is more challenging than the one in their 2015 study, in which they used PBSMT to translate from Spanish to Catalan.

Second, Catalan is a mid-size European language with a lot of available training data but, compared to major languages, much room for new translations of novels, should research show that NMT is useful in assisting literary translators.

Orwell Seems Harder than Salinger

Toral and Way used automated BLEU (Bilingual Evaluation Understudy) scoring to compare the results of the PBSMT and NMT translations, but also made use of blind human evaluation from two native Catalan speakers with advanced English skills and a background in linguistics.

In automated BLEU scoring, NMT consistently outperformed PBSMT with an overall “11% relative improvement.”
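A relative improvement expresses the gain as a fraction of the baseline score rather than as a difference in BLEU points. The short Python sketch below shows the arithmetic only; the two scores are hypothetical values chosen for illustration and are not figures from the paper, and the function name `relative_improvement` is likewise an assumption.

```python
def relative_improvement(baseline: float, new: float) -> float:
    """Return the gain of `new` over `baseline` as a fraction of `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline score must be positive")
    return (new - baseline) / baseline


# Hypothetical BLEU scores, chosen only to illustrate the arithmetic.
pbsmt_bleu = 30.0
nmt_bleu = 33.3

gain = relative_improvement(pbsmt_bleu, nmt_bleu)
print(f"Relative improvement: {gain:.0%}")  # prints "Relative improvement: 11%"
```

With these made-up numbers, a gain of 3.3 BLEU points over a baseline of 30.0 works out to a relative improvement of about 11%.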

For human evaluation, they assessed 10 passages of 10 contiguous sentences from three of the 12 novels translated: Orwell’s 1984, Rowling’s Harry Potter #7, and Salinger’s The Catcher in the Rye.

Their findings: “In all three books, the percentage of sentences where the annotators perceive the MT translation to be of equivalent quality to the human translation is considerably higher for NMT compared to PBSMT.”

“If NMT translations were to be used to assist a professional translator (e.g. by means of post-editing), then around one third of the sentences for Rowling’s and Salinger’s and one sixth for Orwell’s would not need any correction.”
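Shares such as "one third" or "one sixth" can be pictured as a simple tally over per-sentence judgments. The Python sketch below assumes a simplified scheme in which each machine-translated sentence receives one of two labels; the labels, the function name `share_equivalent`, and the sample judgments are all hypothetical and do not reproduce the study's annotation protocol or data.

```python
from collections import Counter

# Hypothetical labels an annotator might assign to each MT sentence when
# comparing it blind against the human translation.
EQUIVALENT = "equivalent"
WORSE = "worse"


def share_equivalent(judgments: list[str]) -> float:
    """Fraction of sentences judged equivalent in quality to the human translation."""
    if not judgments:
        raise ValueError("no judgments to tally")
    counts = Counter(judgments)
    return counts[EQUIVALENT] / len(judgments)


# Made-up judgments for 12 sentences; not data from the study.
sample = [EQUIVALENT, WORSE, WORSE, EQUIVALENT, WORSE, WORSE,
          WORSE, EQUIVALENT, WORSE, WORSE, EQUIVALENT, WORSE]

share = share_equivalent(sample)
print(f"{share:.0%} of sentences would need no correction")
```

In this made-up sample, four of the twelve sentences are labeled equivalent, which corresponds to the "one third" case described in the quote.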

Slator reached out to Prof. Andy Way regarding the research. Asked whether the same system would perform equally well in marketing or other “more literary” domains that language service providers are active in, Prof. Way said: “I worked in industry for three years building cutting-edge MT systems for a range of leading international companies, and the one area that I used to tell the sales team to avoid was marketing material, which requires more transcreation as a solution than translation per se. So I think this will remain an area where human translation/transcreation will continue to dominate.”

Asked if more and better in-domain training data can improve performance, Prof. Way was more optimistic: “I know of no examples where additional such data is not extremely useful, so yes, absolutely!”
