What Makes a Good Post-Editor: Research Examines Activity Patterns of over 300 Linguists

August 19, 2019

by Esther Bond

After one researcher recently asked (and answered) the question of whether post-editing can influence target language content, two other researchers have tackled the matter of predicting post-editing time by understanding the styles of post-editing.

António Góis, Research Scientist at Unbabel, and André Martins, Unbabel’s Head of Research, authored a paper entitled “Translator2Vec: Understanding and Representing Human Post-Editors” that was published on arXiv on July 24, 2019 by the European Association for Machine Translation.

The paper points to prior research on the effectiveness of post-editors, looking at a number of topics, among them: the relationship between pauses and cognitive effort, the use of novice versus professional post-editors for research purposes, and the impact of post-editor behavior such as planning ahead and mouse vis-à-vis keyboard use on overall performance.

The purpose of the Unbabel paper was to build on such work and find out whether it is possible to identify a specific post-editor based on their actions, whether meaningful representations of post-editors could be built that would allow researchers to draw useful conclusions, and, ultimately, whether these representations could prove useful in predicting the time needed to post-edit a document.

The researchers started from the premise that “the combination of machines and humans for translation is effective,” citing previous studies showing that humans are more productive when post-editing machine translation than when translating from scratch.

Moreover, they hypothesized, gaining an understanding of how humans perform the task of post-editing and which methods are most effective can help make the human-machine interaction in post-editing even more successful.

In practical terms, “understanding how human post-editors work could open the door to the design of better interfaces, smarter allocation of human translators to content, and automatic post-editing,” they posited.

Identifying “Good” Post-Editors

The study relied on a dataset of more than 66,000 source documents and involved more than 300 post-editors working from English into French and German. The source documents for translation were customer service email messages sent to Unbabel’s translation service. According to the researchers, the dataset was “the largest of the kind released to date” and “the only one we are aware of with document-level information.”

The researchers looked at common post-editing operations such as inserting, deleting, and replacing a word or block of words and also took into account keystrokes, mouse actions, and waiting times. From the way these operations, or “action sequences,” were carried out by individuals, they hoped to identify specific post-editors — and do so more reliably than they would by simply comparing machine translated text with post-edited text.
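To make the idea of an action sequence concrete, here is a minimal sketch in Python. It is not the researchers' code or data format: the `Action` class, the action names, and the sample session are all hypothetical, chosen only to show how a log of edits and pauses might be recorded and summarized.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    """One step in a post-editing session (hypothetical schema)."""
    kind: str                # e.g. "insert", "delete", "replace", "mouse", "wait"
    duration_s: float = 0.0  # only meaningful for "wait" actions in this sketch


def summarize(actions):
    """Count each action kind and total the time spent waiting."""
    counts = Counter(a.kind for a in actions)
    wait = sum(a.duration_s for a in actions if a.kind == "wait")
    return counts, wait


# Invented session: a pause, three edits, a shorter pause, one more edit.
session = [
    Action("wait", 4.0),
    Action("replace"),
    Action("insert"),
    Action("wait", 1.5),
    Action("delete"),
    Action("insert"),
]

counts, wait = summarize(session)
print(dict(counts), wait)
```

A summary like this discards the order of the actions. The study worked with the sequences themselves, so the sketch only illustrates what kind of raw material is involved.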

Although identification of post-editors was an important part of the study, researchers were “not interested in the problem of editor identification per se, but only as a means to obtain good representations.”

For them, “good representations” were those that managed to group similar post-editors together in clusters. By interpreting these clusters, researchers wanted to “understand which activity patterns characterize ‘good’ editors” in terms of translation quality and speed.
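As a rough illustration of what grouping similar post-editors could look like, the sketch below compares editor vectors by cosine similarity. The three vectors and the editor names are invented for the example, and cosine similarity is a stand-in here: the article does not say how the researchers measured closeness between representations.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def nearest(name, vectors):
    """Return the other editor whose vector is most similar to `name`'s."""
    others = (other for other in vectors if other != name)
    return max(others, key=lambda other: cosine(vectors[name], vectors[other]))


# Hypothetical editor representations; the real ones are learned by a model.
editors = {
    "editor_a": [0.9, 0.1, 0.2],
    "editor_b": [0.8, 0.2, 0.1],
    "editor_c": [0.1, 0.9, 0.7],
}

print(nearest("editor_a", editors))
```

With these made-up numbers, editor_a and editor_b point in nearly the same direction and would land in the same cluster, while editor_c sits apart.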

Untapped Source of Information

The key findings of the study were threefold: first, “that action sequences can be used to perform accurate editor identification”; second, “that they can be used to learn human post-editor vector representations that cluster together similar editors”; third and crucially, “editor representations can be very effective for predicting human post-editing time.”

Slator contacted André Martins, co-author of the paper, for additional comment on the research. Martins explained that being able to predict how long someone will take to post-edit content can be useful for matching linguists to a particular text type. Moreover, according to Martins, “it may also be used to inform customers about how long we expect a document to be translated.”

Related to predicting editing time is the quality aspect. Martins said, “We are currently looking at ways to use this information for human translation quality estimation (i.e., predicting how good a translation is before sending it to the customer). This will allow us to detect eventual translation mistakes and re-assign the task to another human translator.”

Understanding post-editing strategies also makes it possible to “design our interfaces to better promote those behaviors,” Martins added. He said one behavioral insight that surfaced during the study was that “human post-editors who spend longer times reading before starting to type, tend to type fast and to always edit left to right. By contrast, those who type immediately tend to spend some time jumping back and forth.”

Overall, the results demonstrate that the process of post-editing contains “precious information unavailable in the initial plus final translated document,” the authors wrote. They concluded that the post-editing process “is a rich and untapped source of information,” and it is the researchers’ hope that “the dataset we release can foster further research in this area.”

Post-editing productivity goes to the heart of Unbabel’s business and operational model, of course. The company is one of the language industry’s best-funded startups and, in the spring of 2019, hired key researchers from Amazon Translate.

By Esther Bond

Research Director at Slator. Localization enthusiast, linguist and inquisitor. London native.
