Is Transcription the Canary in the Translation Gold Mine?

January 20, 2017

by Marion Marking

Transcription and written translation share a number of similarities (both convert language input in one form or another into text) and have their differences (transcription typically involves only one language). From a supply-chain point of view, the two activities are similar in that both allow work to be sent instantly across the globe.

Arguably, transcription is the less complex activity of the two. While there is such a thing as a factually perfect transcription (i.e., everything said was perfectly transcribed), a translation, by default, is an interpretation of the source content and, therefore, open to debate.

As progress in machine learning and artificial intelligence accelerates, one would expect its impact on the human workforce to be experienced first in transcription.

Exactly three weeks after Slator broke the story on Google’s “nearly indistinguishable” claim when they launched the new, neural-network-powered Google Translate, Microsoft claimed a “historic achievement” on their official blog: speech recognition technology that “recognizes the words in a conversation as well as a person does.”

The subject of the blog article was a paper published the day prior, on October 17, 2016, entitled “Achieving Human Parity in Conversational Speech Recognition” and authored by the Microsoft Speech & Dialogue research group. In it, Microsoft’s researchers said they had achieved a word error rate of 5.9%, an improvement from the 6.3% the team reported a month before. Said Microsoft Engineer and Chief Speech Scientist Xuedong Huang, “We’ve reached human parity. This is an historic [sic] achievement.”

The word error rate, or WER, is a common metric for evaluating speech recognition, as the BLEU score is for machine translation. The blog article went on to point out that a 5.9% error rate is “about equal to that of people who were asked to transcribe the same conversation.” The goal, of course, is to bring the error rate close to zero, which would mean 100% accuracy.
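To make the metric concrete, here is a minimal sketch of how a word error rate is commonly computed: the word-level edit distance (substitutions, deletions, and insertions) between a reference transcript and the system output, divided by the number of words in the reference. The function name, the lowercasing and whitespace tokenization, and the sample sentences are assumptions made for illustration; they are not taken from Microsoft's paper or from any vendor's tooling.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return (substitutions + deletions + insertions) / reference word count.

    Illustrative sketch: words are lowercased and split on whitespace, which
    is an assumption; real evaluations apply their own text normalization.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words (classic Levenshtein, one row at a time).
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing from output
                curr[j - 1] + 1,     # insertion: extra word in output
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one substituted word and one dropped word out of eight.
reference = "we have reached human parity in speech recognition"
hypothesis = "we have reach human parity speech recognition"
print(word_error_rate(reference, hypothesis))  # prints 0.25
```

On this definition, a WER of 5.9% means roughly six word-level errors for every 100 words in the reference transcript. Because inserted words also count as errors, WER can exceed 100%, so "accuracy" quoted as 1 minus WER is only an approximation.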

Just 10 days before Microsoft published its paper, PC World Senior Editor Mark Hachman had called Microsoft’s speech recognition the “weakness no one mentions” and said his Windows Speech Recognition test drive yielded a 6.4% word error rate, which was “pretty bad on paper.”

Hachman qualified that this was just the baseline and that, according to Microsoft employees, the company’s speech recognition can achieve 99% accuracy when properly trained. He nonetheless concluded that training speech within Windows is a lengthy process: it actually took 10 minutes, which Hachman said felt “like a lifetime.”

He went on to say that training speech is faster (“perhaps a minute or so”) for Nuance’s speech recognition software Dragon, which announced its own breakthrough back in August. According to Vlad Sejnoha, Chief Technology Officer at Nuance, Dragon’s deep neural nets can now “continuously learn from the user’s speech…and drive accuracy rates in some instances up to 24% higher.”

Dragon and similar voice recognition software are to transcriptionists what translation productivity tools (or CAT tools) are to translators: They do not replace transcriptionists, but make them more productive. To extend the analogy, the technology that operates on the original audio file to produce textual output would be the equivalent of machine translation.

What is the impact of productivity tools like Dragon on human transcriptionists and the overall transcription market? [Transcription is part of the broader document preparation services market, which is supposedly worth USD 5bn in the US alone. The fact is, Slator could not pin down a credible figure.] And is fully automated speech recognition technology about to replace human transcriptionists in the real world? Slator spoke to four professional transcriptionists for their take on the matter.

How to Train Your Dragon

One source we spoke to said there are still way too many variables for technology to be able to completely replace the entire human transcription process.

According to Belle Lapa, who founded Scriberspoint in 2012 after having worked for such companies as Lingo24 and S&P, “The way a speaker speaks, overlapping speakers, background noise and, perhaps, most importantly, the very variable nature of language itself” have yet to be factored in.

In terms of voice input to increase productivity (i.e., listen to the audio and repeat what you hear into a mic instead of typing), transcriptionists today still need to train their speech recognition tools for them to be useful, in much the same way translators train their translation productivity tools. Think adaptive machine translation and the need to personalize the system.

According to another professional transcriptionist to whom we spoke, using Dragon alone, untrained, yielded poor results. “It was simply unusable,” said the source, adding, “A human transcriptionist can achieve 80–90% accuracy with Dragon and, with the help of a real-time editor, can get it up to 93–96%.”

The source said that, to make a transcriptionist work faster, a tool has to reach 65–75% accuracy (“using our own in-house accuracy checking tool”). Anything below would be next to useless. “It just doesn’t make sense to keep deleting things if it’s that bad. It just slows up the typist,” our source pointed out.

The same source added that accuracy also depends on accents, naming as typical transcriptionist Waterloos audio from speakers with Indian, Portuguese, and French accents. “Even when using Nuance’s Dragon, we still need humans to train the tool for context, nuance, homophones, proper nouns, accents, etc.,” our source said.

Here Too, Still a Long Way

Our source agreed to speak to us on condition of strict anonymity as professional transcriptionists are bound by stringent non-disclosure agreements. The source did disclose, however, that transcriptionists generally do not see their job as a “real career.”

Explained our source: “It’s probably what I’d call a high-paying, labor-intensive job. We are usually there to keep our bills paid — while we work to go somewhere else. Most of the best transcriptionists in our department have moved on to other things not transcription-related. Those who stayed go on to management. There’s just no real career growth there. What else can you do, really? Type faster? Speed-read?”

Asked if they feel at all threatened by the prospect of being replaced by technology, our source said, “It’s not like we’d picket if the company suddenly announces a tool has replaced us. We’d hate losing a high-paying job, for sure. But, like I said, we don’t view transcription as a real career.”

Besides, said our source, technology still has a long way to go before achieving human-quality transcription — an on-the-ground assessment, despite The Economist saying that, “Thanks to deep learning, machines now nearly equal humans in transcription accuracy.”

Our source, a transcriptionist at a Fortune 500 company, said, “Anything purely done by tech is unusable. We did try, but the results were so bad. The best tools we had still needed human supervision and they are still being developed with the idea of having humans eventually edit the end result.”

For her part, Scriberspoint’s Belle Lapa said certain parts of a transcriptionist’s job have indeed been made easier by technology: “Foot pedals have replaced keyboard shortcuts and voice-recognition tools and software are able to capture speech and turn them into text, with varying degrees of accuracy.”

Belle is optimistic that speech-to-text technology can only get better with time. She said they now use “a very good dictionary,” glossary, and database tools, which help them in subjects like legal and medical where terms are highly specialized.

The Scriberspoint founder, who is based in the Philippines, benefitted greatly from what she described as “the last big boom in transcription” when the US required subtitles for the hearing impaired. Since then “the number of clients and their transcription needs have remained stable or moving upward, never downward.”

She regards India as their greatest competitor, “if only because of the cheaper rates they charge,” but said she is not worried, for now, about running out of clients.

Not so optimistic is “Carla” [last name withheld on request], who is also based in the Philippines and is now a content editor at a financial intelligence firm. Looking at the transcription industry from a distance, having left it “for a while now,” she would not call it “a hot market.”

She said, “Transcribing through typing is increasingly being replaced by voice transcription or captioning,” and expects technology to replace human transcription first in captioning, subtitling, news, and medical.

Carla added, however, that if it involves voice transcription with near real-time editing, then there will be reasonable opportunities in financial, legal, and medical.

As for competitor India, she said the country may offer cheaper labor and tech, but “the Philippines would be in the higher tier if you were to factor in English listening skills, acquired typing skills, and adaptability.”

Prices Dropping

Much less optimistic is Noriel Ramientas II, who recalled that when he started doing home-based transcription, part-time, in February 2012, “everything was good — the pay was good, projects kept coming in, it was all just peachy. Since March 2015 though that hasn’t been the case.”

Fewer projects came in, he said, and the pay had reached a ceiling of USD 35 per audio hour, compared to five years ago, when one could charge as much as USD 50 for the same audio length.

“I don’t mean to burst anyone’s bubble but I don’t think the transcription market — at least as far as home-based transcriptionists are concerned — is growing. That’s why I haven’t been doing it for more than a year now,” said Noriel.

He described other online jobs (blogging, virtual assistance, bookkeeping) as more readily available and higher paying, and the demand for transcriptionists as diminishing “as more people, most notably those from India, flock to online job marketplaces like Upwork and Guru.”

As for tech replacing humans, Noriel’s view was pretty simple. He said it is just a tool. “Tools never function on their own. Someone sentient must put a tool to use before it is deemed useful.”

The experts seem to agree, especially as far as speech recognition technology and other transcription tools go. Although Dragon founder James Baker once said large vocabulary speech recognition was a solvable problem within his lifetime, more recently, Gerald Friedland said, “We used to joke that, depending who [sic] you ask, speech recognition is either solved or impossible. The truth is somewhere in between.”

Friedland, Audio and Multimedia Research Director at the UC Berkeley-affiliated International Computer Science Institute, was quoted in an April 2016 Wired article titled “Why Our Crazy-Smart AI Still Sucks at Transcribing Speech.”

Similar technological forces drive transcription and translation. And the lessons that transcription holds for translation are not new: Go niche, specialize, and embrace technology to increase productivity.

By Marion Marking

Slator consultant and corporate communications professional who enjoys exploring Asian cities.
