A Warning About Data Annotation as a Business From a Man Who Sells AI to AI Firms

December 5, 2019

SlatorCon ·

by Seyma Albarino

“When it comes to delivering value, none of the algorithms will deliver value without data. Data trumps algorithms every time,” Acrolinx Founder Andrew Bredenkamp told the SlatorCon Amsterdam 2019 audience. Acrolinx is an AI-powered content management platform that uses natural language processing (NLP) technology on a large scale for clients across a range of industries.

Quoting the by-now widely accepted statement “Data is the new oil,” Bredenkamp explained that with data comes a massive data annotation industry. He said the data labeling market, valued at USD 0.5bn in 2018, will more than double in the next four years. An August 2019 article in The New York Times estimated that data annotation accounts for 80% of the time spent building AI technology.
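
To put the growth claim in perspective, the arithmetic can be sketched in a few lines. This is a back-of-the-envelope illustration rather than a figure from the talk: it assumes steady compound growth and treats "more than double in four years" as exactly doubling, so the result is only a lower bound. The function and variable names are hypothetical.

```python
def implied_annual_growth(multiple: float, years: int) -> float:
    """Constant yearly growth rate that turns 1 into `multiple` after `years` years."""
    return multiple ** (1 / years) - 1


market_2018_usd_bn = 0.5   # valuation cited in the talk
floor_multiple = 2.0       # "more than double", so 2x is a lower bound (assumption)
years = 4

rate = implied_annual_growth(floor_multiple, years)
floor_usd_bn = market_2018_usd_bn * floor_multiple

print(f"Implied growth of at least {rate:.1%} per year")
print(f"Market of at least USD {floor_usd_bn:.1f}bn after {years} years")
```

Under these assumptions, doubling in four years corresponds to compound growth of roughly 19% a year.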

Little wonder, then, that some language service providers (LSPs) are eager to grab a piece of the data annotation market, which companies like Appen have historically dominated and where well-funded startups like Scale are competing.

Bredenkamp cautioned that, in his view, companies providing such middleman services are starting to get squeezed: individual annotators have become more organized and have begun advertising their services in the freelance data market, and data requirements are actually falling.

“What the AI community is working really hard on is removing this (data annotation) bottleneck to building applications,” Bredenkamp said. “Obviously, if you build algorithms, the last thing you want to do is spend 80% of your time doing something else.”

Several solutions have already emerged as potential alternatives to data annotation.

In synthetic labeling, two AI systems interact with and test each other to artificially create and label content, a process that Bredenkamp said is starting to look similar to what humans do.

Facebook and other groups have used unsupervised approaches to building AI applications. One method used to build machine translation (MT) systems merges huge, independent language datasets, with no parallel language data, and lets the system observe how word vectors start to cluster. The system then “learns” how words behave in context.
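
The idea that words used in similar contexts end up with similar vectors can be shown with a toy sketch. This is not the unsupervised MT method mentioned in the talk; it is a minimal co-occurrence count over an invented six-sentence corpus, and the corpus and function names are assumptions made for illustration.

```python
from collections import Counter, defaultdict
from itertools import permutations
from math import sqrt

# Tiny invented corpus: no labels and no parallel translations, just raw text.
SENTENCES = [
    "the cat drinks milk",
    "the dog drinks water",
    "the cat eats fish",
    "the dog eats meat",
    "the car needs fuel",
    "the truck needs fuel",
]


def cooccurrence_vectors(sentences):
    """Represent each word by counts of the words it shares a sentence with."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.split()
        for word, neighbor in permutations(tokens, 2):
            vectors[word][neighbor] += 1
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


vectors = cooccurrence_vectors(SENTENCES)
print("cat vs dog:", round(cosine(vectors["cat"], vectors["dog"]), 2))
print("cat vs car:", round(cosine(vectors["cat"], vectors["car"]), 2))
```

In this sketch "cat" lands closer to "dog" than to "car" purely because of the contexts the words appear in. Production systems learn dense vectors from far larger corpora, but the clustering intuition is the same.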

Content robots have also been able to generate text in “simple” areas. Bredenkamp pointed out, however, that sentences may sound plausible when read individually, but the text generally lacks coherence as a whole.

Andrew Bredenkamp, Acrolinx

This shortcoming echoes another challenge in AI: the significant gap between merely seeing a correlation and tapping actual, real-world knowledge.

According to Bredenkamp, AI has struggled with disambiguation (i.e., interpreting linguistic ambiguity) for the past 50 years. Still, there have been recent gains: vector composition has helped AI learn word relationships that were out of reach three or four years ago, and big tech companies have sponsored initiatives supporting research on building knowledge for AI systems.
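
One common way to illustrate composing word vectors is analogy arithmetic, sketched below. The talk does not specify which technique Bredenkamp had in mind, so this is an assumption. The four-dimensional vectors are hand-made for the example rather than trained embeddings, and the `analogy` helper is hypothetical.

```python
from math import sqrt

# Hand-made toy vectors (an assumption for illustration, not trained embeddings).
# Dimensions: royalty, male, female, fruit.
VECTORS = {
    "king":  (1.0, 1.0, 0.0, 0.0),
    "queen": (1.0, 0.0, 1.0, 0.0),
    "man":   (0.0, 1.0, 0.0, 0.0),
    "woman": (0.0, 0.0, 1.0, 0.0),
    "apple": (0.0, 0.0, 0.0, 1.0),
}


def cosine(a, b):
    """Cosine similarity between two dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def analogy(a, b, c):
    """Answer 'a is to b as c is to ?' by composing vectors: b - a + c."""
    target = tuple(
        vb - va + vc for va, vb, vc in zip(VECTORS[a], VECTORS[b], VECTORS[c])
    )
    # Exclude the three input words, then pick the nearest remaining word.
    candidates = (word for word in VECTORS if word not in {a, b, c})
    return max(candidates, key=lambda word: cosine(VECTORS[word], target))


print(analogy("man", "king", "woman"))
```

With these toy vectors, "king" minus "man" plus "woman" lands exactly on "queen". Real embeddings only approximate such relationships.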

“The open question is: Will AI be able to learn knowledge like humans learn knowledge, by observation of data?” Bredenkamp asked the audience.

LSPs in an AI World

Commenting more specifically on the language services industry, Bredenkamp formulated a view on how LSPs may want to position themselves.

To gain a significant competitive advantage, Bredenkamp said, LSPs have to adopt a hyperlocal strategy and look beyond the small set of languages many of them have traditionally worked with.

An August 2019 KPMG report, for example, found that non-English use of the Internet is growing six times as fast as English use, and a November 2019 article by The Economist described how China is waking up to the commercial value of dialects.

Bredenkamp said, “Increasingly, brands are translating the user experience into those languages (e.g., dialects) to drive better market penetration in the regions, and these are not small regions. They’re targeting hundreds of millions of users.”

The Acrolinx founder said he also expects human translation to become rarer for well-resourced language pairs due to high-quality systems like DeepL and Google Translate. He said he has seen big organizations move away from human translation completely for certain types of content.

The hope then lies in transcreation, which is still far from being automated.

Bredenkamp, a self-described optimist, rejects the fear-mongering that often comes with predictions about AI, and even welcomes the possibility of AI surpassing the abilities of the world’s smartest people.

“Why should it stop there? We aren’t the limit,” Bredenkamp said. “I think this will be good for us. Let’s all get used to it. It will be a brave new world.”

By Seyma Albarino

Staff Writer at Slator. Linguist, music blogger and reader of all things dystopian. Based in Chicago after adventures on three continents.
