logo image
  • News
    • People Moves
    • Deal Wins
    • Demand Drivers
    • M&A and Funding
    • Financial Results
    • Technology
    • Academia
    • Industry News
    • Features
    • Machine Translation
    • — Divider —
    • Slator Pro
    • — Divider —
    • Press Releases
    • Sponsored Content
  • Data & Research
    • Research Reports & Pro Guides
    • Language Industry Investor Map
    • Real-Time Charts of Listed LSPs
    • Language Service Provider Index
  • Podcasts & Videos
  • Events
    • Design Thinking – February 2021
    • — Divider —
    • SlatorCon Coverage
    • Other Events
  • Directory
  • RFP Center
  • Jobs
MENU
  • News
    • People Moves
    • Deal Wins
    • Demand Drivers
    • M&A and Funding
    • Financial Results
    • Technology
    • Academia
    • Industry News
    • Features
    • Machine Translation
    • — Divider —
    • Slator Pro
    • — Divider —
    • Press Releases
    • Sponsored Content
  • Data & Research
    • Research Reports & Pro Guides
    • Language Industry Investor Map
    • Real-Time Charts of Listed LSPs
    • Language Service Provider Index
  • Podcasts & Videos
  • Events
    • Design Thinking – February 2021
    • — Divider —
    • SlatorCon Coverage
    • Other Events
  • Directory
  • RFP Center
  • Jobs

Advertise on Slator! Download the 2021 Online Media Kit Now

  • Slator Market Intelligence
  • Slator Advertising Services
  • Slator Advisory
  • Login
Search
Generic filters
Exact matches only
Advertisement
TikTok Parent Company ByteDance Open Sources Neural Speech Translation Toolkit

3 weeks ago

January 7, 2021

TikTok Parent Company ByteDance Open Sources Neural Speech Translation Toolkit

Machine Translation ·

by Seyma Albarino

On January 7, 2021

3 weeks ago
Machine Translation ·

by Seyma Albarino

On January 7, 2021

TikTok Parent Company ByteDance Open Sources Neural Speech Translation Toolkit

Chinese technology company ByteDance, known for its short-form, video-sharing app TikTok, has released an open-source toolkit for neural speech translation.

According to a December 2020 paper published on pre-print server arXiv.org, “NeurST aims at facilitating the speech translation research for NLP researchers and provides a complete setup for speech translation benchmarks, including feature extraction, data preprocessing, distributed training, and evaluation.”

One of the end goals, of course, is to extend the toolkit’s use to “advanced speech translation research and products.” NeurST is currently publicly available on GitHub.

Advertisement

Recent work by the three authors — Chengqi Zhao, Mingxuan Wang, and Lei Li, all of ByteDance — includes PRUNE-TUNE, a new method of domain adaptation for machine translation (MT) domain adaptation, and multi-resolutional (MR) Doc2Doc, which the researchers used to train a neural sequence-to-sequence MT model for document-level translation.

Pro Guide Sales and Marketing for Language Service Provider and Translation and Localization Companies (Product)

Pro Guide: Sales and Marketing for Language Service Providers

Data and Research, Slator reports
36 pages. How LSPs generate leads, hire and compensate Sales staff, succeed in Digital Marketing, and benchmark against rivals.
$260 BUY NOW

As the paper explained, one of the shortcomings of traditional “cascade” speech translation systems is that mistakes in transcription — typically powered by automatic speech recognition — can cause errors in translation. End-to-end speech translation, on the other hand, bypasses the transcription step and produces less lag time.

The authors noted that studies on speech translation work on different datasets; their goal, therefore, was to establish reproducible and reliable benchmarks for the field. They said that NeurST’s “straightforward recipes for preprocessing audio datasets” will free up developers for more advanced work on speech translation.

Slator 2021 Data-for-AI Market Report

Slator 2021 Data-for-AI Market Report

Data and Research, Slator reports
44-pages on how LSPs enter and scale in AI Data-as-a-service. Market overview, AI use cases, platforms, case studies, sales insights.
$380 BUY NOW

NeurST was put to the test on several benchmark speech translation tasks for eight European language pairs using publicly available speech translation data (namely the Augmented LibriSpeech and MuST-C corpora).

Overall, NeurST outperformed existing counterparts Espnet-ST and fairseq-ST in most languages. The authors hope the toolkit, which is meant to be NLP researcher-friendly, will be used to establish baselines in future studies.

TAGS

arXivByteDanceChengqi ZhaoChinaLei LiMingxuan WangNeurSTTikTok
SHARE
Seyma Albarino

By Seyma Albarino

Staff Writer at Slator. Linguist, music blogger and reader of all things dystopian. Based in Chicago after adventures on three continents.

Advertisement

SUBSCRIBE TO THE SLATOR WEEKLY

Language Industry Intelligence
In Your Inbox. Every Friday

SUBSCRIBE

SlatorSweepSlatorPro
ResearchRFP CENTER

PUBLISH

PRESS RELEASEDIRECTORY LISTING
JOB ADEVENT LISTING

Bespoke advisory including speaking, briefings and M&A

SLATOR ADVISORY
Advertisement

Featured Reports

See all
Slator 2020 Language Industry M&A and Funding Report

Slator 2020 Language Industry M&A and Funding Report

by Slator

Slator 2021 Data-for-AI Market Report

Slator 2021 Data-for-AI Market Report

by Slator

Slator 2020 Medtech Translation and Localization Report

Slator 2020 Medtech Translation and Localization Report

by Slator

Pro Guide: Sales and Marketing for Language Service Providers

Pro Guide: Sales and Marketing for Language Service Providers

by Slator

Press Releases

See all
iDISC Awarded ISO 27001 Information Security Management Certification

iDISC Awarded ISO 27001 Information Security Management Certification

by iDISC

XTRF Launches a Bi-Monthly Free Networking Event for Localization Professionals

XTRF Launches a Bi-Monthly Free Networking Event for Localization Professionals

by XTRF

150 Million Words Translated: the German EU Council Presidency Translator Sets New Records

150 Million Words Translated: the German EU Council Presidency Translator Sets New Records

by Tilde

Upcoming Events

See All
  1. Memsource MT Post-Editing Pricing Models Webinar

    Pricing Models for MT Post-Editing Workshop

    by Memsource

    · February 3

    Hear a panel of innovative localization professionals share different approaches for MT post-editing pricing.

    More info FREE

Featured Companies

See all
Text United

Text United

Memsource

Memsource

Wordbank

Wordbank

Protranslating

Protranslating

Seprotec

Seprotec

Versacom

Versacom

SDL

SDL

Smartling

Smartling

Lingotek

Lingotek

XTM International

XTM International

Smartcat

Smartcat

Translators without Borders

Translators without Borders

STAR Group

STAR Group

memoQ Translation Technologies

memoQ Translation Technologies

Advertisement

Popular articles

Why Netflix Shut Down Its Translation Portal Hermes

Why Netflix Shut Down Its Translation Portal Hermes

by Esther Bond

The Slator 2020 Language Service Provider Index

The Slator 2020 Language Service Provider Index

by Slator

Top Language Industry Quotes of 2020

Top Language Industry Quotes of 2020

by Monica Jamieson

SlatorPod: The Weekly Language Industry Podcast

connect with us

footer logo

Slator makes business sense of the language services and technology market.

Our Company

  • Support
  • About us
  • Terms & Conditions
  • Privacy Policy

Subscribe to the Slator Weekly

Language Industry Intelligence
In Your Inbox. Every Friday

© 2021 Slator. All rights reserved.

Sign up to the Slator Weekly

Join over 13,000 subscribers and get the latest language industry intelligence every Friday

Your information will never be shared with third parties. No Spam.