This Is How Automatic Speech Recognition & Machine Translation Are Revolutionizing Subtitling

Subtitling has relied on template-based workflows for more than two decades. The use of templates (a.k.a. subtitle files) has been called “one of the greatest innovations in the subtitling industry at the turn of the century.”

Today, two cornerstone technologies are making an impact on the subtitling market and, once more, changing the way workflows are run: automatic speech recognition (ASR) and machine translation (MT).

Historically, a subtitling task has comprised two steps: transcription and translation. In the transcription step, a template in the language of the original audio is created. If the original audio is in English, this requires only a time-coded transcription of the dialogue. If the audio is in a language other than English (LOTE), a typical approach is to transcribe the LOTE audio and then translate it into an English subtitle file.

The English template is then translated into all the target languages required by the project, while adhering to space and reading-speed parameters. However, recent advancements in ASR and MT have transformed these once time-consuming steps of transcription and translation into more efficient post-editing tasks.
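
To make the "space and reading-speed parameters" concrete, here is a minimal sketch of how a single subtitle could be checked against them. The limits (42 characters per line, 17 characters per second) and the `Subtitle` and `check_subtitle` names are assumptions chosen for illustration; they do not come from AppTek or from any particular style guide, and real projects set their own values.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical limits for illustration only; each project defines its own.
MAX_CHARS_PER_LINE = 42
MAX_CHARS_PER_SECOND = 17.0


@dataclass
class Subtitle:
    start: float  # display start time, in seconds
    end: float    # display end time, in seconds
    text: str     # subtitle text, lines separated by "\n"


def check_subtitle(sub: Subtitle) -> List[str]:
    """Return a list of constraint violations (empty if the subtitle passes)."""
    problems = []
    lines = sub.text.split("\n")

    # Space constraint: no line may exceed the per-line character limit.
    for line in lines:
        if len(line) > MAX_CHARS_PER_LINE:
            problems.append(f"line too long ({len(line)} chars)")

    # Reading-speed constraint: characters shown per second of display time.
    duration = sub.end - sub.start
    chars = sum(len(line) for line in lines)
    if duration <= 0:
        problems.append("non-positive duration")
    elif chars / duration > MAX_CHARS_PER_SECOND:
        problems.append(f"reading speed too high ({chars / duration:.1f} cps)")

    return problems
```

A translator or post-editor working from a template would typically see such violations flagged inside the subtitle editor rather than run a script, but the underlying checks are of this kind.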

Moreover, training an ASR and/or MT engine on subtitling data can ensure better-quality subtitles. This is what ASR and MT technology provider AppTek specializes in.

According to AppTek Managing Director, Dr Volker Steinbiss, “Our AI engines have been trained specifically on subtitling data from massive libraries of transcribed and translated media subtitle files as well as other language data. In fact, AppTek is one of the few providers of both ASR and MT technology built ground-up inside the company’s own platform, whose engines have been vetted by the world’s leading language service providers for the media vertical.”

And, of course, as more companies look to deploy customizable ASR and MT technologies into their workflows, privacy of client data remains a top concern.

Asked how their AI systems are protected in terms of security when training on existing video content libraries, AppTek CTO Steve Cook explained, “In addition to our ASR and MT engines, which can be deployed in secure on-premise environments, we provide the ability for the end-user to customize their models using their own media libraries — which remain protected on-site and keep the ensuing models private to them.”

Commercial Impact of ASR on Subtitling Market

Research has shown that using ASR results in subtitling productivity gains, and that commercially available ASR tools are, in general, both cost-effective and secure. “Even the tool that performed most poorly still saved some time. And the one that performed better saved an impressive 46 minutes,” wrote Mara Campbell, ATA-certified linguist and founder of Latin American subtitle provider True Subtitles.

Campbell also noted the robust security and competitive pricing of ASR tools available in the market. Since her study was published in 2019, ASR has continued to advance.

As proof of ASR’s commercial impact, one company that has partnered with AppTek is Super Agency TransPerfect, ranked #1 on the Slator 2021 LSPI. TransPerfect recently deployed AppTek’s ASR technology to accelerate subtitling workflows by creating a first-pass subtitle file to work from, either from a provided script or directly from audio.

How Machine Translation Was Specialized for Subtitles

At the world’s biggest conference on Machine Translation, WMT, back in 2019, AppTek shared its research on customizing MT for subtitling with the scientific community, highlighting two new developments, as follows:

  • Intelligent Line Segmentation (ILS) – an AppTek tool based on what researchers described at the time as “a novel subtitle segmentation algorithm that predicts the end of a subtitle line given the previous word-level context using a recurrent neural network learned from human segmentation decisions.” Prior to ILS, all other approaches to segmentation had been very manual.
  • Subtitle Edit Rate – this metric accurately captures the edit distance in post-edited subtitles, including line-break errors, something previous metrics did not account for.
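
The idea of an edit distance that also counts line-break errors can be sketched as follows. This is not the metric AppTek published; it is a plain word-level Levenshtein distance in which each line break is treated as one more token, with the `<br>` marker and all function names chosen here purely for illustration.

```python
from typing import List

BREAK = "<br>"  # hypothetical token standing in for a line break


def tokenize(subtitle: str) -> List[str]:
    """Split a subtitle into words, inserting BREAK between its lines."""
    tokens: List[str] = []
    for i, line in enumerate(subtitle.split("\n")):
        if i > 0:
            tokens.append(BREAK)
        tokens.extend(line.split())
    return tokens


def edit_distance(hyp: List[str], ref: List[str]) -> int:
    """Levenshtein distance over tokens (insert, delete, substitute)."""
    prev = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, start=1):
        curr = [i]
        for j, r in enumerate(ref, start=1):
            cost = 0 if h == r else 1
            curr.append(min(prev[j] + 1,          # delete h
                            curr[j - 1] + 1,      # insert r
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def edit_rate(mt: str, post_edited: str) -> float:
    """Edits needed to turn the MT output into the post-edited subtitle,
    normalized by the length of the post-edited token sequence."""
    ref = tokenize(post_edited)
    if not ref:
        raise ValueError("post-edited subtitle is empty")
    return edit_distance(tokenize(mt), ref) / len(ref)
```

With this sketch, an MT subtitle whose words are all correct but whose line break sits one word too early still gets a non-zero score, which is the kind of error a purely word-based metric would miss.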

The result is an MT system highly specialized for subtitle translation. At the time of the study, the system already resulted in productivity gains of up to 37% compared to previous methods. Check out this video that explains how the AppTek system works.

Experts Test Drive AppTek’s MT for Subtitles

Subtitler and technical translator Damián Santilli ran parts of the popular Mexican telenovela Te doy la vida (I Give You Life) through both AppTek’s subtitling-specialized MT and Google’s state-of-the-art, general-purpose MT.

The test was meant to assess how “specialized” AppTek’s tool truly is for translating subtitles — in this case for a language pair (i.e., Latin American Spanish into English) in high commercial demand in entertainment media.

Santilli’s assessment (he fully disclosed how he had “been collaborating with AppTek in recent months”) covered more than 800 machine-translated subtitles. AppTek performed better than Google in 57% of them, Google performed better than AppTek in 25.9%, and in the remaining 17.1% both did about the same or “were useless.”

According to Santilli, “AppTek did a better job” based on a test that evaluated four factors: segmentation, translation of names and places, formal / informal treatment, and miscellaneous cases.

“As for my choice of MT service, I think it’s obvious that AppTek did a better job, and I believe that’s mostly because it is an MT specialized in subtitling” — Damián Santilli, Subtitler and Technical Translator

Meanwhile, media localization specialist Stavroula Sokoli published a study in the June 2021 ATA Deep Focus newsletter. It reported the results of a test on the TV romcom Christmas Wedding Planner, carried out on the SubtitleNEXT platform, where AppTek’s MT had been deployed.

Sokoli wrote, “I was able to use almost one third of the proposed [AppTek] MT subtitles, i.e., 393 out of 1,225 subtitles, without modifying them. Many of them were simple greetings, negations or affirmations, but there were pleasant surprises such as the successful treatment of omission” when translating from English into Greek.

Sokoli was also able to use “a bit more than one third” with minor post-editing, such as “changes involving a few key-strokes, like deleting text and punctuation, editing a single word, or adding a space.”

AppTek’s subtitling-specialized ASR and MT models are now integrated into subtitling platforms widely used in the video localization and media entertainment industries, both proprietary ones and commercially available ones such as OOONA, Stellar, and SubtitleNEXT.

As Sokoli pointed out in her article, how well the user interface of a subtitle editor is customized will significantly affect how much translators can benefit from the MT integrated into the tool.

Want to perform your own test drive of AppTek ASR and MT? Contact AppTek for details or sign up for a free trial today.