A Panel Most Unique: Verbit, KUDO, and Deluxe on Video Localization’s Convergence

The SlatorCon Remote audience would have been forgiven for thinking the Video Localization panelists — Tom Livne (CEO and Founder, Verbit), Fardad Zabetian (CEO and Co-founder, KUDO), and Chris Reynolds (EVP and Managing Director, Worldwide Localization and Fulfillment, Deluxe) — share no common ground.

There is no denying that the three companies represented on the panel serve a diverse customer base. Verbit focuses on education, media, and legal clients; KUDO on government, international organizations, and B2B customers; and Deluxe is a titan of the media and entertainment space.

They also provide different services: KUDO facilitates multilingual web conferencing through remote simultaneous interpreting (RSI), Deluxe provides subtitling and dubbing for Hollywood studios and streaming platforms, and Verbit offers transcription and live translated captioning to court reporting agencies, broadcasters, and e-learning providers.

What unifies them is video, specifically video localization, a market that Slator sized at USD 5bn in 2021. Video localization is defined as the market for subtitling, dubbing, RSI, and live translated captioning as used for media and entertainment content, training and education purposes, and facilitating meetings and events.

Exploring the similarities within the market for video localization services, Deluxe’s Reynolds said there is overlap to be found between B2B and media and entertainment customers in “news, sports, and documentaries, where factual and accurate translation is key.” However, immersive entertainment content diverges from corporate content, he added, “where it’s really about the audience being lost in the story and the creative intent of the piece.”

Speech vs. Text, Live vs. Pre-recorded

The panel also traded views on the use of speech-based video localization services — such as dubbing and RSI — and text-based video localization services — such as subtitling and live translated captioning.

Reynolds said, “the preferences of the given country is the number one factor” in a customer’s decision to add dubbing to video content or not, but “there’s never a language that isn’t subtitled.” The type of content also plays a role, he explained. For example, animated content is dubbed into more languages, “simply because younger viewers can’t read.”

Asked how the use of speech versus text services plays out in the live events space, Zabetian said “you can never have enough efforts [for] inclusion” and noted that KUDO has “just rolled out our live captioning in addition to live interpretation.”

Livne also offered his take on the differences between real-time (live) and pre-recorded settings, observing that “each has its own challenges.” He identified the risk of freelance transcribers not showing up as a consideration when managing real-time captioning, along with the need to “maintain very high accuracy,” and the challenge of delivering at scale in multiple languages.

Automation in Video Localization

Turning to automation and technology, the panelists shared their views on current adoption levels and future promise in video localization workflows, with Livne noting that fully automated captioning solutions are “still not at the professional levels because AI today is still not good enough.”

Zabetian’s assessment of speech translation is that it is still in the very early stages: “While there is a lot of focus and large investments from big tech companies in the space, the [right level of] quality and accuracy is still a few years ahead of us.”

He added, “Specifically, when it comes to live interpretation during video meetings, the accuracy of a human interpreter is unbeatable.”

On the topic of voice synthesis, Reynolds said that although it is attracting “a lot of interest,” its adoption for premium content will take time because of how highly supervised the process can be — directors sometimes go as far as handpicking translators and reviewing the translation against their own scripts, he added.

Rounding out the discussion, the panelists fielded audience questions about the nature of the transcription market, the use of AI training data, and trends in pricing.

Watch the full Video Localization panel discussion and the entire SlatorCon Remote September 2021 event on demand.