Machine translation (MT) and media localization startup XL8 announced on July 11, 2023, that it had acquired American High-Tech Transcription and Reporting, Inc. (AHT), a company that specializes in stenography, transcription, and translation for US government-related clients.
According to the press release, XL8 is “diversifying and growing its customer base beyond its media and entertainment roots,” setting its sights on new revenue streams and markets.
The terms of the deal were not disclosed, though XL8 CEO Tim Jung, now also CEO of AHT, told Slator that the husband-and-wife team that founded and built AHT will stay on to help facilitate the transition.
“The AHT name, which is so recognizable throughout the Government network in which they operate, will stay in place as well,” Jung said, adding that XL8’s CRO Josh Pine will take on the additional role of President of AHT.
On the face of it, XL8 and AHT do not have much in common. AHT, in business for over 30 years, works on court hearings, depositions, and law enforcement interrogations — worlds apart from XL8’s clients, often language service providers who serve up content to over-the-top media giants, as well as high-volume content creators. But Jung said this is no contradiction for XL8.
“While [media and entertainment] gave us our start, and we are extremely passionate about M&E, we have always intended to bring our technology to other industries,” he explained. “Government is just our first foray outside of M&E, and at some point down the road, we will apply our technology to other industries as well.”
XL8 offers two core products. MediaCAT streamlines media localization workflows into three steps: sync (speech recognition helps generate subtitles and time codes); translate (tapping into media-specific content training data); and dub.
Demand for XL8’s second product, EventCAT, surged during the Covid-19 pandemic as events worldwide went remote. Interpreters can log into an online system to interpret, with or without the help of AI. Building on remote simultaneous interpreting, XL8 also offers speech translation for real-time events.
Similarities Behind the Scenes
What connects — or, rather, will connect — XL8 and AHT is the technology and strategy behind the services offered.
“When we decided to try a government sector PoC [proof of concept] by running some sample interviews from AHT on MediaCAT’s Sync, the accuracy was ~95%,” Jung said. “This made us confident that we could both extend our technology to the Government sector, and help AHT’s business scale using XL8’s AI toolset.”
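Jung did not say how the ~95% figure was measured. One common way to score speech recognition output is word accuracy, defined as one minus the word error rate (WER), where WER is the word-level edit distance between a human reference transcript and the machine output, divided by the number of reference words. The sketch below assumes that metric; the function names and sample sentences are hypothetical and are not drawn from XL8 or AHT.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def word_accuracy(reference: str, hypothesis: str) -> float:
    """Word accuracy as 1 - WER (can go negative if the output is very long)."""
    return 1.0 - word_error_rate(reference, hypothesis)


# Hypothetical 20-word reference with one misrecognized word ("fifth" -> "fist").
reference = ("the witness stated that she saw the car leave the parking lot "
             "shortly after noon on the fifth of may")
hypothesis = ("the witness stated that she saw the car leave the parking lot "
              "shortly after noon on the fist of may")
print(f"{word_accuracy(reference, hypothesis):.0%}")
```

Under this definition, one wrong word in every 20 corresponds to roughly 95% accuracy, which gives a sense of how much human correction such output would still need.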
More specifically, XL8 plans to work with AHT’s human transcribers to optimize a speech-to-text (STT) engine, which will in turn improve MT performance.
The end goal is a more comprehensive STT engine based on real speech, with input edited by AHT’s professionals, which can contribute to higher-quality real-time subtitles, transcripts, and translations.
“Typically, stenographers in this field can type at 200+ words per minute,” Jung added. “We can now enable these stenographers to very quickly sync the transcripts they are typing at lightning speed to a video or audio source and instantaneously generate files with timecodes and speaker names.”
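To illustrate what a file “with timecodes and speaker names” might look like, here is a minimal sketch that writes already-aligned transcript segments in the widely used SubRip (SRT) subtitle layout. The article does not say which formats XL8 produces, so SRT is an assumption, and the `Segment` class and sample dialogue are hypothetical. The alignment step itself, matching typed text to audio, is not shown.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One aligned stretch of transcript (hypothetical structure)."""
    start: float   # seconds from the start of the recording
    end: float     # seconds from the start of the recording
    speaker: str
    text: str


def format_timecode(seconds: float) -> str:
    """Render seconds as an SRT timecode, HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments: list[Segment]) -> str:
    """Build SRT text: a numbered cue, a time range, then 'SPEAKER: text'."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        time_range = f"{format_timecode(seg.start)} --> {format_timecode(seg.end)}"
        blocks.append(f"{index}\n{time_range}\n{seg.speaker}: {seg.text}\n")
    # A blank line separates consecutive cues.
    return "\n".join(blocks)


segments = [
    Segment(0.0, 2.5, "OFFICER", "Please state your name for the record."),
    Segment(2.5, 4.0, "WITNESS", "Jane Doe."),
]
print(to_srt(segments))
```

Keeping the speaker name inside the cue text is one simple convention; a production workflow might instead store speakers in a separate field or a richer format.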
The human-in-the-loop workflow is standard for XL8, starting with its MT engine, which is fed only human-created media content.
Jung described to Slator how one “crucial partner” uses the company’s MT: Iyuno-SDI professionals correct the initial MT output, and those corrections are fed back into the training pipeline, continually improving XL8’s engines and services specifically for Iyuno-SDI.
The acquisition follows an August 2022 funding round in which XL8, founded in 2019, raised USD 3m. At the time, Jung said the funds would go toward the development of integrated tools and AI products for media localization and live translation services.
“One of the technical challenges in live translation is speech recognition. If the speech recognition is incorrect for various reasons, e.g., domain specificity, accents, speed, or noise, the final translation results could be drastically wrong,” Jung told Slator. “Recent developments to address this problem include a hybrid approach where human stenographers perform real time transcription, or edit speech-to-text recognition in real-time, and then the transcript gets translated into multiple languages, also in real time.”
As of September 2022, plans for XL8’s tech included adding more enterprise features to MediaCAT, such as project management and batch processing, and adding target context awareness to the UI so that users can post-edit and see translations improve for the next sentence in real time.