Unlocking the Secrets of Language AI With Inter-language Vector Space

September 29, 2020

Sponsored Content by XTM International


Just as neural machine translation (NMT) disrupted statistical machine translation, so has Inter-Language Vector Space (ILVS) disrupted current word alignment models. Moreover, the adoption of ILVS comes with considerable commercial implications for translation and localization workflows in terms of time savings, reduced cost, improved accuracy, and brand consistency.

ILVS is a neural network-based technology that indicates the proximity between individual source and target words within a segment being translated. It is now being used to enhance productivity for translators, reviewers, and post-editors.
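The idea of word-to-word proximity within a segment can be illustrated with a minimal sketch. It assumes that source and target words already live in one shared vector space and that proximity is measured with cosine similarity; the article does not say which measure ILVS uses. The three-dimensional vectors, the `SHARED_SPACE` table, and all function names are invented for illustration.

```python
import math

# Toy embeddings in one shared space. The numbers are invented for
# illustration; real embeddings have hundreds of dimensions.
SHARED_SPACE = {
    ("en", "cat"): [0.90, 0.10, 0.00],
    ("en", "dog"): [0.10, 0.90, 0.10],
    ("pl", "kot"): [0.85, 0.15, 0.05],
    ("pl", "pies"): [0.05, 0.95, 0.10],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def proximity_matrix(src_words, tgt_words, src_lang="en", tgt_lang="pl"):
    """Rows are source words, columns are target words."""
    return [
        [cosine(SHARED_SPACE[(src_lang, s)], SHARED_SPACE[(tgt_lang, t)])
         for t in tgt_words]
        for s in src_words
    ]


def best_matches(src_words, tgt_words):
    """Map each source word to its closest target word in the segment."""
    matrix = proximity_matrix(src_words, tgt_words)
    result = {}
    for s, row in zip(src_words, matrix):
        best_index = max(range(len(tgt_words)), key=lambda j: row[j])
        result[s] = tgt_words[best_index]
    return result


matches = best_matches(["cat", "dog"], ["kot", "pies"])
```

Because only the words of one sentence pair are compared, the work grows with the product of the two sentence lengths rather than with the size of a corpus.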

“ILVS allows word-to-word alignment to be done online very quickly. In most cases, it takes well under a second,” Dr. Rafał Jaworski told Slator. Jaworski is XTM International’s Linguistic AI Expert.

ILVS builds on extensive research by Google and Facebook into vector space algorithms. XTM contributed data from bilingual dictionaries and performed the alignment of the vector spaces.

Although vector space algorithms have been widely researched and used by Big Tech for quite some time, Jaworski pointed out that, “to the best of our knowledge, this technology has not been used directly to aid the human translation process. The inter-language aspect of ILVS makes that possible.”

He said XTM adopted this new technology and immediately put it into action by developing and releasing multiple features based on it; and, “once proven to be useful, these novel features will likely inspire others to follow.”

Jaworski added: “A fascinating thing about ILVS is that it is able to detect potential translation candidates even if they have never appeared in any dictionary. This is achieved from the alignment of vector spaces — and this process affects all words in the vector space, not only those that appear in dictionaries. For this reason, when performing the task of building multilingual terminology glossaries, for instance, ILVS can detect even highly specialized narrow-domain terms.”
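Why aligning vector spaces also helps words that are missing from the dictionary can be shown with a deliberately small sketch. The article does not describe XTM's alignment method. The example below assumes a two-dimensional toy space in which the target-language space is a rotated copy of the source-language space, and fits that rotation from two seed dictionary pairs (the rotation-only special case of the orthogonal Procrustes problem, which has a closed form in two dimensions). All vectors, word lists, and function names are hypothetical.

```python
import math

# Hypothetical 2-D monolingual embeddings (invented values).
EN = {"cat": (1.0, 0.0), "dog": (0.8, 0.6), "hamster": (0.6, 0.8)}
PL = {"kot": (0.0, 1.0), "pies": (-0.6, 0.8),
      "chomik": (-0.8, 0.6), "samochod": (0.6, -0.8)}

# Seed dictionary: "hamster" is deliberately NOT listed.
SEED = [("cat", "kot"), ("dog", "pies")]


def fit_rotation(pairs):
    """Angle that best rotates source vectors onto their dictionary partners.

    Maximising sum(y . R(theta) x) gives theta = atan2(sum cross, sum dot).
    """
    dot = sum(x[0] * y[0] + x[1] * y[1] for x, y in pairs)
    cross = sum(x[0] * y[1] - x[1] * y[0] for x, y in pairs)
    return math.atan2(cross, dot)


def rotate(v, theta):
    """Apply a 2-D rotation by theta to vector v."""
    c, s = math.cos(theta), math.sin(theta)
    return (c * v[0] - s * v[1], s * v[0] + c * v[1])


def cosine(u, v):
    dot = u[0] * v[0] + u[1] * v[1]
    return dot / (math.hypot(u[0], u[1]) * math.hypot(v[0], v[1]))


def nearest_target(src_word, theta):
    """Map a source word into the target space and return its nearest neighbour."""
    mapped = rotate(EN[src_word], theta)
    return max(PL, key=lambda w: cosine(mapped, PL[w]))


theta = fit_rotation([(EN[s], PL[t]) for s, t in SEED])
candidate = nearest_target("hamster", theta)
```

The rotation is estimated only from the dictionary pairs, yet it is applied to every vector in the source space, so a word outside the dictionary still lands next to a plausible translation candidate.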

Why It Is Disruptive

According to Jaworski, currently known mechanisms for performing translation alignments (e.g., Giza++, FastAlign) work only in batch mode; that is, they process the whole bilingual corpus in one go, which takes a considerable amount of time.

Rafał Jaworski of XTM on Inter-Language Vector Space

Furthermore, current systems cannot perform word alignment on a single pair of sentences using data from the whole corpus. By contrast, ILVS provides a way to immediately calculate the probability of word alignments. “ILVS returns the probability of alignment as opposed to Giza++ or FastAlign, which only provide the binary information: either the word matches or it does not,” Jaworski said.

He added that this information from ILVS on alignment probabilities “opens up countless opportunities, such as the identification of potential translation errors (i.e., words in translated text with low matching probability to source words), translation suggestions, and much more.”
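One way alignment probabilities could be used to flag potential translation errors is sketched below. The article does not explain how ILVS derives its probabilities; this sketch assumes a softmax over each source word's similarity scores, and the `temperature` and `threshold` values, the similarity numbers, and the function names are all illustrative assumptions.

```python
import math


def alignment_probabilities(similarities, temperature=0.1):
    """Turn one source word's similarity scores into a distribution
    over the target words of the segment (softmax, an assumption)."""
    scaled = [s / temperature for s in similarities]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def flag_suspect_targets(sim_matrix, tgt_words, threshold=0.5):
    """Return target words to which no source word assigns a
    probability of at least `threshold`."""
    prob_rows = [alignment_probabilities(row) for row in sim_matrix]
    suspects = []
    for j, word in enumerate(tgt_words):
        best = max(row[j] for row in prob_rows)
        if best < threshold:
            suspects.append(word)
    return suspects


# Rows: source words "cat", "sleeps". Columns: the target words below.
# The similarity values are invented for illustration.
target_words = ["kot", "spi", "samochod"]
similarities = [
    [0.95, 0.10, 0.05],
    [0.12, 0.93, 0.08],
]
suspects = flag_suspect_targets(similarities, target_words)
```

A binary aligner could only report that a target word is unaligned. With a graded score, a reviewer can sort the flagged words by how weak their best match is.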

Who Can Immediately Benefit

ILVS was first introduced in version 12.4 of XTM Cloud, the company’s flagship translation management system (TMS) with integrated translation productivity (a.k.a. CAT) tool.

The features that ILVS powers include automatic placement of inline elements, bilingual terminology extraction, and further improvements to XTM’s already class-beating corpus auto-aligner.

Users of XTM Cloud 12.4 can immediately benefit from ILVS at no added cost and can expect upcoming versions 12.5 and 12.6 to bring further improvements to the technology as well as broader language coverage.

While word- and phrase-level alignment was available in previous TMS versions, according to Jaworski, “it was only powered by electronic bilingual dictionaries. Now, ILVS provides much higher coverage in terms of the number of supported languages and the number of words within each language.”

The technology draws on large-scale data resources, including a crawl of publicly available Internet text and XTM’s extensive bilingual dictionaries, to calculate the probability that a given target-language word is the correct translation of a source word. It does so for over 250 languages.

Jaworski explained, “When we speak about 250 languages, that makes, combinatorially, 31,125 language pairs (i.e., 250 × 249/2). Before ILVS, we supported about 200 languages, which makes 19,900 language pairs.”
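The pair counts quoted above follow from counting unordered pairs of distinct languages, n × (n − 1) / 2. A short check, with a helper name chosen here for illustration:

```python
import math


def language_pairs(n_languages: int) -> int:
    """Number of unordered pairs of distinct languages: n * (n - 1) / 2."""
    return math.comb(n_languages, 2)


pairs_now = language_pairs(250)     # 31125
pairs_before = language_pairs(200)  # 19900
added = pairs_now - pairs_before    # 11225
```

Going from 200 to 250 languages adds 25% more languages but about 56% more language pairs, because the count grows quadratically.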

The data used to create ILVS came from texts publicly available on the Internet and Big Data bilingual dictionaries licensed by XTM. Jaworski emphasized: “Not a single bit of information came from the private data or material of XTM’s clients. Moreover, even the publicly available texts are not stored within ILVS. The only information that can be retrieved from ILVS is the numerical translation probability.”

Commercial Impact

A critical part of the translation / localization process is terminology management. Building terminology from existing translations is crucial to text quality and consistency — and automating the extraction of bilingual terminology is the next level of advancement.

ILVS can automate up to 90% of bilingual term extraction.

XTM used advances in computational linguistic technology including ILVS to build a reliable bilingual terminology extraction feature. Linguistic AI Expert Jaworski explained: “Automatic glossary creation first detects terminology on the source side of the translation memory. ILVS then helps find the translation of these terms. The human input merely consists of reviewing the output.”

The impact is fourfold: (1) time savings – it takes 85% less time to create glossaries; (2) reduced cost – consistent terminology means less rework and no extra costs; (3) improved quality – up to 90% accuracy based on high-quality translation memory; (4) brand consistency – resulting glossaries can now ensure consistent style across content.

For more information, visit www.xtm.cloud/artificial-intelligence.

XTM International develops translation management system XTM, available via the cloud or on your own servers.
