The rapid pace of research on all matters machine translation continues unabated on Cornell University’s preprint server, arXiv. Studies range from topics of purely academic interest to advances promising practical improvements to real-life applications.
In preparation for the (virtual) conference season in the latter part of the year, the September–October period tends to be a busy one for research output. While authors hail from a variety of institutions and organizations, research out of US and China Big Tech has risen steadily.
For Silicon Valley’s tech giants, the papers are simply the latest in a steadily accelerating stream of research related to machine translation (MT). Apple ramped up its MT research in preparation for the September 2020 release of its Translate app. In July 2020, Facebook open-sourced CoVoST V2, a “massively multilingual” speech-to-text (STT) translation dataset. And Google’s Łukasz Kaiser contributed to a September 2020 paper that claimed MT now outperforms human translation.
Meanwhile, Chinese companies, such as Alibaba, WeChat owner Tencent, and TikTok parent ByteDance, have released their own studies in line with the Chinese government’s three-year action plan, issued in 2018, to advance the country’s AI technology, including speech recognition and MT.
This flurry of machine translation research takes place against the backdrop of a US-China trade war. China updated its restricted-export list in late August 2020 to include more language technologies, and companies must now seek approval from Beijing before exporting such products.
Western critics were already wary of Chinese MT offerings due to security issues, such as those raised in an Australian think tank’s October 2019 report claiming that China uses state-owned companies, which provide MT services, to collect data on users outside China. (When Microsoft announced in September 2020 that ByteDance had declined to sell TikTok’s US operations to Microsoft, it noted that, had the sale gone through, Microsoft would have made “significant changes” to the service to protect user privacy.)
Politics aside, the articles reflect some trends in MT research, such as the growing interest in STT translation solutions, as evidenced by Facebook’s fairseq S2T and ByteDance’s TED and SDST. (Slator covered the Chinese government’s investment in speech recognition back in 2019.) Two of Google’s articles explore the potential of MT for low-resource languages. The concept of inference also features in two papers, one by Google, the other by Apple.
Uncertainty-Aware Semantic Augmentation for Neural Machine Translation – As every translator knows, there are multiple valid translations for any given source text. In NMT, this concept is called “intrinsic uncertainty.” Researchers built a network that does not penalize the use of accurate synonyms and found MT performance improved consistently across language pairs.
Self-Paced Learning for Neural Machine Translation – Add this paper to the canon of research implying that MT can beat humans at their own game. An NMT engine was improved via “self-paced learning,” which mimics the human language learning process.
Efficient Inference for Neural Machine Translation – Diving deep into the inner workings of NMT, this study explored the ideal combination of techniques to optimize inference speed in large Transformer models without sacrificing translation quality.
Generative Imagination Elevates Machine Translation – With a title that, at first glance, evokes a certain Noam Chomsky sentence, this paper details the use of an “imagination-based MT model.” ImagiT synthesizes visual representations based on source text rather than relying on annotated images as input, which reportedly improved translation quality.
TED: Triple Supervision Decouples End-to-End Speech-to-Text Translation – Traditional cascaded speech translation systems are slow and can introduce content errors. The TED framework, designed to imitate how humans process audio information, aims to avoid these issues by using separately trained subsystems for automatic speech recognition and for MT.
SDST: Successive Decoding for Speech-to-Text Translation – In response to the open question of whether end-to-end or cascaded models are stronger, the authors suggested that their framework offers the best of both worlds and stated their plans to make their model and code publicly available.
fairseq S2T: Fast Speech-to-Text Modeling with fairseq – Facebook’s fairseq S2T extension provides end-to-end speech recognition and STT translation. Documentation and examples are available on GitHub.
KoBE: Knowledge-Based Machine Translation Evaluation – Seeking a method by which to evaluate MT without reference translations, researchers created and released a large-scale knowledge base across 18 language pairs. The authors described their process as language-pair agnostic, noted that synonyms in MT output should not be penalized, and expressed interest in scaling the method to much larger or domain-specific datasets.
Harnessing Multilinguality in Unsupervised Machine Translation for Rare Languages – Low-resource languages often lack the large amounts of relevant parallel data and high-quality monolingual data necessary for state-of-the-art MT results. A three-stage training plan that incorporates synthetic data and higher-resource languages as pivot languages outperformed all unsupervised baselines and surpassed a variety of WMT submissions.
Inference Strategies for Machine Translation With Conditional Masking – What is the best inference strategy for a trained conditional masked language model? Researchers found that disallowing the re-masking of previously unmasked tokens resulted in “favorable quality-to-speed trade-offs.”
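To make the "no re-masking" idea concrete, here is a minimal toy sketch in Python. It is not the paper's implementation: the `predict` callback, the per-step schedule, and the `toy_predict` stand-in are all hypothetical, chosen only to show that a token, once unmasked, is never masked again.

```python
import math

MASK = None  # placeholder for a position that has not been decoded yet


def decode_without_remasking(predict, length, num_steps):
    """Iteratively fill a fully masked sequence, never re-masking a token.

    `predict` is a hypothetical callback: given the current token list, it
    returns one (token, confidence) pair per position.
    """
    tokens = [MASK] * length
    for step in range(num_steps):
        masked = [i for i, t in enumerate(tokens) if t is MASK]
        if not masked:
            break
        # Spread the remaining masked positions evenly over the remaining steps.
        k = math.ceil(len(masked) / (num_steps - step))
        proposals = predict(tokens)
        # Commit the k most confident predictions among still-masked positions.
        best = sorted(masked, key=lambda i: proposals[i][1], reverse=True)[:k]
        for i in best:
            tokens[i] = proposals[i][0]  # committed for good
    return tokens


def toy_predict(tokens):
    # Stand-in for a trained model: fixed tokens with made-up confidences.
    target = ["wir", "sehen", "uns", "morgen"]
    confidence = [0.9, 0.4, 0.7, 0.6]
    return list(zip(target, confidence))


result = decode_without_remasking(toy_predict, length=4, num_steps=2)
```

With two steps, the sketch commits the two most confident positions first and the remaining two on the second pass, so the model is queried twice rather than once per token.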
Learning to Evaluate Translation Beyond English: BLEURT Submissions to the WMT Metrics 2020 Shared Task – As part of a contribution to the WMT 2020 Metrics Shared Task, the main benchmark for automatic evaluation of MT, the previously published metric BLEURT was extended beyond English to evaluate 14 language pairs with fine-tuning data available, plus four zero-shot languages.
Token-Level Adaptive Training for Neural Machine Translation – NMT encounters a range of learning difficulties with different tokens based on how frequently different words appear in natural language. Assigning larger weights to meaningful low-frequency words during training yielded consistent improvements in translation quality for Chinese to English, English to Romanian, and English to German.
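The general idea of weighting rarer tokens more heavily in the training loss can be sketched as follows. This is an illustration under stated assumptions, not the paper's method: the log-based weighting function, the `alpha` parameter, and the function names are hypothetical, and the sketch weights every rare token rather than only the "meaningful" ones.

```python
import math
from collections import Counter


def token_weights(corpus_tokens, alpha=1.0):
    """Map each token to a weight in [1, 1 + alpha]; rarer tokens weigh more.

    The log-scaled formula is an assumption made for this sketch.
    """
    counts = Counter(corpus_tokens)
    max_count = max(counts.values())
    return {
        tok: 1.0 + alpha * (1.0 - math.log(c + 1) / math.log(max_count + 1))
        for tok, c in counts.items()
    }


def weighted_nll(target_tokens, token_probs, weights):
    """Average negative log-likelihood, scaled per token by its weight."""
    total = 0.0
    for tok, p in zip(target_tokens, token_probs):
        # Tokens unseen in the corpus fall back to a neutral weight of 1.0.
        total += -weights.get(tok, 1.0) * math.log(p)
    return total / len(target_tokens)


corpus = ["the"] * 8 + ["cat"] * 2 + ["zygote"]
weights = token_weights(corpus)
loss_rare = weighted_nll(["zygote"], [0.5], weights)
loss_common = weighted_nll(["the"], [0.5], weights)
```

At the same predicted probability, a mistake on the rare token "zygote" costs more than one on "the", which nudges training toward low-frequency words.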