Yasmin Moslem on Using Large Language Models to Custom Train Machine Translation

SlatorPod #130 - Machine Translation (MT) Researcher, Yasmin Moslem

Machine translation (MT) researcher Yasmin Moslem joins SlatorPod to talk about her research on Domain-Specific Text Generation for Machine Translation, a project she conducted with Rejwanul Haque, John D. Kelleher, and Andy Way at the ADAPT Centre in Dublin.

Yasmin shares her experience working as a translator, discovering translation productivity (CAT) tools, and experimenting with translation memory to improve MT. She breaks down the paper’s approach to domain-specific MT training using back-translation for data augmentation.

She discusses how some LSPs are already implementing this approach in real life, customizing it for different use cases. She explains why they used a combination of BLEU, COMET, and other quality evaluation frameworks, as well as human evaluation, to rate machine translation quality.

Yasmin concludes the podcast with her advice for those in the core industry looking to enter the machine translation space, from the spiral learning process to reading research papers.

First up, Florian and Esther discuss the language industry news of the week, including how a streaming platform used proprietary machine dubbing technology for its film offerings in the first quarter of 2022.

Over in London, TransPerfect acquired a virtual data room (VDR) tech company to proactively address the VDR market. In transcription news, VIQ Solutions’ shares dipped by 20% despite the company reporting strong half-year revenue growth of 45% year on year. Meanwhile, multilingual captioning provider Ai-Media turned EBITDA-profitable as a 2021 acquisition drove revenue growth.