Tencent AI Introduces an LLM-Based Virtual LSP for Literary Translation

Multi-Agent Collaboration for Translating Ultra-Long Literary Texts

In a May 20, 2024 work-in-progress paper, researchers from Tencent AI Lab, Monash University, and the University of Macau introduced a novel approach to one of the “most challenging” tasks for machine translation (MT): translating literary texts.

Literary texts are rich in complex language, figurative expressions, cultural nuances, and unique stylistic elements that are difficult for machines to grasp. “This complexity makes literary translation one of the most challenging areas within machine translation,” they said.

To address this challenge, the researchers leveraged the “superior capabilities” of multi-agent systems and established a “novel multi-agent virtual company” based on large language models (LLMs) for translating literary texts, called TRANSAGENTS.

TRANSAGENTS simulates a translation agency with various agents in distinct roles, mirroring those found in a human translation team. The team includes roles such as senior editors, junior editors, translators, localization specialists, and proofreaders. All agents are instances of GPT-4. Each is given a detailed profile, and they work together in a multi-stage process to create, review, and improve translations.
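
The create, review, and improve cycle can be sketched roughly as follows. This is a minimal illustration and not the researchers' implementation: the function name `translate_with_review`, the prompt wording, the "OK" stopping convention, and the round limit are all assumptions, and the agents are stand-in functions rather than real GPT-4 calls.

```python
from typing import Callable

# Hypothetical abstraction: an "agent" is any function that maps a prompt to text.
# In a real system this would wrap an LLM call; here it is left pluggable.
Agent = Callable[[str], str]


def translate_with_review(
    source: str,
    translator: Agent,
    editor: Agent,
    proofreader: Agent,
    max_rounds: int = 2,
) -> str:
    """Create a draft, let an editor review it, revise, then proofread."""
    draft = translator(f"Translate the following text:\n{source}")
    for _ in range(max_rounds):
        feedback = editor(
            f"Review this draft against the source.\nSource:\n{source}\nDraft:\n{draft}"
        )
        if feedback.strip().upper() == "OK":  # assumed convention for "no changes needed"
            break
        draft = translator(
            f"Revise the draft using this feedback.\nFeedback:\n{feedback}\nDraft:\n{draft}"
        )
    return proofreader(f"Proofread the final draft:\n{draft}")


# Stub agents so the sketch runs without any API access.
def stub_translator(prompt: str) -> str:
    return "revised draft" if prompt.startswith("Revise") else "first draft"


def stub_editor(prompt: str) -> str:
    return "OK" if "revised draft" in prompt else "Tone is too flat."


def stub_proofreader(prompt: str) -> str:
    # Return the text after the instruction line unchanged.
    return prompt.split("\n", 1)[1].strip()


result = translate_with_review("原文", stub_translator, stub_editor, stub_proofreader)
```

With the stubs above, the editor rejects the first draft once and accepts the revision, so the loop ends after the second review.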

To make the translation process simulation more realistic and effective, the researchers used GPT-4 to create 30 detailed virtual agent profiles for each role. These profiles encompass various attributes beyond language skills, such as gender, nationality, rate per word, education, experience, and specialization.

“This detailed and personalized approach not only enriches the authenticity of the translation process simulation but also mirrors the complexity and diversity found in real-world translation settings,” they said.
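
A profile of this kind could be represented as a small record that is rendered into a role-play prompt. The sketch below is an assumption-laden illustration: the attribute list follows the one given above, but the field names, the `to_system_prompt` wording, the `pick_agent` helper, and the two sample profiles are invented for the example and do not come from the paper.

```python
import random
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class AgentProfile:
    """One virtual employee. Field names are illustrative, not the paper's schema."""
    role: str
    languages: Tuple[str, ...]
    gender: str
    nationality: str
    rate_per_word: float  # currency unit is an assumption (USD here)
    education: str
    years_experience: int
    specialization: str

    def to_system_prompt(self) -> str:
        """Render the profile as a role-play instruction for an LLM."""
        return (
            f"You are a {self.role} working in {', '.join(self.languages)}. "
            f"Background: {self.education}, {self.years_experience} years of experience, "
            f"specializing in {self.specialization}."
        )


def pick_agent(pool: List[AgentProfile], role: str, rng: random.Random) -> AgentProfile:
    """Choose one profile for a role from the pool of candidates."""
    candidates = [p for p in pool if p.role == role]
    if not candidates:
        raise ValueError(f"no profile available for role {role!r}")
    return rng.choice(candidates)


# Placeholder profiles invented for this example only.
pool = [
    AgentProfile("translator", ("Chinese", "English"), "female", "Australian",
                 0.10, "MA in Translation Studies", 8, "fantasy fiction"),
    AgentProfile("proofreader", ("English",), "male", "Canadian",
                 0.04, "BA in English Literature", 5, "web novels"),
]
chosen = pick_agent(pool, "translator", random.Random(0))
prompt = chosen.to_system_prompt()
```

Keeping the profile separate from the prompt text makes it easy to swap in a different candidate for the same role without touching the rest of the workflow.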

The researchers highlighted that multi-agent systems harness the collective intelligence of multiple agents, “enabling superior problem-solving capabilities compared to individual model approaches.” These systems excel in dynamic environments that demand problem-solving and collaborative teamwork.

To evaluate the effectiveness of TRANSAGENTS, the researchers couldn’t rely only on conventional translation evaluation methods that typically compare translations to a standard reference, as these don’t capture the complexity and subjectivity of literary texts. They noted that evaluating the accuracy and quality of literary translations presents a “particularly challenging task.”

To effectively address this challenge, they proposed two innovative evaluation strategies: Monolingual Human Preference (MHP) and Bilingual LLM Preference (BLP).

Both strategies involve comparing a pair of translations from two different translation systems to determine which one is superior. In MHP, evaluators who speak the target language assess the translations for fluidity, readability, and cultural appropriateness without reference to the original text. In BLP, advanced LLMs, such as GPT-4, compare the translations directly with the original text to evaluate their accuracy.
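
Either strategy yields a list of pairwise judgments that can be summarized as preference rates. The following is a minimal sketch of that tally, not the paper's evaluation code: the labels "A", "B", and "tie" (including the existence of a tie option) and the function name `preference_rates` are assumptions made for illustration.

```python
from collections import Counter
from typing import Dict, Iterable

LABELS = ("A", "B", "tie")  # assumed label set; "tie" means no preference


def preference_rates(judgments: Iterable[str]) -> Dict[str, float]:
    """Return the share of comparisons won by system A, won by system B, or tied."""
    counts = Counter(judgments)
    unknown = set(counts) - set(LABELS)
    if unknown:
        raise ValueError(f"unexpected labels: {sorted(unknown)}")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no judgments to summarize")
    return {label: counts[label] / total for label in LABELS}


# Four made-up judgments, one per compared segment.
rates = preference_rates(["A", "A", "B", "tie"])
# rates == {"A": 0.5, "B": 0.25, "tie": 0.25}
```

The same tally works whether the judgments come from human readers or from an LLM judge; only the source of the labels differs.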

Significant Challenges for Translators

The researchers evaluated 24 chapters of web novels. Interestingly, both human evaluators and the LLM evaluator often preferred the translations produced by TRANSAGENTS over those created by human translators, despite TRANSAGENTS receiving lower scores on traditional metrics like d-BLEU.

An in-depth analysis showed that TRANSAGENTS excels in genres requiring domain-specific knowledge, such as historical contexts and cultural nuances. “These areas often pose significant challenges for translators,” they said. On the other hand, TRANSAGENTS tends to underperform in contemporary domains, which may not require as much specialized knowledge. Additionally, they observed that TRANSAGENTS is capable of generating translations with more diverse and vivid descriptions.

Furthermore, the researchers found that using TRANSAGENTS for literary text translation could result in an 80× reduction in costs compared to employing professional human translators.

Authors: Minghao Wu, Yulin Yuan, Gholamreza Haffari, Longyue Wang