How Large Language Models Mimic the Human Translation Process

Large Language Models Mimic Translators

In a research paper published on May 6, 2023, a team of researchers from Shanghai Jiao Tong University, Tsinghua University, and Tencent AI Lab demonstrated that large language models (LLMs) can emulate human translation strategies.

As Zhiwei He, Tian Liang, Wenxiang Jiao, Zhuosheng Zhang, Yujiu Yang, Rui Wang, Zhaopeng Tu, Shuming Shi, and Xing Wang explained in the paper, “professional human translators tend to take preparatory steps when working with a given source text, including gathering or analyzing information such as keywords, topics, and example sentences.”

Traditional machine translation (MT), which focuses on direct source-to-target mapping, disregards these preparatory steps, but LLM-based translation can mimic the human translation process, according to the researchers.

To that end, they proposed a method called MAPS — which stands for Multi-Aspect Prompting and Selection — to incorporate these preparatory steps into the LLM translation process. 

MAPS involves prompting LLMs to analyze the source sentence, extract translation-related knowledge, and then integrate that knowledge into the prompt’s context, guiding the LLM toward more accurate translations.

Knowledge Mining, Integration, and Selection

More specifically, MAPS comprises three main steps: knowledge mining, knowledge integration, and knowledge selection. In the knowledge mining step, the LLM analyzes the source text and generates three types of translation-related knowledge: 

  • keywords, which are crucial for conveying the core meaning and ensuring faithfulness and consistency throughout the text;
  • topics, which help translators avoid mistranslation due to ambiguity and adapt to specific subject matters; and
  • demonstrations, which provide examples that aid in finding suitable equivalents in the target language, producing translations that are natural, fluent, and engaging.
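The mining step can be sketched as three prompts to the model. The wording below is illustrative rather than the paper’s exact templates, and `toy_llm` is a stand-in for a real LLM call:

```python
def mine_knowledge(llm, source: str, src_lang: str = "German",
                   tgt_lang: str = "English") -> dict:
    """Ask the LLM for keywords, topics, and a demonstration pair."""
    prompts = {
        "keywords": f"Extract the keywords of the following {src_lang} "
                    f"sentence:\n{source}",
        "topics": f"What topics does the following sentence involve?\n{source}",
        "demonstrations": f"Write a {src_lang}-{tgt_lang} sentence pair "
                          f"related in topic to:\n{source}",
    }
    return {kind: llm(prompt) for kind, prompt in prompts.items()}

def toy_llm(prompt: str) -> str:
    # A real system would call an LLM API here.
    return "(model output)"

knowledge = mine_knowledge(toy_llm, "Der Wagen wurde gestern repariert.")
print(sorted(knowledge))  # ['demonstrations', 'keywords', 'topics']
```

In the paper’s setting, each of the three knowledge types is mined by a separate prompt, so the step costs three extra LLM calls per source sentence.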


The acquired knowledge “serves as background context” and is integrated into the LLM’s prompt during the knowledge integration step, guiding the LLM to generate more accurate translation candidates. By incorporating the extracted knowledge, the LLM gains a better understanding of the source text and can produce translations that align with the intended meaning.
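As a rough sketch of the integration step (the function name and formatting are hypothetical, not taken from the paper), the mined knowledge is simply prepended to the translation request as background context:

```python
def build_translation_prompt(source: str, knowledge: dict,
                             src_lang: str = "German",
                             tgt_lang: str = "English") -> str:
    """Prepend mined knowledge as background context to the translation ask."""
    context = "\n".join(f"{kind.capitalize()}: {text}"
                        for kind, text in knowledge.items())
    return (f"{context}\n\n"
            f"Translate the following {src_lang} sentence into {tgt_lang}:\n"
            f"{source}")

knowledge = {"keywords": "Wagen, repariert",
             "topics": "car maintenance"}
prompt = build_translation_prompt("Der Wagen wurde gestern repariert.",
                                  knowledge)
print(prompt.splitlines()[0])  # Keywords: Wagen, repariert
```

Because each knowledge type can also be integrated on its own, the model can produce several candidate translations, one per knowledge combination, which sets up the selection step described next.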

However, “not all of LLM-generated knowledge is useful for translation,” said the researchers. They explained that LLMs “may generate trivial or noisy content that might distract the translation process.” 

To further enhance translation quality, the knowledge selection step employs a filtering mechanism that eliminates noisy or unhelpful knowledge generated by the LLM.

More specifically, the researchers employed reference-free quality estimation (QE) to rank translation candidates and selected the one with the highest QE score as the final output. They also explored using the LLM itself as a QE scorer, demonstrating the potential of a pure LLM implementation.
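The selection step then reduces to scoring each candidate and keeping the argmax. The scores below are made-up placeholders; a real system would use a reference-free QE model, or prompt the LLM itself to rate each candidate:

```python
def select_translation(candidates: list[str], qe_score) -> str:
    """Return the candidate with the highest reference-free QE score."""
    return max(candidates, key=qe_score)

# Placeholder scores standing in for a reference-free QE model's output.
scores = {
    "The car was repaired yesterday.": 0.91,
    "The wagon was fixed yesterday.": 0.74,
    "Yesterday car repair.": 0.22,
}
best = select_translation(list(scores), scores.get)
print(best)  # The car was repaired yesterday.
```

Since ranking happens after generation, noisy or trivial mined knowledge that produced a poor candidate is filtered out here rather than at mining time.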

Fewer Hallucinations, No Domain-Specific Preparation

Comprehensive experiments across eight translation directions — English-Chinese, Chinese-English, English-German, German-English, English-Japanese, Japanese-English, German-French, French-German — were conducted to validate the effectiveness of the MAPS approach. 

The results consistently demonstrated significant improvements over other baselines, indicating that incorporating preparatory steps and leveraging self-generated knowledge led to higher-quality translations.

MAPS also effectively mitigated hallucination issues in translation, where the LLM generates inaccurate or fictional content. As the researchers explained, the extracted knowledge was “critical in resolving up to 59% of hallucination mistakes in translation.”

Another notable advantage of MAPS is its focus on translating general scenarios without relying on domain-specific assumptions. Unlike other LLM-based translation approaches that require domain-specific preparation, MAPS eliminates the need for extensive glossaries, dictionaries, or sample pools. This feature enhances the practicality and versatility of MAPS for various translation tasks and language pairs.

The code is available on GitHub.