In-context machine translation, a novel approach that uses large language models (LLMs), has gained attention for its ability to perform machine translation (MT) by learning from examples of translation pairs, known as prompts, in the prefix. This approach shows promise for quickly adapting translation models on the fly without additional training.
An essential aspect of in-context learning is prompt selection, with previous work suggesting the importance of selecting examples that are similar to the source sentence being translated.
In a research paper published on May 5, 2023, Suzanna Sia and Kevin Duh from Johns Hopkins University propose a shift in perspective, focusing on coherence (the semantic and syntactic consistency within a body of text) as a critical factor for translation success in LLMs.
They focus on two types of coherence: domain coherence and document coherence (also known as local coherence), which “are previously unexplored for in-context machine translation.” They argue that maintaining coherence between prompts and test sentences, both at the domain level and within the local document context, significantly enhances the quality of translations.
Domain Adaptation On-the-fly
To investigate the role of coherence, the authors conducted experiments using three open-access LLMs (GPTNeo2.7B, Bloom3B, XGLM2.9B) across four domains (medical, social media, Wikipedia, and TED talks). The results of their experiments indicate that when prompts are drawn from the same domain and there is a coherent context, translation performance improves.
More specifically, they observed that when prompts are consistent with the domain, models can adapt more effectively on-the-fly, resulting in improved translation performance. In addition, inducing a similar writing style or including relevant lexical translation examples leads to the generation of coherent translations by the models.
LLMs are “able to do style transfer just from instructions or from being shown surface prompt examples,” said the authors. “Simply providing demonstrations from the same domain may induce the LLM to generate a similar style which is coherent with the target text,” they added.
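As a rough illustration of domain-coherent prompting, the sketch below draws translation examples only from the same domain as the test sentence and places them in the prefix. It is not the authors' code: the function names, the dictionary fields, the prompt layout, and the English/French labels are all hypothetical choices made for this example.

```python
import random


def sample_domain_prompts(pool, domain, k, seed=0):
    """Randomly pick k translation pairs that share the test sentence's domain.

    `pool` is a list of dicts with hypothetical keys "domain", "src", "tgt".
    """
    candidates = [ex for ex in pool if ex["domain"] == domain]
    if len(candidates) < k:
        raise ValueError(f"only {len(candidates)} examples in domain {domain!r}")
    rng = random.Random(seed)  # seeded so the selection is reproducible
    return rng.sample(candidates, k)


def format_prompt(examples, source, src_lang="English", tgt_lang="French"):
    """Lay out the examples as a prefix, ending with the untranslated source."""
    blocks = [f"{src_lang}: {ex['src']}\n{tgt_lang}: {ex['tgt']}" for ex in examples]
    blocks.append(f"{src_lang}: {source}\n{tgt_lang}:")
    return "\n".join(blocks)
```

The resulting string would be passed to an LLM, which is expected to continue the text after the final target-language label.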
The authors also explored the impact of prompt length on translation quality across different domains. They compared translation examples consisting of 5-10 words and 15-20 words, finding that longer examples provide more evidence to the model, potentially affecting downstream performance. However, the difference in translation quality attributable to prompt length was marginal, and the effect requires further investigation.
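A length comparison of this kind amounts to grouping candidate examples into word-count buckets before sampling. The sketch below assumes that length is measured as whitespace-separated words on the source side; the article does not say how the authors counted words, and the function names are hypothetical.

```python
def word_count(sentence):
    """Count whitespace-separated tokens (an assumed definition of 'words')."""
    return len(sentence.split())


def filter_by_length(pairs, min_words, max_words):
    """Keep (source, target) pairs whose source length is within the bucket."""
    return [
        (src, tgt)
        for src, tgt in pairs
        if min_words <= word_count(src) <= max_words
    ]
```

Calling `filter_by_length(pairs, 5, 10)` and `filter_by_length(pairs, 15, 20)` on the same pool would give the two sets of examples being compared.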
In addition to domain coherence, the authors highlight the significance of local coherence as another crucial factor for translation quality in LLMs. Local coherence is achieved through the use of a moving window of previously translated sentences as part of the prompt examples for a specific test sentence.
More specifically, the moving window approach considers a subset of previously translated sentences that directly precede the test source sentence. This subset serves as a coherent context for the model, providing it with relevant information and linguistic cues to generate accurate translations.
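The moving window can be sketched in a few lines. This is an illustration rather than the authors' implementation: the function names, the window size, and the prompt layout are assumptions, and the article does not specify whether the window holds reference translations or the model's own earlier outputs, so the sketch simply takes whatever pairs it is given.

```python
def moving_window(translated, index, window_size):
    """Return up to `window_size` pairs that directly precede sentence `index`.

    `translated` is the document so far, as a list of (source, target) pairs
    in document order.
    """
    start = max(0, index - window_size)
    return translated[start:index]


def build_prompt(window, source, src_lang="English", tgt_lang="French"):
    """Use the window as in-context examples, then append the test sentence."""
    blocks = [f"{src_lang}: {src}\n{tgt_lang}: {tgt}" for src, tgt in window]
    blocks.append(f"{src_lang}: {source}\n{tgt_lang}:")
    return "\n".join(blocks)
```

For the first sentence of a document the window is empty, so the prompt contains only the test sentence; the window then fills as translation proceeds and slides forward once it reaches `window_size`.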
The authors compared different prompt selection methods and found that the moving window approach outperforms the others. They also observed that randomly sampling sentences from within the document performs well compared to similarity-based retrieval methods that draw from outside the document. “This further highlights that coherence is a critical factor for in-context machine translation,” they said.