Large language models (LLMs) have showcased impressive capabilities in multilingual neural machine translation (MNMT) even in the absence of parallel data. However, they often struggle when translating rare words, especially in low-resource scenarios.
In a research paper published on May 24, 2023, a team of researchers from the Chinese University of Hong Kong and Microsoft presented a novel approach to address this issue. They demonstrated that incorporating chained multilingual dictionaries as prior hints can effectively guide LLMs' decision-making process when translating input texts.
The research team, including Hongyuan Lu, Haoran Yang, Wai Lam, Furu Wei, Haoyang Huang, and Dongdong Zhang, introduced a framework called Chain-of-Dictionary Prompting for Machine Translation (CoD), which utilizes chains of multilingual dictionaries to prompt LLMs for machine translation (MT) tasks.
The CoD framework leverages the power of multilingual dictionaries to augment LLMs. By integrating chained multilingual dictionary information directly into the translation prompt, CoD provides valuable prior knowledge that guides the LLMs' decision-making process and helps overcome challenges related to rare words and low-resource languages.
CoD consists of two parts: the standard translation prompt and the chained multilingual dictionaries. When presented with a source sentence, CoD searches for the relevant multilingual dictionary entries for a subset of its words. Before making the conventional translation request to the LLM, CoD adds these entries as additional textual input to the prompt, as explained by the researchers.
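To make this setup concrete, here is a minimal sketch of how such a prompt could be assembled, assuming a tiny hand-built table of chained dictionary entries; the template wording, the helper names, the auxiliary languages, and the example words are illustrative rather than the paper's exact format.

```python
# A minimal sketch of CoD-style prompt construction. The chained dictionary
# below is hypothetical: each source word is linked to the word with the same
# meaning in a few auxiliary languages and in the target language.
CHAINED_DICT = {
    "village": {"French": "village", "German": "Dorf", "Tamil": "கிராமம்"},
    "harvest": {"French": "récolte", "German": "Ernte", "Tamil": "அறுவடை"},
}

def build_cod_prompt(sentence: str, src_lang: str = "English", tgt_lang: str = "Tamil") -> str:
    """Prepend chained dictionary hints to a standard translation prompt."""
    # 1) Look up dictionary chains for the subset of words we have entries for.
    hints = []
    for word in sentence.lower().replace(".", "").split():
        if word in CHAINED_DICT:
            chain = " means ".join(f'"{t}"' for t in CHAINED_DICT[word].values())
            hints.append(f'"{word}" means {chain}.')

    # 2) Add the chained hints as extra textual input before the usual request.
    request = f"Translate the following text from {src_lang} into {tgt_lang}:\n{sentence}"
    return "\n".join(hints + [request]) if hints else request

print(build_cod_prompt("The village celebrated the harvest."))
```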
Background Knowledge
In this method, lexicon hints for MT are incorporated by prepending a specific string to the standard MT prompt. The motivation behind this approach stems from the successful use of bilingual dictionaries in supervised machine translation models. It also draws on the principles of Chain-of-Thought (CoT) prompting, which represents a reasoning process through intermediate thinking steps.
“In our case, we show how to incorporate multilingual knowledge in a zero-shot manner without requiring any model training by chaining the multilingual dictionary that represents words with the same meaning in different languages to improve LLM’s MNMT capabilities,” said the researchers.
The authors further emphasized that this allows the task to be specified in the prompt and provides background knowledge that is instrumental in accomplishing the MT task, without imposing strict constraints on how the model utilizes this knowledge.
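Because the dictionary chains travel entirely inside the prompt, no fine-tuning is needed: the assembled text can simply be sent to a chat-style LLM in a single zero-shot request. The sketch below uses the OpenAI Python client to illustrate this; the model name and the build_cod_prompt helper (from the earlier sketch) are assumptions for illustration, not the researchers' exact setup.

```python
# Zero-shot CoD query: send the dictionary-augmented prompt to a chat model.
# Assumes the OpenAI Python client (pip install openai) and an API key in the
# OPENAI_API_KEY environment variable; the model name is purely illustrative.
from openai import OpenAI

client = OpenAI()

prompt = build_cod_prompt(          # hypothetical helper from the sketch above
    "The village celebrated the harvest.",
    src_lang="English",
    tgt_lang="Tamil",
)

response = client.chat.completions.create(
    model="gpt-4o-mini",            # any chat-capable model; CoD needs no training
    messages=[{"role": "user", "content": prompt}],
    temperature=0,                  # deterministic decoding for translation
)

print(response.choices[0].message.content)
```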
Unlocking the Reasoning Abilities of LLMs
To evaluate CoD's effectiveness, the researchers conducted extensive experiments using various language models and the FLORES-200 benchmark. The results showed notable improvements in low-resource translation across multiple language pairs: CoD outperformed ChatGPT on a substantial number of languages, enabling translations that were previously challenging or even impossible for LLMs.
The analysis of CoD's performance highlighted the necessity of chaining multilingual dictionaries when prompting LLMs. By linking dictionaries representing words with the same meaning in different languages, CoD unlocks the reasoning abilities of LLMs without requiring additional model training. This approach surpasses the limitations of in-context learning and few-shot demonstrations, providing a more practical and accessible solution, particularly for low-resource languages.
“Compared to in-context learning that uses few-shot demonstrations to better prompt the LLMs, dictionaries are comparatively easier to store and acquire than the demonstrations, particularly for low-resource languages,” explained the researchers.