Yes, you can use ChatGPT for translation. Since its release by OpenAI in November 2022, ChatGPT has taken the world by storm, including the language translation and localization industry, a USD 27bn market.
“No other technology or tool has penetrated to the level” that ChatGPT has, Ramsri Goutham Golla, founder of Supertranslate.ai, told SlatorPod. It seems everyone, Slator included, has been testing it out for the translation use case. Whether ChatGPT is an “absolute game changer” (Frederik Pedersen, CEO of EasyTranslate) remains to be determined.
At the time of writing, three major actors have published results of studies on the quality of translations generated by ChatGPT.
Chinese tech company Tencent’s January 2023 paper covered 12 translation directions and compared ChatGPT’s output with that of commercial machine translation (MT) engines, such as Google Translate, DeepL, and Tencent’s own system. The study randomly sampled only 50 sentences from each set for evaluation.
Intento’s article, released in January 2023, compared English-Spanish and English-German translations done by ChatGPT. A sample of 500 segments with reference translations was set against Amazon, Apptek, Baidu, DeepL, Google, IBM, Microsoft, ModernMT, Niutrans, Systran, and Ubiqus.
In a paper published in February 2023, Microsoft compared GPT language models to both research and commercial MT engines in 18 high- and low-resource language pairs. They used publicly available datasets and tested at both sentence- and document-level. Multiple metrics as well as human evaluation were employed.
ChatGPT Performs “Competitively”
Microsoft concluded that GPT models produce “very competitive translation quality for high resource languages”, which agrees with Tencent’s conclusion that ChatGPT performed “competitively” for high-resource European languages.
Similarly, Intento revealed that “GPT-3 translations are among the best commercial MT engines for English to Spanish, but they fall into the second tier for English to German.” Human LQA found issues including omissions, Do Not Translate errors (DNTs), terminology mistakes, and some questionable grammar.
Low-Resource = Low Quality
Across the board, ChatGPT’s translation quality for low-resource languages was found to be inferior. Microsoft described ChatGPT as having “limited capabilities for low resource languages”, and the BLEU score Tencent reported for English-Romanian was 46.4% lower than the one for English-German.
Tencent added nuance, noting that translation between language families is more difficult than translation within a language family. The performance gap was even greater for low-resource language pairs from different language families, such as Romanian-Chinese.
Commenting on the results of its domain-specific testing (legal and healthcare), Intento stated, “GPT-3 translations are not up to par”. Similarly, Tencent found that ChatGPT fell short of Google Translate and DeepL for Medline abstracts and Reddit comments.
ChatGPT’s own judgment of its translation capabilities echoed the findings of Tencent, Intento, and Microsoft: “Yes, you can use ChatGPT for translation…However, it’s important to note that ChatGPT’s translation ability may not be as accurate or precise as a specialized translation tool or a professional translator. Additionally, ChatGPT may not be able to translate certain technical or domain-specific terms accurately. Nonetheless, ChatGPT can be a useful tool for basic language translation tasks.”