For the most part, Google Translate is very accurate. Since its inception in 2006, it has become one of the top-rated machine translation (MT) tools. It currently supports 133 languages, 24 of which were added in 2022. Accuracy varies depending on language pair and content type, though some reports show Google Translate reaching 94% accuracy.
Google’s 2016 shift to Neural Machine Translation (NMT) represented a turning point for output quality. The tech giant suggests that its Google Neural Machine Translation (GNMT) system reduced translation errors by over 60% for several major language pairs. Through “zero-shot translation”, it also eliminated the need to translate via English as an intermediary.
Accuracy is subjective: a text may require technical accuracy, natural-sounding accuracy, or both, or it may simply need to be ‘good enough’ rather than perfect. Various studies have examined how accurate Google Translate really is and how it compares to other MT tools, but the definition of ‘accurate’ differs from study to study.
Many automated evaluation metrics have been used for measuring MT output quality. While not perfect, the most commonly used is Bilingual Evaluation Understudy (BLEU). Others include NIST, Word Error Rate, and METEOR. Facebook patented its own alternative in 2019 and Meta proposed XSTS (a cross-lingual variant of Semantic Textual Similarity) in 2022.
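To make one of these metrics concrete, the following is a minimal sketch of Word Error Rate, assuming the common textbook definition: the word-level edit distance (substitutions, insertions, and deletions) between the MT output and a reference translation, divided by the number of words in the reference. The function name `word_error_rate` and the whitespace tokenization are illustrative assumptions, not the implementation used by any study cited here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Assumption: words are split on whitespace only, with no
    normalization of case or punctuation.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words so far
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing from output
                curr[j - 1] + 1,     # insertion: extra word in output
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    reference = "the cat sat on the mat"
    mt_output = "the cat sat on a mat"
    print(f"WER: {word_error_rate(reference, mt_output):.3f}")
```

A lower score is better, and because insertions are counted, the score can exceed 1.0 when the output is much longer than the reference. A metric like this only measures surface overlap with one reference translation, which is part of why it says little about whether the output sounds natural.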
Google Translate on Top?
A 2011 accuracy study on 51 languages found Google Translate performed strongly for European languages, but less so for Asian languages. Of course, that study is outdated now. A reevaluation in 2019 using the same text and metric showed a 34% improvement.
Google Translate is now considered one of the best in terms of reliability and accuracy, particularly for high-resource languages. Intento (2022) ranked Google Translate first above 18 other engines for almost all language pairs.
In DeepL’s evaluation (2020), its own MT tool consistently took first place for all language pairs, with Google Translate in second. It should be noted that DeepL’s lead over Google Translate is generally specific to European languages and to natural-sounding accuracy, since DeepL is typically better at preserving context and handling colloquialisms and slang. Although some wariness of bias is warranted, DeepL’s evaluation generally matches independent analyses.
Questions have been raised surrounding the use of Google Translate in specific settings. The current consensus is not to use Google Translate in medical, police, or creative translation settings.
The US Department of Health and Human Services has proposed a new rule outlining when and how MT may be used in healthcare situations. A study on using Google Translate in the ER corroborates other studies showing that Google Translate’s output for medical communications is not perfect, with significant variation in accuracy between languages.
A 2018 study established 92% accuracy for English to Spanish and 81% for English to Chinese. A 2019 analysis of translations from English to seven target languages found accuracy ranged from 94% for Spanish to 55% for Armenian.
In a 2022 study, reviewers preferred human translations over MT translations 85% of the time. Similarly, Slator’s 2022 SaaS Localization Report found that UI-related content where “context is critical” is usually translated by professionals. However, Google’s new translation service for game developers proposes machine translation for high-risk content, such as in-app text.
Good, but not yet Perfect
Unlike human translators, Google Translate cannot ask questions, understand irony, conduct research, or ensure completeness. While Google Translate might be the best of the bunch, depending on language pair and content type, the following issues still require rectification:
- word order;
- gender (where it is not explicit);
- tone (e.g., formal vs informal pronouns);
- confusion over word type (e.g., noun vs verb);
- language variants; and
- instances where elements do not follow the grammatical rules.
Google is continually developing its MT tool, announcing several new features in 2023, and although it may not be perfect, Google Translate will undoubtedly continue to learn and improve.