Google Translate Not Ready for Use in Medical Emergencies But Improving Fast — Study

In recent years, communication between medical professionals and patients with limited English proficiency (LEP) has attracted attention and big bucks from a range of players across the US, from Google’s USD 100m investment in telehealth platform Amwell to AMN Healthcare’s USD 475m acquisition of remote interpreting provider Stratus Video.

As first covered by The Verge on March 9, 2021, a new study published in the Journal of General Internal Medicine has now shifted the focus from interpreting to translation. While general materials, such as information about medical conditions and diagnoses, are typically translated in advance into commonly spoken languages, patient-specific discharge instructions present a gap that is often bridged by machine translation.

The study is a joint effort between Dr Lisa C. Diamond of Memorial Sloan-Kettering Cancer Center in New York and, from Olive View-UCLA Medical Center in California, Drs Breena R. Taira and Vanessa Kreger and nurse practitioner Aristides Orue. Their goal was to objectively assess the accuracy of Google Translate for discharge instructions given to patients leaving the ER.

As the paper explains, past research on Google Translate in medical contexts has produced mixed results about usability, depending on the languages studied and the latest version of Google Translate’s algorithm.

An article published in JAMA Internal Medicine in 2019 concluded that, for Spanish and Chinese, Google Translate could supplement but not replace written English instructions (interpreted for the LEP patient), and should include a warning about potentially inaccurate translations.

The UCLA / Sloan-Kettering study differentiates itself from others in a few key ways. The team deliberately analyzed translations into both widely spoken languages (Spanish, Chinese, Vietnamese, Filipino, and Korean) and languages of lesser diffusion that are commonly spoken in their area and thus likely to be encountered in the ER (e.g., Armenian and Farsi). Since Google Translate improves based on user feedback, it performs differently for languages of lesser diffusion versus languages with many users worldwide.

Another key difference is the selection of reviewers. The 20 volunteers who evaluated the English discharge instructions and translations were bilingual community members — without professional experience as linguists or as healthcare workers — which the researchers believe more accurately represents typical levels of patient comprehension of medical text. (Most other comparable studies assign this work to translators.)

Translators and Bilinguals Think Alike

The volunteers analyzed 20 free-form written patient discharge statements frequently used in the ER, and the corresponding translations into seven languages, for a total of 400 translations evaluated. Although mean scores for fluency, adequacy, meaning, and severity were high, they varied significantly by language.

“Overall, [Google Translate] accurately conveyed the meaning of 330/400 (82.5%) instructions examined but the accuracy varied by language from 55 to 94%,” the authors wrote, describing some of the errors as “nonsensical.”

As expected, Spanish and Chinese translations were most accurate (94 and 82%, respectively), while Armenian and Farsi had accuracy rates of 55 and 67.5%.

The bilingual reviewers also pointed out several language-specific issues that might further impede understanding, such as the different writing systems used for traditional versus simplified Chinese, as well as differences among variants of Farsi, including Dari and Tajik. Notably for Farsi, which is written from right to left, Google Translate initially transposed the text from left to right, rendering the translations illegible.

The accuracy rates for Chinese and Spanish translations, based on community volunteers’ assessments, nearly matched those established by professional translators in the 2019 JAMA Internal Medicine study. As the researchers put it: “Our accuracy rates for these two languages as assessed by volunteers from the community were almost identical […] to those of professional translators.”

The researchers further noted: “This is important information for future work in this area as the difference between patient perception of machine translations and a professional translator’s perception has been an ongoing question.”

Citing the inconsistent performance between languages, the authors concluded, “Although the future of written translation in hospitals is likely machine translation, [Google Translate] is not ready for prime time use in the emergency department.”

For the time being, best practice relies on professional interpreters.