Stanford Zeroes in on Machine Translation Gender Bias, Intento Data in Annual AI Report


Stanford University’s Institute for Human-Centered Artificial Intelligence has published its annual report on artificial intelligence, highlighting the major developments and trends in the industry.

The 386-page behemoth (full report here) includes chapters on AI research and development, technical performance, ethics, governance, diversity, the state of AI education and the technology’s economic impact. 

The report emphasizes that “industry” (i.e., companies, as opposed to academic institutions or research collectives) has taken a dominant role in the development of machine learning models. In 2022, industry produced 32 “significant” models, compared to just three from academia.

The dominance of private industry in the development of machine learning models is due at least in part to the rising cost of training large language models (LLMs); the report notes that “these models cost millions of dollars to train and will become increasingly expensive with scale.”

Deep-pocketed public companies like Google, Microsoft and Meta spend billions on AI R&D each year, but the Stanford report notes that private investment in AI fell for the first time in a decade in 2022. Global private investment in AI declined 26.7 percent to USD 91.9bn, according to the report, with the total number of newly funded AI companies declining as well.

The report is extremely broad, touching on many topics and technologies relevant to the AI industry, but relies heavily on third-party data and barely scratches the surface on AI in the translation, localization, and language technology industry.

Selective Focus

When it comes to machine translation, the Stanford researchers selectively focused on Intento data showing that there were 54 independent machine translation services in 2022, compared to 46 in 2021, and left it at that. That number includes 45 that are offered commercially, four that are open-source, and five in preview.

For the first time since the report’s inception in 2017, it also includes a section on fairness in machine translation. This section focuses on gender bias, specifically the decline in translation quality when the correct English translation includes the pronoun “she.” The report relies on a 2022 study from Google researchers, which found that “language models consistently perform worse on machine translation to English from other languages when the correct English translation includes ‘she’ pronouns as opposed to ‘he’ pronouns.”

The issue of gender bias in machine translation is, of course, well known and has been written about extensively for years, but the proliferation of large language models like ChatGPT, and their growing impact on the translation and localization industry, is pushing the issue into the mainstream.

Recent efforts to address gender bias include the EU tying part of a EUR 20m tender for language technologies to controlling gender bias, Amazon’s new gender evaluation benchmark for machine translation, and Meta’s mobile localization infrastructure, which includes features to help deliver gendered translations more efficiently.

The report also included a discussion of speech recognition, or speech-to-text, technology, noting that many services can “seamlessly transcribe speech into writing.” The authors highlight progress in reducing the error rate for speaker recognition, the task of matching a voice sample to a particular individual and a potentially interesting use case for captioning services. According to the report, a model in 2022 achieved a 0.1% equal error rate on the VoxCeleb speaker recognition dataset, a 0.28 percentage-point improvement on the best result from 2021.
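For readers unfamiliar with the metric: the equal error rate (EER) is the operating point at which a verification system’s false acceptance rate (impostor trials accepted) equals its false rejection rate (genuine trials rejected). As a rough, self-contained sketch of how the metric is computed from trial scores (this is illustrative only, not the VoxCeleb evaluation protocol), consider:

```python
def equal_error_rate(scores, labels):
    """Approximate the equal error rate (EER): sweep every score as a
    decision threshold and return the error rate at the point where the
    false acceptance rate is closest to the false rejection rate."""
    genuine = [s for s, l in zip(scores, labels) if l]
    impostor = [s for s, l in zip(scores, labels) if not l]
    best_gap, eer = float("inf"), 1.0
    for t in sorted(scores):
        far = sum(s >= t for s in impostor) / len(impostor)  # impostors accepted
        frr = sum(s < t for s in genuine) / len(genuine)     # genuine rejected
        if abs(far - frr) < best_gap:
            best_gap, eer = abs(far - frr), (far + frr) / 2
    return eer

# Toy verification trials: higher score = more likely the same speaker.
scores = [0.9, 0.8, 0.75, 0.4, 0.3, 0.2]
labels = [1, 1, 1, 0, 0, 0]   # 1 = same-speaker (genuine) trial
print(equal_error_rate(scores, labels))  # perfectly separable -> 0.0
```

A 0.1% EER thus means that, at the balanced threshold, only about one in a thousand trials is misclassified in each direction.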