Google Says PaLM 2 Beats Google Translate in Machine Translation


A little over a year after Google launched its massive Pathways Language Model (PaLM), Zoubin Ghahramani, Vice President at Google DeepMind, announced the release of PaLM 2 in a blog post on May 10, 2023.

The announcement was echoed by Google CEO Sundar Pichai in a series of tweets characterizing the large language model (LLM) as outperforming its predecessor in natural language generation, translation, and reasoning.

Google stated that this new version of PaLM is also faster and more efficient than its predecessor and that over 25 new products and features are now enabled by the LLM. For the most part, those new products are versions of the model itself, trained on specialty subjects like medicine (Med-PaLM 2) and cybersecurity (Sec-PaLM).

The training corpus is much larger than the one used to train PaLM and has a higher percentage of non-English data, which enhances multilingual tasks such as translation and multilingual question answering.

In All Locales

Regarding machine translation specifically, the LLM's technical report states that PaLM 2 outperforms both PaLM and Google Translate overall, based on tests run on WMT21 translation datasets and scored by human evaluators. "We observe that PaLM 2 improves not only over PaLM but also over Google Translate in all locales," the team said in the technical report.

The results are detailed in section 4.5 of the technical report and cover translation to and from English for Chinese and German. The human evaluators were professional translators (seven for English into German and four for Chinese into English). The quality model used an error-weighting scale that assigns 5 points for each major error, 1 point for each minor error, and 0.1 points for each minor punctuation error.
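The weighted error scale described above can be sketched in a few lines of Python. This is an illustrative reconstruction, not code from the PaLM 2 report: the category names and the helper function are assumptions, and only the weights (5, 1, and 0.1) come from the article. Under such a scheme, evaluators tally errors per translated segment and a lower weighted total indicates better translation quality.

```python
# Illustrative sketch of a weighted error score for human MT evaluation.
# Weights follow the scale described in the technical report; category
# names and the helper function are hypothetical.

ERROR_WEIGHTS = {
    "major": 5.0,               # each major error counts 5 points
    "minor": 1.0,               # each minor error counts 1 point
    "minor_punctuation": 0.1,   # each minor punctuation error counts 0.1 points
}

def weighted_error_score(error_counts):
    """Return the weighted error total for one translated segment.

    error_counts maps an error category to the number of errors
    an evaluator found in that category. Lower scores are better.
    """
    return sum(ERROR_WEIGHTS[category] * count
               for category, count in error_counts.items())

# Example: a segment with 1 major error, 2 minor errors,
# and 3 minor punctuation errors.
score = weighted_error_score({"major": 1, "minor": 2, "minor_punctuation": 3})
print(round(score, 1))  # 7.3
```

Averaging these per-segment scores across a test set then gives a single quality number per system, which is how two systems such as PaLM 2 and Google Translate can be compared head to head.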


Even though PaLM 2 improves upon its predecessor in areas such as gender agreement for high-resource languages, issues remain. One of those issues, noted in the technical report, is "potential misgendering" in zero-shot translation (i.e., translation without further training or fine-tuning).

Researchers also recorded lower gender agreement scores when translating into Telugu, Hindi, and Arabic with PaLM 2.

In Case the Size of the PaLM Matters

PaLM 2 also comes with something of a novelty: it is available in four sizes. The submodels are, from smallest to largest, Gecko, Otter, Bison, and Unicorn. The technical report also mentions a "PaLM 2-L" as the largest model in the PaLM family.

PaLM 2, on the other hand, is described as "significantly smaller than the largest PaLM model" at 14.7 billion parameters (in its original version, PaLM was a massive 540-billion-parameter model). The model was pre-trained on multiple data sources, including web documents, books, code, mathematics, and conversational data.

A lot is happening in generative AI at Google: on the same day as the PaLM 2 announcement, the company revealed that it is also working on another LLM, called Gemini.
