Now Europeans Get to Enjoy Google Bard’s Linguistic Overconfidence

Google Bard Translation

Google’s large language model (LLM) Bard is now available across Europe and in Brazil, supporting more than 40 languages — a more than tenfold increase since Bard entered the public LLM scene in May 2023 with capabilities in English, Japanese, and Korean. 

In a July 13, 2023 blog post, Bard Product Lead Jack Krawczyk and VP of Engineering Amarnag Subramanya announced the update as “Bard’s biggest expansion to date.”

In addition to the dozens of languages now covered by Bard, the LLM is now able to read responses aloud in the same range of languages, using advances in text-to-speech (TTS) technology.

“This is especially helpful if you want to hear the correct pronunciation of a word or listen to a poem or script,” the blog post stated. “Simply enter a prompt and select the sound icon to hear Bard’s answers.”

Google Lens has also been embedded in Bard; users can upload an image, which Bard analyzes and incorporates in its response, although this feature is currently available only in English. 

“We are committed to rolling out Bard responsibly,” Krawczyk and Subramanya wrote. “Some features may not be available in all countries, territories, and languages, but we will work on expanding availability over time.”

Bard displays greater self-awareness of its capabilities and limitations, accurately telling users that it can translate between 40 languages, which it then lists. It notes, however, that it is still under development and always learning new languages. 

“If you need to translate a language that is not on this list, please let me know and I will try my best to help you,” Bard adds.

It Gets (Somewhat) Better

Bard successfully translates from English into languages specifically listed as covered, including German, French, and Russian, providing not only accurate output but also a line-by-line “breakdown” of each translation.

Translations without English as a pivot language are rougher and sometimes include explanations that do not quite add up. Walking users through a Serbian-Arabic translation, Bard incorrectly states that the word “slana” means “salty” in both the source and target languages. 

Similarly, instead of listing individual translations for specific words, Bard enumerates a number of words that were “all translated as their respective Arabic equivalents.” And for languages written from right to left, Bard’s punctuation is less than perfect.

The quality of Bard’s TTS capabilities — namely, the “naturalness” of the voice — also varies by language. Bard does not seem to be able to read aloud text in a mix of languages — yet.

One constant since Bard’s foray onto the scene: The LLM is still confident in its ability to subtitle a movie, and even offers to provide users with feedback on the quality of their own subtitles.

When asked to transcribe audio from a link to a video, Bard comes close to providing relevant output.

Trained Human Reviewers

In several instances, when provided a YouTube link, Bard seemed to understand that the sketch in question was related to Saturday Night Live, before going off on its own tangent. 

Rather than transcribing a parody pizza commercial, Bard returned dialogues between real-life Weekend Update personalities discussing the Supreme Court’s decision to overturn Roe v. Wade.

The back-and-forth could actually have been funny, given the right comedic timing and body language, but it turns out Bard’s skit was offered up in earnest, as evidenced by Bard’s narrative retelling of the dialogue.

“The camera cut to black, and the two men walked off the set. They knew that they had a lot of work to do, but they were determined to fight back against this injustice,” Bard concluded.

Human reviewers, tasked with processing conversations with Bard as part of the “model improvement process,” also have their work cut out for them.

“Our trained human reviewers look at conversations to assess their quality related to the input prompt and determine if Bard’s response is low-quality, inaccurate, or harmful,” Bard’s support page states. “From there, trained evaluators suggest higher-quality responses in line with a defined set of policies, and these are then used as fine-tuning data to provide Bard a better dataset to learn from so it can produce improved responses in the future.”