Why Translating Chinese User Reviews May Disappoint Danish Tourists

May 11, 2016

Academia ·

by Marion Marking


User-generated content (UGC) means big business for machine translation providers and language service providers trying to capture non-critical content where only rock-bottom rates will do.

Younger players like Gengo, Unbabel, and KantanMT, as well as more established LSPs like Welocalize, use content marketing to make the case that UGC deserves translation.

Knowing how a reviewer’s language influences a user rating for a tourist attraction and determining the relevance of that rating to speakers of other languages should have practical benefits for both buyers and vendors of UGC translations.

Now comes an interesting study by Scott Hale, Data Scientist at the Oxford Internet Institute, on how language affects ratings on TripAdvisor.co.uk. Slator reached out to Hale to learn more about the study.

“For many large user-generated content platforms, less than half the content is in English and many users do not speak English as a native language,” Hale points out in his study. He says future user growth and their contributed content will predominantly be in a language other than English given that “Internet-penetration rates are already high in most English-speaking countries.”

Hale says that, while the earliest TripAdvisor reviews from 2001 were all in English, from 2006 on “non-English reviews grew quickly,” with the top eight languages including French, Spanish, Danish, Italian, Japanese, Portuguese, and Russian (Figure 2).

Figure 2. Number of TripAdvisor user reviews on London attractions from 2001–2015 for top eight languages

Taking a look at all 516,641 reviews on TripAdvisor.co.uk of some 3,040 London tourist attractions in July 2015 (hotels and restaurants not included), he notes, “25% of all reviews of London attractions were not in English. Just over half of all attractions had at least one non-English review, and 175 attractions (6%) had more non-English than English-language reviews.”

Different Strokes

Hale’s study also shows that ratings (number of stars) vary with the reviewer’s language and, by extension, culture: some language pairs show closely similar star ratings, while others diverge widely.

For instance, German, Norwegian, and French star ratings “are strongly correlated” with those in other languages. “In contrast, ratings in languages such as Portuguese and Japanese are less strongly correlated,” the study points out (Figure 1).

Figure 1. Similarity with which users writing in different languages rated the same London attractions on TripAdvisor; each cell gives mean correlation of star ratings between a language pair.

Asked to interpret these findings, Hale tells Slator there are many reasons why star ratings vary across languages. “If a museum, for example, has an audio guide available only in the five big European languages, then visitors speaking one of those languages probably get more information and, ultimately, have a different experience” compared to those who cannot speak one of the audio-guide languages.

Hale admits, “I do not know in particular why Portuguese and Japanese are less correlated. One possible reason that Japanese ratings are less correlated with other languages is comprehension and comfort in using foreign languages.”

Another study by Hale, on bilingual editing of Wikipedia, showed that Japanese users were less likely to engage with foreign-language content than speakers of other languages.

Hale tells us, “Language can also capture elements of culture, of course, and it may be that people coming from different countries or cultures evaluate tourist attractions with different criteria.”

Stars and Outliers

Regarding outliers (Figure 1), such as language pairs with low star-rating correlations like Chinese-Danish (-0.11) and Polish-English (0.12), or with a high correlation like French-Spanish, Hale explains, “There are relatively few tourist attractions that are reviewed in Chinese and Polish; so, the correlations between these languages and others may simply be noise.”

As for French-Spanish, he says, “The data suggest French and Spanish tourists, in general, review attractions similarly.” These tourists, he says, may also “have similar a priori criteria” for evaluating an attraction, or information could have been available in French and Spanish at the tourist spot.

Hale describes TripAdvisor’s star ratings as a “form of non-personalized recommendation” and says that, based on this study at least, they “are fairly good at capturing the most common opinion.”

However, anomalies may occur in star ratings as well. Hale shares with Slator a survey of reviews in French and English for the Cirque du Soleil show in London. While French and English speakers generally agree on tourist attraction ratings, he says, “this is one example where they really disagree” (Figure 3).

Figure 3. French and English reviews on TripAdvisor for the Cirque du Soleil show in London

He cautions, though, that “there are far fewer reviews in French than English.” As Hale points out in his TripAdvisor study, “The average star rating (1–5 stars) is sensitive to the number of reviews. With a small number of reviews, a single rating can be over represented.”
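The arithmetic behind that sensitivity can be sketched in a few lines of Python. The ratings below are made up for illustration and are not taken from the study. The helper name is hypothetical.

```python
from statistics import mean


def mean_shift_from_one_review(ratings, new_rating):
    """Return how far the average star rating moves when one review is added."""
    return mean(ratings + [new_rating]) - mean(ratings)


# Made-up ratings: the same mix of stars, once with 3 reviews and once with 300.
few_reviews = [5, 5, 4]
many_reviews = [5, 5, 4] * 100

# A single 1-star review moves the small sample far more than the large one.
print(round(mean_shift_from_one_review(few_reviews, 1), 3))
print(round(mean_shift_from_one_review(many_reviews, 1), 3))
```

In this toy case, one unhappy reviewer pulls the three-review average down by almost a full star, while the 300-review average barely moves.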

Moreover, “Users may derive some utility from the star ratings of reviews in languages they do not read and possibly more from rough machine translations of the review text.” Machine translations for TripAdvisor are provided in part by Bangkok-based Asia Online, Slator has learned from sources.

Hale recommends that site designers get a handle on user-generated content to know “what to do when there are few or no reviews in a person’s preferred languages.” One way of doing that is “calculating the correlations between languages and countries,” and learning exactly how to deploy MT so star ratings will not be misleading.

Hale asks, “If there are few reviews in Finnish for a Finnish user, would it be better to show Swedish or English reviews as well, or no other reviews at all?”
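One way a site designer might act on such correlations is sketched below in Python. This is an illustration only, not something the study prescribes. The function name, both thresholds, and every number in the example are hypothetical.

```python
def pick_fallback_languages(user_lang, review_counts, correlations,
                            min_reviews=5, min_corr=0.5):
    """Suggest other languages whose reviews could be shown to a user.

    review_counts: {language: number of reviews for one attraction}
    correlations:  {frozenset({lang_a, lang_b}): star-rating correlation}
    Both thresholds are hypothetical defaults, not values from the study.
    Returns an empty list when the user's own language has enough reviews.
    """
    if review_counts.get(user_lang, 0) >= min_reviews:
        return []

    candidates = []
    for lang, count in review_counts.items():
        if lang == user_lang or count < min_reviews:
            continue
        r = correlations.get(frozenset((user_lang, lang)))
        if r is not None and r >= min_corr:
            candidates.append((r, lang))

    # Most strongly correlated language first.
    return [lang for r, lang in sorted(candidates, reverse=True)]


# Made-up counts and correlations for a Finnish-speaking user.
counts = {"fi": 1, "sv": 40, "en": 900, "zh": 12}
corr = {
    frozenset(("fi", "sv")): 0.8,
    frozenset(("fi", "en")): 0.6,
    frozenset(("fi", "zh")): 0.1,
}
print(pick_fallback_languages("fi", counts, corr))
```

Whether users actually benefit from seeing such fallback reviews is the open question Hale raises. The sketch only shows how a correlation table could feed the decision.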

Hale hopes to answer that question soon. He discloses, “I am currently working on behavioral experiments to understand how people respond to foreign-language reviews.”

By Marion Marking

Slator consultant and corporate communications professional who enjoys exploring Asian cities.
