Barely two years after bursting onto the translation tech scene, neural machine translation (NMT) is all the MT community is talking about. Microsoft, Google, Facebook, and other large technology companies have all transitioned to NMT, as have the European Patent Office and the World Intellectual Property Organization. Even end-buyers are starting to build their own systems based on open-source models.
NMT systems are data- and power-hungry. In late August, Chinese internet company Sogou invested in translation data provider UTH to secure a large corpus of quality data.
Against this backdrop, enter DeepL, an online machine translator developed by the founder of Linguee (the online dictionary/translation database that typically ranks within the top three organic Google results when doing a search for an English translation of a German word). Linguee’s CEO is former Google researcher Gereon Frahling.
Not that other MT providers are prone to excess modesty when launching their products, but Germany-based DeepL comes out of the gate with guns blazing.
According to DeepL’s press material, the system runs on the world’s 23rd largest supercomputer in Iceland (for cheap power), where it can translate a million words in under a second and achieve record BLEU scores. And DeepL isn’t done yet. It also claims that its “revolutionary neural architecture” makes it the “most accurate and natural-sounding machine translation tool.”
DeepL goes on to say that human translators preferred DeepL in a blind test by a factor of 3:1 when compared against competing systems from Google, Microsoft, and Facebook.
Slator reached out to two leading machine translation experts to get their take on DeepL. We also contacted DeepL but have not received a response as of press time.
John Tinsley, CEO of Iconic Translation Machines, said: “These are certainly bold claims but there’s not really a lot of information about how they’ve carried out their evaluations to allow anyone to substantiate them at this time. I’ll be curious to see more when they release their API in the coming months. I’m sure some of the MT research groups will be interested to apply some academic rigour to the evaluations. It’s also quite impressive to have the 23rd most powerful supercomputer in the world…”
Kirti Vashee, an independent technology and marketing consultant and publisher of eMpTy Pages, commented on the launch: “I think they are using well understood and public test sets and claiming that they have better results. To my view they are slightly better, but far from revolutionary. Anyway, being better is preferred to being worse. The graphics are somewhat over-dramatic based on the actual facts, but they do have a deep knowledge of the data, and Linguee is one of the best big data implementations of language data, so this will definitely be a challenge to other MT players in the market.”
Vashee continued, “It will depend on how much customization they allow for the enterprise market, but they already have enough traffic from Linguee (ad revenue) to heavily subsidize this effort and will put pressure on the marginal (small) MT players, I suspect.” Vashee concluded that it was too early to say more until DeepL releases further details.
TechCrunch, meanwhile, ran a quick test and came away impressed, calling DeepL’s German-English translation of a news article “more accurate and nuanced than any we’ve tried.”
Slator did its own, wholly unscientific experiment, pitting DeepL against Google Translate by using three short paragraphs taken from a Bloomberg article on an antitrust probe against Apple in China. However, we used the typically harder English-into-German direction.
Our initial impression is that DeepL is indeed somewhat more fluent for shorter sentences. One translation that particularly stood out was DeepL’s translation for this English sentence:
“The review is preliminary and Chinese antitrust agencies usually review such information before deciding whether a official probe is needed” (translation in image below).
DeepL’s translation was correct and indeed much more fluent for this particular sentence than Google Translate’s. Meanwhile, no Google-translated sentence seemed unambiguously better than its DeepL equivalent.
But here’s the caveat: in longer sentences, DeepL and Google Translate both break down and produce something that is near-impossible to understand as a stand-alone German sentence without having read the original English. That said, NMT is known to struggle with longer sentences.
Still, Kirti Vashee is right in concluding that DeepL is “definitely one to watch.”
Join us at SlatorCon New York on October 12, 2017, and get an update on the state of the art in neural machine translation from one of the leading researchers in the field, Kyunghyun Cho, Assistant Professor of Computer Science and Data Science at New York University.