How would you feel if your letter of resignation were posted online? Or sensitive parts of your employment contract? Or details of that M&A deal you have been working on with an investment bank? Thousands of people are about to find out unless Translate.com fixes its website and gets in touch with Google to delete what must be millions of indexed pages containing highly sensitive data.
Translate.com’s website offers a free machine translation service powered by Microsoft Translator. Because the site’s highly coveted domain attracts heavy web traffic, thousands, if not hundreds of thousands, of unsuspecting users looking for quick machine translation found their confidential data exposed on the internet.
The massive privacy breach was first uncovered on September 3, 2017 by Norwegian news agency NRK, which reported that employees of state-run oil giant Statoil had “discovered text that had been typed in on [Translate.com] could be found by anyone conducting a [Google] search.” Their reaction: “Wow, what is this?”
Anyone doing the same simple two-step Google search will concur. A few searches by Slator uncovered an astonishing variety of sensitive information that is freely accessible, ranging from a physician’s email exchange with a global pharmaceutical company on tax matters to late payment notices, a staff performance report from a global investment bank, and termination letters. In all instances, full names, emails, phone numbers, and other highly sensitive data were revealed.
An expert contacted by NRK further uncovered “plans of workforce reductions and outsourcing, passwords, code information, and contracts.”
NRK contacted Translate.com, which explained that “they openly state that all texts being sent to the company in order to improve the quality of the translations.” NRK further quoted Translate.com’s support team as saying, “Some of these enquiries were indexed by Google, so now we offer a simple solution for those who wish to remove these translations when they appear in a search engine.”
As the news spread across Norway and other Scandinavian countries, companies began to react, with the Oslo Stock Exchange blocking access to Translate.com and, interestingly, Google Translate.
Translate.com likely feels protected from legal challenges because of a clause in the fine print of its Terms & Conditions stating that while they “will use reasonable measures to protect any content you provide to us for the purpose of completing the Services,” they “cannot and do not guarantee that any information provided to us by you will not become public under any circumstances. You should appreciate that all information submitted on the website might potentially be publicly accessible.”
Who is behind what must be among the language industry’s top five domain names? Back in December 2015, we published a brief history of the domain name Translate.com, which was first registered in 1996. It changed hands twice, and in both instances it was bought by a language service provider: Benemann Translation Center (later acquired by RWS Polyglot in 1998) and ENLASO Enterprise Language Solutions in 2004. The domain name’s current owner, Chicago-based Emerge Media, purchased it in 2012.
Slator’s most recent coverage on Translate.com was a VP hire in January 2016. The VP confirmed they were on a recruitment drive and planning to launch an enterprise website, but there are no apparent developments on that front as of yet. The VP has since left the company.
Emerge Media CEO Anthos Crysanthou has been described as a domain investor who appears to “flip” domains for a profit or try to build a business out of them. Some of the domains the company owns include:
- Podcasts.com – a collection of podcasts
- Directions.com – supposedly a hodgepodge of GMaps and Tripadvisor; currently down
- Bands.com – the domain is down; supposedly for music
- Information.com – very nebulous information organizer with some major sections down
- Womens-health.com – tried to do the usual women’s online magazine format but has since pivoted to become a women’s forum, possibly so it can stay afloat with user-generated content and organic traffic
- MuchGames.com – online game directory like Y8.com
Slator reached out to Emerge Media for comment but has not received a response as of press time.
The breach highlights the need for the language industry to ensure that customer information is kept secure and confidential, a recurring concern in industry marketing material, sales pitches, and client discussions.
Organizations and individuals alike should check whether their confidential information has been exposed and take measures accordingly.
Update: In a blog post dated September 6, 2017, Translate.com attempts to justify the breach by pointing out that “there was a clear note on our homepage stating: ‘All translations will be sent to our community to improve accuracy’ and that ‘some of these requests were indexed by search engines such as Google and Microsoft at that time.’” Finally, they say that people can request that translations be removed by emailing firstname.lastname@example.org.
Hat tip to Anne-Marie Colliander Lind from Inkrease for alerting us to this story.