September 26, 2019
Google Resumes Human Transcription of Assistant Audio Content
Google has revived its transcription programs for Google Assistant, in which “human reviewers may listen to audio snippets [from users] to help improve speech technology,” according to a September 23, 2019 statement.
The statement, which focuses on Google’s beefed-up data privacy protections, explains that audio data from Google Assistant is not stored by default. Instead, users can opt in to help “improve the Assistant for everyone by allowing us to use small samples of audio to understand more languages and accents.”
Under Google’s new policy, audio data from existing users will not be included in any human review process unless those users reconfirm the setting on their devices. During the transcription process itself, audio recordings are not associated with any user account.
Google suspended its transcription programs in July 2019 after a reviewer leaked confidential Dutch audio data. Google was in good company. Fellow tech giant Apple discontinued its own transcription practices in August 2019.
Apple’s decision was prompted by a Siri grader leaking internal documents, which led to an unflattering article in The Guardian. The newspaper suggested that contractors were listening in on users (e.g., conversations between patients and doctors, private business negotiations) without user consent.
While Slator was not able to independently verify a connection, privacy issues may also have been behind the termination of a linguistic testing contract for Welocalize, which laid off 100 workers in August 2019.
Despite the fallout from these data breaches, some companies have capitalized on the race to improve AI by specializing in human-assisted audio transcription and review.
Startup Scale AI, whose valuation surpassed USD 1bn in a Series C round in August 2019, works with over 30,000 contractors to provide a range of services, including text classification, speech and voice transcription, and OCR (optical character recognition) transcription.
As for storing all this data, Seattle-based Jargon offers a content management system for voice applications. The platform received funding from Amazon in 2018 and raised USD 1.8m in seed funding in March 2019. And then there’s Australia’s Appen, of course, which has been the dominant player in this area for some time.