Following Facebook F8 2019, where a third of CTO Mike Schroepfer’s keynote was devoted to natural language processing (NLP) and, to a lesser degree, neural machine translation (NMT), Google and Microsoft did not have anything big to share regarding language technology at their recent developer conferences.
They did, however, announce some relevant developments in NLP and automated translation along the lines of future integration and expansion. Google I/O 2019 and Microsoft Build 2019 both took place the week of May 6, 2019.
Google I/O 2019
At I/O 2019, Google said it will add translation capabilities to Google Lens, which is basically a smart camera app with image-recognition capabilities. Android users can activate Lens via their smartphone cameras and use Google’s image-recognition technology in conjunction with its image search and general search capabilities to get information on whatever they point the camera at.
Lens already has optical character recognition (OCR), so integrating Google Translate seems like an obvious next step. The Google Translate app already does much the same thing, translating the text it reads in photos via OCR. Translation in Google Lens seems to offer a bit more, including a text-to-speech feature that lets Android users make their phone “read out” the text they want translated into their language.
Meanwhile, Google Assistant, which already comes packaged with Google Translate capability and, more recently, Interpreter Mode, is coming to the Waze GPS navigation app. Speaking of intelligent assistants, Google’s AI-powered intelligent customer service tool Duplex is coming to the Web, starting with narrower use cases like booking a rental car and buying movie tickets. Google will gradually roll out more features.
Finally, Google is adding live transcription and captioning capabilities to Android to assist the hearing-impaired. The tech stack is straightforward: NLP-based speech-to-text that allows the Android OS to listen to its surroundings and transcribe any speech it hears. No word yet on multilingual capability.
Microsoft Build 2019
Real-time transcription was also featured at Microsoft Build 2019. The company showcased its live transcription capabilities and how they can now learn industry-specific jargon for the medical and coding fields.
Microsoft previewed to a limited audience how its Conversation Transcription system can pick up multiple speakers even as they talk over one another, and can even use a combination of audio and video to identify who is speaking. Microsoft is currently partnering with providers such as Accenture, Roobo, and Avanade to commercialize Conversation Transcription.
In the smart assistants arena, Microsoft’s Cortana is being touted as a multi-domain, multi-platform helper, with a focus on a more conversational voice interface through the addition of something akin to short-term memory.
Cortana will consistently remember things about a conversation throughout the interaction. For example, if a user brings up a previously discussed meeting, Cortana will understand what is referred to and will be able to carry out commands or ask follow-up questions related to the meeting.
Meanwhile, the core language recognition technology that powers Cortana will also come to BMW’s Custom Virtual Assistant. During Build 2019, the car company demoed its virtual assistant, which other companies can also white-label for their own purposes. Additionally, third-party developers can take advantage of the conversational engine that powers Cortana through Microsoft Bot Framework and Azure services.
A little bit of NLP is also coming to Microsoft’s office applications. First, document authors accessing and editing the same document via the Microsoft 365 Fluid Framework will also be able to use translation capabilities at the same time. Second, Microsoft Word is getting a feature called Ideas, which is a grammar and style checker powered by machine learning.
Some commercial applications of Microsoft’s language technology were also on display, namely Chinese tech firm Cheetah Mobile’s global version of its translation device, CM Translator, which runs on Microsoft Azure Cognitive Services.