February 21, 2019
How Language Featured at the 2019 Applied Machine Learning Days
From January 26–29, 2019, the Swiss Federal Institute of Technology in Lausanne (EPFL) hosted over 2,000 participants at the 2019 edition of its Applied Machine Learning Days (AMLD).
Co-organized by EPFL Professors Marcel Salathé, Martin Jaggi, and Bob West, AMLD 2019 ran nearly 20 tracks on a wide range of areas such as AI & Cities, AI & Finance, AI & the Molecular World, and AI & Language.
The AI & Language track was organized and hosted by Adam Bittlingmayer, Co-founder of machine translation risk prediction startup ModelFront. The track featured more than a dozen researchers, engineers, and startup CEOs and CTOs in back-to-back, 10-minute presentations and panel discussions. The discussion centered around four main themes: (1) representations, interpretability and visualization, (2) engineering and tools, (3) machine translation, and (4) products and startups.
Asked why AMLD 2019 decided to run a language track, Bittlingmayer told Slator that “language is an application area, but it’s also a core field of machine learning, like vision. Unlike vision, it’s something only humans do. It’s highly intertwined with human intelligence. There are great research conferences like NIPS and EMNLP, but people who are actually building and launching something want a more efficient format and a balance of research and engineering.”
Indeed, in contrast to EMNLP’s near-exclusive focus on academic research, AMLD lived up to its name and featured numerous startups and corporations applying and commercializing the fast-growing body of research emerging from academia.
The founder of OTO.ai, Nicolas Perony, for example, explained how the company is working on a system that builds on the latest advances in natural language processing (NLP) to assist call center agents during customer calls.
Corti.ai Co-founder and CTO Lars Maaløe demonstrated how their AI listens to and learns from emergency calls, and “instantly matches live audio with thousands of past calls, supplementing emergency dispatchers with superhuman pattern recognition.” AI-powered triage, so to speak.
What’s Your AWS Bill?
Before Corti and OTO, representatives of two well-known language industry startups took to the stage. Lilt’s Director of Research Joern Wuebker used his 10 minutes to walk participants through the Silicon Valley startup’s vision of enabling translators to work more intuitively with machine translation. He also described how the company builds its machine learning models on top of the TensorFlow framework.
Representing another well-funded language industry startup, João Graça, Co-founder and CTO of Unbabel, talked about why they decided to focus exclusively on MT-supported human translation of customer service content. Graça also touched on the buy-versus-build question in today’s fast-evolving tech ecosystem (“What’s your stack like? What’s your AWS bill? What do you buy and what do you do yourself?”).
“AMLD is first and foremost a machine learning event, and both Lilt and Unbabel are machine learning companies at their core, and were since day one, across the founding team,” language track organizer Bittlingmayer explained when asked why they picked the two companies. He added, “They’ve also reached a stage where they can share insights.”
“The space is growing and the playing field has really opened up. There are now great translation companies in China. There are also Yandex and DeepL in Europe. Companies like Facebook and Amazon also have their own MT. There are also interesting startups like Intento operating up the stack,” Bittlingmayer said.
Which areas in language-related AI does Bittlingmayer think will attract the most media and investor attention in 2019, and does he see the buzz around neural machine translation continuing? “Soon we will refer to NMT as simply ‘MT’, just like we did for SMT,” he said.
According to Bittlingmayer, “Jakob Uszkoreit [Google Brain Scientist and a presenter in the Language track] is very right when he says that soon there will be more than just text input. For translation that means the big shift will be when we stop training and evaluating models line by line, but in context — whether that means the input includes the previous line, the whole document, image, audio, or something else.”
He continued, “Neural approaches also introduce new problems for text generation tasks like translation, for example, random output, misleading fluency, and catastrophic forgetting. So I think we will see focus on quality estimation, confidence and risk.”
“That’s the obvious way to fail less on stylistic consistency, anaphora resolution, basic semantic ambiguities, and the AI-hard reasoning challenges highlighted by Vered Shwartz [NLP Lab, Department of Computer Science, Bar-Ilan University] and her lab,” Bittlingmayer said.
Looking to future AMLD editions, Bittlingmayer hopes to expand the focus on translation and introduce a full-day track dedicated to machine translation.