On December 19, 2016, a Monday, at exactly half past nine, the Twitterverse was alerted to the existence of the OpenNMT project over at the Harvard natural language processing (NLP) group.
The Harvard NLP group comprises researchers who cover areas as varied as “computational models for human language,” machine learning, deep learning, artificial intelligence, and the “intersections between computer science and linguistics.”
The group’s OpenNMT tweet was followed the day after by a wink at Google, which read: “#Google, we promise we are not #taking you on. Please keep on putting out awesome research / feeding my grad students.”
OpenNMT developer Yoon Kim is a Computer Science PhD candidate and member of Harvard NLP. Kim had previously earned a Master’s in Data Science from New York University, another Master’s in Statistics from Columbia University, and a baccalaureate in Math and Economics from Cornell.
Working on the project with Kim was his adviser, Alexander Rush, who runs the NLP group. Commercial machine translation provider Systran, which recently launched its own proprietary neural machine translation system, was also involved in the project.
What follows is Slator’s interview with Harvard NLP’s Alexander Rush and Systran CTO Jean Senellart on the OpenNMT project.
Slator: What motivated you to develop OpenNMT? How did this project come about?
Alexander Rush: The project is based on research software built by my graduate student Yoon Kim. We used the software in my lab to do research on improving translation systems and to teach graduate students. We happened to also put the software online for free, and Systran found it. It was useful for their products, and so they began to send us updates to the code. It is the kind of mutually beneficial relationship that open-source communities can produce.
Slator: What exactly is OpenNMT and what does it do?
Rush: Recently, there have been a series of advances in artificial intelligence (AI), leading to improvements in speech, image recognition, and game playing. In the area of natural language processing, these improvements have been most impactful in the area of translation, leading to models that significantly improve on the quality of machine translation.
OpenNMT is open-source software implementing this technology, roughly similar to Google’s proprietary system. It is software to learn models for machine translation. It takes in a corpus of aligned sentences from a source and target language, and learns a mathematical model—known as a neural network—to [perform] translation. That model can then be fed unseen source sentences and OpenNMT will translate them.
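To make that train-then-translate loop concrete, here is a deliberately tiny sketch in Python. It is not OpenNMT code and does not use OpenNMT's interface: the three-sentence German-English corpus, the function names `train` and `translate`, and the single-layer softmax model are all assumptions made for illustration, and the word-by-word output assumes both languages share the same word order. Real neural machine translation systems use far larger encoder-decoder networks trained on millions of sentence pairs.

```python
import math
from collections import defaultdict

# Hypothetical toy corpus: each source sentence is aligned with its translation.
CORPUS = [
    ("das haus", "the house"),
    ("das buch", "the book"),
    ("ein buch", "a book"),
]


def train(corpus, epochs=200, lr=0.1):
    """Fit a single-layer softmax model: weights[src_word][tgt_word]."""
    pairs = [(src.split(), tgt.split()) for src, tgt in corpus]
    tgt_vocab = sorted({word for _, tgt in pairs for word in tgt})
    # One row of weights per source word, all starting at zero.
    weights = defaultdict(lambda: dict.fromkeys(tgt_vocab, 0.0))
    for _ in range(epochs):
        for src, tgt in pairs:
            # Score every target word from the bag of source words.
            logits = {t: sum(weights[s][t] for s in src) for t in tgt_vocab}
            peak = max(logits.values())  # subtract the max for numerical stability
            exps = {t: math.exp(v - peak) for t, v in logits.items()}
            total = sum(exps.values())
            for t in tgt_vocab:
                # Cross-entropy gradient, summed over the words of the target sentence.
                grad = len(tgt) * exps[t] / total - tgt.count(t)
                for s in src:
                    weights[s][t] -= lr * grad
    return weights


def translate(weights, sentence):
    """Translate word by word, picking the best-scoring target word."""
    output = []
    for word in sentence.split():
        if word in weights:
            row = weights[word]
            output.append(max(row, key=row.get))
        else:
            output.append(word)  # unknown words are copied through unchanged
    return " ".join(output)


model = train(CORPUS)
# "ein haus" never appears in the training corpus.
print(translate(model, "ein haus"))
```

The point of the sketch is the workflow rather than the model: aligned sentence pairs go in, weights are learned from them, and the learned weights are then applied to a sentence the system has not seen before.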
Slator: What makes it different from the commercial solution Systran offers?
Jean Senellart: The core technology we propose to our users will be exactly the same as the one we are contributing to the OpenNMT project. Our business model is to build tailored [products] for our customers. [We] provide [a] complete translation workflow [with] more features (e.g., document filtering, coupling with other technologies like language detection, entity extraction) than just the core translation.
Slator: Can you give us a simple first use case for OpenNMT?
Rush: We released several example translation models (e.g., German-English). Anyone can download and run the model to experiment with neural machine translation. We publicized the project because we thought it was quite stable; but also with the hope that more people in the translation community would contribute back to further improve it.
“In theory, anybody could rent a server and train a model on available data, and we see some hobbyists doing just that” (Alexander Rush, Assistant Professor, Harvard School of Engineering and Applied Sciences)
Slator: What is your mid- to long-term goal for OpenNMT?
Rush: There are two main focuses. One, we want to keep the code up-to-date with all the new ideas published in the research community, such that the open-source software stays competitive with closed-source offerings (e.g., Google). For instance, my group recently developed a system for shrinking translation models so they can run much faster, and this was implemented in the software even before the paper was published.
Two, we want to try out more cutting-edge “translation” ideas. For example, we are implementing an extension to map from images to text using OpenNMT. This is a rather recent research idea that we hope to make more accessible.
Senellart: On Systran’s side, we want this project to contain all the best-of-breed features and ideas that are published by the research community, but also keep the code simple and fast, so it becomes a reference for anyone wanting to do more research or even create commercial applications.
Slator: Who do you see as early adopters of this technology?
Rush: Great question! In theory, anybody could rent a server and train a model on available data, and we see some hobbyists doing just that. In practice, we expect a mix of researchers studying how to improve translation and people in the industry looking to become familiar with new AI technology.
Senellart: We do expect some competitors quickly building products based on this technology—and this will, of course, be challenging for us. But at the same time, [it is] quite an achievement that will help develop the machine translation market and global awareness about the technology.