Product launches are a dime a dozen. They often have little practical relevance beyond an immediate user base. The release by Systran of what it calls Pure Neural MT on a demo platform is somewhat different, however.
As far as we are aware, it is the first time access to neural machine translation for a variety of language combinations is offered to the general public. A clever marketing stunt. Others, such as Iconic Machine Translation, have announced upcoming NMT integrations into their commercial platforms.
Google’s NMT, meanwhile, was released to much fanfare recently, but the initial launch was restricted to Chinese into English. So, for the world’s six billion non-Chinese speakers, there was only so much fun they could have playing around with it.
Systran’s demo platform now offers over 30 language combinations with another half-dozen soon to launch. In a technical paper accompanying the launch, Systran outlines the genesis and inner workings of its NMT system, which Systran says is based on an open-source project initiated by a natural language processing group from Harvard University.
This project is implemented “on top” of the scientific computing library called Torch, whose “maintainers” call it “a scientific computing framework with wide support for machine learning algorithms that puts GPUs first.” Other such deep learning frameworks, which have gotten traction in NMT, are Theano and Google’s TensorFlow.
Systran, a Korean-owned company of French heritage, is clearly excited about the launch. Talking to Slator in early September 2016, CTO Jean Senellart said the company noticed the enthusiasm around NMT building up in academia and “decided to jump” around two years ago. He said they now have a dozen PhDs working on NMT in their Paris office, with an additional batch of engineers integrating the tech into the company’s commercial solutions.
So how good is the engine? Systran makes an effort to avoid the kind of hyperbole attached to Google’s NMT launch and tries to preempt critical comments by the translation quality police. The technology “will not replace human translators” and “does not produce translation which is almost indistinguishable from human translation,” the company says in a blog post.
“The most salient error comes from missing words or parts of sentence” (Systran paper on NMT)
In terms of quality, that all-elusive concept in translation, Systran’s NMT confirms findings seen in other models and trials. The system struggles with very short and very long sentences. It occasionally misses words or parts of a sentence. It also makes some rather strange errors (translating Steve King into Michael King when we tried it on English into German).
However, there is an actual improvement in fluency, even to the casual tester. For an in-depth review of the system, head over to Kirti Vashee’s post.
Systran CTO Senellart told Slator that there are indeed new types of mistakes and that German into English is actually lagging in terms of quality. Interestingly, he says they had better results with what he calls “very complicated languages,” as in English into Korean.