- Neural MT shows most promise for improving fluency of complex languages.
- Joint research results find neural MT has not reached the quality of SMT and hybrid approaches for certain languages and content types.
- Iconic to focus efforts on breakthrough customisation technologies for neural MT.
Dublin – May 23, 2017 – Iconic Translation Machines (Iconic), a leading Machine Translation (MT) software and solutions provider, today announces its participation in the 20th Annual Conference of the European Association for Machine Translation (EAMT) taking place in Prague, May 29-31. The prestigious event will be held at the Faculty of Mathematics and Physics at Charles University in Prague. The conference showcases the latest in academic MT research and development, and features presentations examining the best practices from the global localisation industry.
Iconic and the ADAPT Centre, a world-leading research centre based in Ireland, will present the findings of a joint research project carried out on Neural Machine Translation (NMT), including a comparative evaluation of NMT engines with Iconic’s existing custom production MT engines. Human evaluation was also incorporated into the study to compare and contrast the benefits and disadvantages of the various technologies.
The research findings suggest that the NMT results published to date give cause for great optimism, but that, based on human evaluations, there are still many cases where existing Statistical Machine Translation (SMT) and hybrid approaches achieve better quality for certain languages and content types.
The research was carried out as part of broader efforts in the EU-funded TraMOOC (Translation for Massive Open Online Courses) project, a Horizon 2020 collaborative project aimed at providing reliable MT for MOOCs. Results on assessments of NMT output on user-generated content from eBay were also described.
Iconic’s existing production MT engines are based on a proprietary Ensemble Architecture™ which combines elements of phrase-based, syntactic, and rule-driven MT, along with automated post-editing. The engines have been highly tuned for the patent domain, using multiple translation and language models and incorporating content-specific terminology.
The ADAPT/Iconic NMT engines were implemented using an ensemble of attention-based models trained on different combinations of in-domain and general data. Iconic’s NMT was customised to include a module for handling terminology, as well as other practical considerations such as domain and content adaptation for different document sections. The focus of the evaluation was on Chinese to English translation of patent information.
Key research findings include:
- NMT has the capability to greatly improve the fluency of MT output, particularly for complex language pairs.
- Hybrid/SMT engines, when highly tuned for specific content types, can still produce better output than NMT engines.
- NMT output contained fewer word order errors and fewer inflectional morphology errors in all target languages, leading to greater fluency.
- NMT engines produce peculiar errors, such as omitting parts of sentences, which are difficult to predict and resolve.
- Hybrid/SMT was stronger at handling terminology and producing error-free output for certain types of segments, e.g. patent titles.
These findings strengthen the belief that more research is required to address the practical limitations of NMT, as Dr. Sheila Castilho, Post-doctoral Researcher at ADAPT, points out:
“The findings show that, while NMT is really promising particularly in terms of fluency, there is still a lot of research and evaluation required for certain languages and domains before it can outright replace the current state-of-the-art.”
The results further support Iconic’s recent approach of incorporating NMT into its proprietary technology only in cases where it can provide significant benefit for commercial applications in the short term. Iconic combines the benefits of NMT with its own suite of existing technology components and expertise in order to overcome any NMT shortcomings.
NMT has demonstrated the capability to handle, by default, the complexities of languages that are difficult for MT, such as Korean and Japanese. However, it still offers less flexibility for customisation and, importantly, for directly addressing errors in the output.
“Iconic is now focusing significant efforts on developing new technologies for the customisation of neural MT,” said Iconic CEO, Dr. John Tinsley. “The capability to build customisation into the neural MT process will allow us to reach unrivalled quality levels and position us steps ahead of existing systems.”
About ADAPT Centre for Digital Content Technology
ADAPT, the Centre for Digital Content Technology, provides a partnership between academia and industry in the field of digital content technology, leading on ground-breaking innovations in areas such as localisation, social media analysis, multimodal interaction, intelligent content and media, and informal and formal learning. Funded by Science Foundation Ireland, the centre is led out of Trinity College Dublin and combines the world-class expertise of researchers at Dublin City University, University College Dublin and Dublin Institute of Technology. www.adaptcentre.ie
About Iconic Translation Machines
Iconic Translation Machines is a leading machine translation software and solutions provider specialising in custom solutions tailored with subject matter expertise for specific industry sectors including legal, life sciences and financial services. Iconic is the MT partner of choice for some of the world’s largest translation companies, information providers, and government and enterprise organisations, helping them to translate more content, more accurately and in less time, resulting in significant cost savings and increased revenue. Iconic is based in Dublin, Ireland. www.iconictranslation.com
Diane O’Reilly, Global Head of Sales & Marketing, Iconic Translation Machines