EU Awards USD 4.8m in Neural MT Grants for Low-Resource Languages

On February 27, 2019, the Connecting Europe Facility Telecommunications (CEF Telecom) sector released the list of selected proposals in the areas of Automated Translation, eDelivery, and eInvoicing for its 2018 work program. In total, over EUR 4.3m (USD 4.8m) has been earmarked for five separate project proposals on Automated Translation.

CEF Telecom opened a call for proposals in July 2018, making available a maximum grant of EUR 5m per selected project. The call for proposals aimed to further improve the capability of the European Union’s neural machine translation (NMT) facility, eTranslation, in translating low-resource languages.

Selected proposals are expected to implement NMT engines, collect more training data for high-priority, low-resource languages, and assist in the integration of the eTranslation facility into the EU’s other online services that may require multilingual functionality.

Automated Translation

CEF Telecom reportedly received 35 project proposals in response to the call and ultimately selected five for Automated Translation, three for eDelivery, and nine for eInvoicing. The total proposed maximum funding allocated for all 17 exceeds EUR 10.3m (USD 11.6m).

The five project proposals selected for the Automated Translation section of the work program and the maximum amount of funding earmarked for each are as follows:

Proposal | Coordinated by | Recommended Funding (EUR)
PRINCIPLE – Providing Resources in Irish, Norwegian, Croatian and Icelandic for Purposes of Language Engineering | Dublin City University | 1,138,781
OCCAM | Brno University of Technology | 973,894
EuroPat: Unleashing European Patent Translations | University of Edinburgh | 695,890
Continued Web-Scale Provision of Parallel Corpora for European Languages | University of Edinburgh | 889,649
Translation Automation Services for EU Council Presidency | Tilde | 620,800
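
As a quick sanity check, the five recommended amounts listed above can be summed to confirm the "over EUR 4.3m" figure. The short Python sketch below does only that; the dictionary keys are shortened labels chosen here for illustration, not official project identifiers.

```python
# Recommended funding per selected proposal (EUR), as listed in the article.
# Keys are shortened, illustrative labels.
grants_eur = {
    "PRINCIPLE": 1_138_781,
    "OCCAM": 973_894,
    "EuroPat": 695_890,
    "Web-Scale Parallel Corpora": 889_649,
    "EU Council Presidency": 620_800,
}

# Sum all five awards and find the largest single one.
total_eur = sum(grants_eur.values())
largest = max(grants_eur, key=grants_eur.get)

print(f"Total: EUR {total_eur:,} (about EUR {total_eur / 1e6:.1f}m)")
print(f"Largest award: {largest} (EUR {grants_eur[largest]:,})")
```

The sum comes to EUR 4,319,014, in line with the headline total, with PRINCIPLE receiving the largest share.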

Four language service and technology providers are involved in the listed projects: Iconic Translation Machines, Prompsit, Omniscien Technologies, and Tilde.

PRINCIPLE is a consortium of Dublin City University’s ADAPT Centre, Dublin-based MT provider Iconic Translation Machines, the National Library of Norway, the University of Zagreb, and the University of Iceland. According to an article on the Iconic website, the consortium partners “will collaborate closely with government agencies, national public administration bodies and data holders in Croatia, Iceland, Ireland and Norway to ensure that the pooled language resources appropriately cover the technology needs.”

The “Continued Web-Scale Provision of Parallel Corpora for European Languages” proposal is a continuation of previous, similarly titled, projects by the same consortium, which includes the University of Edinburgh, the Universitat d’Alacant, language industry organization TAUS, Spain-based ICT company Prompsit Language Engineering, and Singapore-headquartered language technology company Omniscien Technologies.

Meanwhile, Latvian MT company Tilde is responsible for Translation Automation Services for the EU Council Presidency, which provides the eTranslation platform with NMT for Bulgarian.

Millions of Euro into MT

Launched in November 2017, the eTranslation facility is the NMT replacement for MT@EC, the EU’s previous statistical MT solution. CEF said eTranslation is “trained using the vast Euramis translation memories, comprising over 1 billion sentences in the 24 official EU languages produced by the translators of the EU institutions over the past decades.” Euramis stands for “European advanced multilingual information system.”

MT plays a pivotal role in the EU’s efforts to overcome language barriers in its vision of a Digital Single Market. According to the Language Technologies policy of the EU’s Digital Single Market strategy, current MT solutions available on the market “usually don’t reach the required levels of quality, or only for [a] limited number of languages, text types or topics.”

To address this, Horizon 2020, the biggest EU research and innovation program with a budget of EUR 80bn available from 2014 to 2020, has been funding various MT undertakings, such as a EUR 3m project for the automated translation of Massive Open Online Courses.

At the same time, CEF “addresses the additional challenge of creating a complete infrastructure for language resources and processing tools.” CEF has funded efforts such as a EUR 5.8m initial tender to build a standalone Automated Translation service on top of what was then MT@EC, a EUR 1.9m project to customize MT for state and regional authorities, a EUR 2.4m research project for automated translation and, now, this EUR 4.3m for low-resource languages.