Industry standards for Translation Memory sharing have caught the eye of public administrations. Lower costs and faster translation and project turnaround have allowed companies to reap the benefits of data sharing in B2B relations. The European Union’s aims to embrace technological development and foster a digital single market have led to recent investments in digital infrastructure, the most recent being a contract awarded to a pan-European data-sharing solution named NEC TM.
Generally speaking, technological developments and breakthroughs originate in the private sector. Market pressure and the need to stay competitive encourage aggressive investment in technology, which in turn spurs innovation as a way to boost productivity. However, with the public sector seeking to bridge data gaps and enhance its own automated systems, NEC TM, with its goal of providing a fast, cloud-based infrastructure for efficient data sharing, seems a worthwhile investment.
NEC TM is led by global language and technology company Pangeanic and its AI unit PangeaMT, with Latvian technology provider Tilde and Croatian LSP Ciklopea as partners. Spain’s State Secretariat for Digital Advancement Agency, SEAD, is also a member. As an implementing body of government policies, SEAD will lobby for the early adoption of the NEC TM platform in Spain. Under the project, unexploited national bilingual assets across member states will be organised, with the aim of using Translation Memories both as shareable data and as general data for machine learning. This will eventually lower translation costs for public administrations at national level and for the EU as a whole.
With its 24 official languages, it is no surprise that the EU and its member states are huge buyers of translation services, purchasing many millions of euros’ worth of translation contracts annually. The problem is that a large number of these contracts do not specify that translation service providers should return Translation Memories to the contracting body, which is a huge loss of potential data for the public sector. With data collection for machine learning high on the European agenda, an effort to pool and centralise national and European resources will be invaluable.
Following industry best practices, where TM sharing is common (if not contractual), Translation Memories will be gathered from previous national contract awards to the private sector across member states. Most notably, the NEC TM consortium will provide a solid framework for general data sharing that public administrations can adopt within their contracts as future policy. Under this framework, fuzzy analysis, which measures how closely new text matches existing Translation Memories, will be run before a translation contract is issued and will be explicitly detailed within public sector translation RFPs.
Another crucial aspect of this programme is its drive to promote better data-sharing practices. Currently, public administrations lack awareness of the benefits of open data and of how language data, specifically TMs, can be capitalised on as a resource for AI. Through its pan-European data-sharing awareness campaign and its two forthcoming White Papers, the programme aims to engage public administrations across the whole of the EU. The White Papers will serve as proof of concept and seek to expose the costs of public translation contracts from the past four years, along with potential savings. The study will consider the whole of the EU, examining each country at three levels of administration (national, regional, local). To maximise its potential, the NEC TM project will pursue collaboration with other EU initiatives such as ELRC-Share, a repository used for documenting, storing, browsing and accessing language resources.
Undoubtedly, Artificial Intelligence and Machine Learning have become key drivers of global economic development. The European Commission, through its Connecting Europe Facility (CEF), continues to invest in projects that complement data centralisation and sharing efforts, with the aim of strengthening AI-driven digital economies and European integration. Spreading awareness of the importance of data collection, efficient data-sharing practices and the re-use of data for machine learning is therefore invaluable. The NEC TM project is an important step in this direction.