Machine Translation Left Unaddressed by EU in Proposed AI Legislation

European Union Regulation on Artificial Intelligence

The European Union released a long-awaited proposal on AI legislation on April 21, 2021. European Commission EVP Margrethe Vestager described the legal framework as “the first [of its kind] on the planet.”

The proposal would ban AI outright in only a handful of use cases: social credit systems, "subliminal" techniques that manipulate people's behavior in harmful ways, and general police use of real-time "remote biometric identification systems" in public places, although judges may approve exemptions.

The legal framework focuses primarily on AI systems considered high risk, with the potential to significantly impact people’s lives; for example, algorithms that determine credit scores or that control automated machinery and vehicles.

Many high-risk AI systems may already fall under existing EU product safety legislation. Under the proposal, they will also be subject to a series of requirements, such as using "high-quality" training data to avoid bias; incorporating "human oversight" into each system; and documenting, for both users and regulators, how the system works.

The AI systems will also have to be indexed in a new EU-wide database (perhaps inspired by the AI Incident Database). To add some teeth to the legislation, violators face fines of up to 6% of their global turnover or EUR 30m (USD 36m), whichever is greater.

The proposal’s broad guidelines do not mention machine translation (MT) explicitly, so language service providers (LSPs) and end-users alike may need to read between the lines to understand their potential future obligations.

Chatbots and other products of natural language processing (NLP) are considered "lower risk." As such, they must simply inform users that they are interacting with a machine.

Content generated by language models, such as OpenAI’s massive GPT-3, may be subject to similar requirements. Several writing tools have already been developed using GPT-3, including AI21 Labs’ Wordtune and an application by OthersideAI that expands bullet points into paragraphs.

It is not clear, however, whether the proposal applies to less interactive content, such as transcripts generated by automated transcription programs. In practice, many companies that use automated transcription already include a disclaimer to address possible typos and other errors.

Deepfakes: Not Banned, Just Regulated

The potential for risk grows with synthetic voice production, a process several companies in the language industry are working on for dubbing; that is, “teaching” a person’s voice to “speak” in another language. (See: synthetic dubbing)

It is not difficult to imagine how such technology could be misused. And yet, as Mark Leiser, an assistant professor at Leiden University's Center for Law and Digital Technologies, pointed out on Twitter, deepfakes "are not banned, but regulated for transparency" in the proposed framework.

Synthetic voices may not yet pose the same kinds of risks as visual deepfakes.

Jesse Shemen, CEO of Papercup, a startup offering "video translation with human-sounding voiceovers," recently explained to Slator that although the field has grown by leaps and bounds, "there is still a large amount of technical progress and change that needs to be made in the world of text-to-speech; and, as a result of that, the commercial applications are still incredibly early in terms of how speech can be exploited." (That said, the field's early stage has not stopped Apple from a hiring spree that may or may not be related to machine dubbing.)


It remains to be seen how, if at all, human involvement in the AI workflow will affect the new requirements.

If MT text must be labeled as such, would the label still apply after the text is post-edited by a human linguist? And how might that rule change as MT's capabilities expand and diversify? Amazon is already exploring automated quality checks for subtitles, and Lilt CEO Spence Green told Slator that his company is currently working on automated MT review.

The EC’s Vestager has described the need to address AI risks as “urgent.” But, by the time the European Council and Parliament approve the proposed legislation, both the AI landscape and the language industry may look very different.