How to Use AI in Translation

An increasing number of cloud-based translation management systems (TMS) and computer-assisted translation (CAT) providers are incorporating artificial intelligence (AI) into their platforms, particularly in the translation/machine translation (MT), editing, and proofing/QA workflows for all types of content.

Large language models (LLMs), which are a type of AI mostly developed with natural language datasets, can understand and generate human-like output. Most LLMs are also capable of translation, depending on how they were developed.

The open versions of LLMs that can translate, that is, generative AI chat interfaces such as ChatGPT, Bard, and HuggingChat, are great for non-commercial purposes but do not safeguard confidentiality.

When it comes to business applications, LLMs require customization or integration with other tools such as TMS and CAT. Used properly, LLMs can lead to great productivity gains at language services providers (LSPs) and internal localization departments. 

Keep in mind that LLMs differ in myriad ways, including the number of tokens (units of text or code) used to train them, which is indicative of their capacity; the languages they support; where their training data came from; the way they handle text inputs; and the quality of their output on tasks such as translation.

In this resource, we examine some of the practical ways in which LLMs can be applied to enhance translation productivity and management, from individual use cases to scalable professional translation workflows.

Translate Text-Based Content Directly

First, be aware that none of the options described in this article result in a final, 100% perfect translation. For that, an expert linguist must always review the LLM’s output (the “target” language) and ensure accuracy, integrity, and overall linguistic quality.

There are as many options as there are translation needs. If an individual just wants to understand text-based information in an unknown language (the “source” language), one of the easiest ways to get a good idea of what it says is to run it through “good old” MT, such as Google Translate. However, you are likely to get a better translation out of an LLM because the data used to create these AI tools is much more comprehensive than the data used to train an MT engine.

There are a few options to translate text, depending on the LLM used. One is an integration of the LLM into a user interface, such as adding OpenAI’s GPT-4 (available through the subscription-based ChatGPT Plus) to Google Workspace as an extension to translate documents and spreadsheets, as seen in the abbreviated setup sequence below.

Another option is to use the free chatbot version of the LLM and copy and paste text into a dialogue box, using meaningful prompts (instructions for the AI) to get better results (e.g., “Translate the text below into [language]. This is a [describe the subject matter of the text].”).

In the example pictured below, the prompt used was “Translate the following text in quotes from English into French. This is a set of instructions for use for a medical device. Translate using a 6th grade reading level,” which specified the subject matter and the target language reading level (the standard reading level for patient literature in the US). 
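The prompt pattern described above (language pair, subject matter, optional reading level) can be assembled programmatically when many texts need the same instructions. The following is a minimal sketch in Python; the function name `build_translation_prompt` and its parameters are hypothetical, chosen for illustration only, and the resulting string would still need to be pasted or sent to whichever LLM is in use.

```python
def build_translation_prompt(text, source_lang, target_lang, subject, reading_level=None):
    """Assemble a translation prompt stating the language pair, the subject
    matter, and (optionally) the target reading level.

    Hypothetical helper for illustration; not part of any LLM vendor's API.
    """
    parts = [
        f"Translate the following text in quotes from {source_lang} into {target_lang}.",
        f"This is {subject}.",
    ]
    if reading_level:
        parts.append(f"Translate using a {reading_level} reading level.")
    # The source text goes last, wrapped in quotes as the prompt announces.
    parts.append(f'"{text}"')
    return " ".join(parts)


prompt = build_translation_prompt(
    text="Press the power button for three seconds.",
    source_lang="English",
    target_lang="French",
    subject="a set of instructions for use for a medical device",
    reading_level="6th grade",
)
print(prompt)
```

Keeping the instructions in one template makes it easier to apply the same subject-matter and reading-level guidance consistently across a batch of texts.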

It is also possible to upload entire documents for translation into some cloud-based LLMs. In ChatGPT, you need to first enable the upload functionality using an extension or plugin, such as ChatGPT File Uploader Extended (for Chrome). The Plus version offers the functionality as a standard feature and generally produces target language output of editable quality.

Customize an AI-Enabled Translation Tool

A computer-assisted translation (CAT) tool works by aligning source and target language texts translated by a human or machine (or a combination of both, such as in Trados Studio, Phrase, and BWX, to name a few). An increasing number of CAT tools are also integrating AI into their functionality, including:

  • AI-enhanced machine translation
  • AI-enabled linguistic quality assurance
  • AI-enabled assistants (aka AI copilots)

For professional/business uses (by linguists or other translation professionals) and to preserve confidentiality, an AI-enabled CAT tool is a better alternative to straight [open] LLM translation. Some of the reasons are that AI-enabled CAT tools leverage previously approved translations from translation memory (TM) as well as valid terminology, when available; combine TM and human translations with one or multiple MT outputs; and allow users to customize workflows.
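To illustrate the idea of leveraging translation memory before falling back to machine output, here is a minimal sketch in Python. It is not how any particular CAT tool works internally: the sample memory, the 0.75 similarity threshold, the `fake_mt` stand-in, and all function names are assumptions made for this example, and the similarity score comes from the standard library's `difflib.SequenceMatcher` rather than a commercial fuzzy-match algorithm.

```python
from difflib import SequenceMatcher

# Hypothetical translation memory: approved source -> target segment pairs.
TRANSLATION_MEMORY = {
    "Press the power button.": "Appuyez sur le bouton d'alimentation.",
    "Remove the battery cover.": "Retirez le couvercle de la batterie.",
}


def best_tm_match(segment, tm):
    """Return (source, target, score) for the most similar TM entry, or None if the TM is empty."""
    best = None
    for source, target in tm.items():
        score = SequenceMatcher(None, segment.lower(), source.lower()).ratio()
        if best is None or score > best[2]:
            best = (source, target, score)
    return best


def translate_segment(segment, tm, mt_translate, threshold=0.75):
    """Reuse a TM translation when similarity meets the threshold; otherwise call MT.

    Returns (translation, origin), where origin is "tm-exact", "tm-fuzzy", or "mt".
    Fuzzy matches and MT output are both meant for human review afterwards.
    """
    match = best_tm_match(segment, tm)
    if match is not None and match[2] >= threshold:
        origin = "tm-exact" if match[2] == 1.0 else "tm-fuzzy"
        return match[1], origin
    return mt_translate(segment), "mt"


def fake_mt(segment):
    """Stand-in for a real MT or LLM call (assumption for this sketch)."""
    return f"[MT] {segment}"


for seg in ["Press the power button.", "Press the power button now.", "Warranty: 24 months."]:
    print(seg, "->", translate_segment(seg, TRANSLATION_MEMORY, fake_mt))
```

Tagging each result with its origin mirrors the way a linguist would want to know whether a segment came from approved memory or from a machine, since the two call for different levels of scrutiny.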

For example, in BWX, users can choose to apply translation memories and term banks, and to set up one or multiple language workflow steps (e.g., translation, review, proofing, etc.). The Editor window in BWX shows the source and target languages side by side, as well as a pop-up menu showing the AI-enabled functions available during the linguistic phases of a translation project.

In the Phrase Localization Suite, commercial users can use Phrase Custom AI to train their own language models and adapt their machine translation engines for different purposes (called “fine-tuning”) and their terminology to obtain more contextually precise or domain-specific translation results.

Phrase Language AI, the CAT tool in the localization suite, uses AI to automatically leverage, then improve TM and managed-MT translation outputs. It also has an AI-only translation output for human post-editing. For translation project workflow management, the TMS platform allows customization of different steps and linguistic assets.

The AI options in the Phrase TMS platform appear on a pop-up menu similar to those seen in customer service bots.

Use AI-Enabled Dubbing and Subtitling

AI-enabled media localization is fast becoming a preferred choice for non-cinematic productions such as training and marketing videos, as evidenced by the now standard availability of tools like Aloud on YouTube for content creators. 

An increasing number of startups are offering easy and affordable access to AI-enabled dubbing, lip-syncing, and subtitling. Because generative AI models improve as their training data improves, the technology just keeps getting better, and many viral multilingual videos made by regular people are now landing on people’s devices.

Dubbing offerings like those from Rask.ai, ElevenLabs, and many more are making this possible and are creating new opportunities for companies to reach new audiences in markets where pricing and complexity used to be barriers to entry.

AI-enabled subtitling integrates speech recognition and MT, and has greatly improved as MT itself improved. Providers like AI-Media have brought a great deal of automation to the entire process and offer live text translation. This technology can be useful for live online events, such as multinational company meetings or shareholder broadcasts.
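The combination of speech recognition and MT described above can be pictured as a small pipeline: timed transcript segments go in, each segment's text is translated, and the result is written out as subtitles. The Python sketch below assumes the speech recognition step has already produced `(start, end, text)` tuples and uses a dictionary lookup as a stand-in for a real MT call; both are assumptions for illustration and do not reflect any vendor's implementation. The output follows the widely used SubRip (.srt) layout.

```python
def srt_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def build_srt(segments, translate):
    """Translate each timed segment and join the results as SRT subtitle blocks."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{translate(text)}\n"
        )
    # A blank line separates consecutive subtitle blocks.
    return "\n".join(blocks)


# Hypothetical speech recognition output: (start_seconds, end_seconds, text).
asr_segments = [
    (0.0, 2.5, "Welcome to the meeting."),
    (2.5, 5.0, "Let's review the results."),
]

# Stand-in for an MT engine (assumption): a fixed lookup with pass-through fallback.
sample_translations = {
    "Welcome to the meeting.": "Bienvenue à la réunion.",
    "Let's review the results.": "Passons en revue les résultats.",
}


def fake_translate(text):
    return sample_translations.get(text, text)


srt_text = build_srt(asr_segments, fake_translate)
print(srt_text)
```

In a live setting the same steps would run segment by segment as audio arrives, which is why improvements in either speech recognition or MT carry straight through to subtitle quality.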

—  

For more information about incorporating AI into your translation processes, see Slator’s Translation AI Pro Guide, where many more actionable use cases for AI in translation are discussed in detail.