Natural Language Processing (NLP), Natural Language Understanding (NLU), and Natural Language Generation (NLG) all fall under the umbrella of artificial intelligence (AI).
NLP is a branch of AI that allows more natural human-to-computer communication by linking human and machine language.
NLU is a subcategory of NLP that processes input data to make sense of natural language sentences. NLG is another subcategory of NLP, which builds sentences and creates text responses that humans can understand.
The terms might look like alphabet spaghetti but each is a separate concept. In fact, NLP includes NLU and NLG concepts to achieve human-like processing.
NLP considers how computers can process and analyze vast amounts of natural language data, and how they can understand and communicate with humans. The latest boom, since 2010, has been driven by the popularity of representation learning and deep neural network-style machine learning methods, which have been shown to achieve state-of-the-art results on many natural language tasks.
NLP tasks include optical character recognition, speech recognition, speech segmentation, text-to-speech, and word segmentation. Higher-level NLP applications are text summarization, machine translation (MT), NLU, NLG, question answering, and text-to-image generation. Recent groundbreaking tools such as ChatGPT use NLP to store information and provide detailed answers.
Applications for NLP are diversifying, with hopes of applying large language models (LLMs) beyond pure NLP tasks (see 2022 State of AI Report). Felix Laumann, CEO of NeuralSpace, told SlatorPod of his hopes for the coming years: voice-to-voice live translation, high-performance NLP on tiny devices (e.g., car computers), and auto-NLP.
There has been no drop-off in research intensity, as demonstrated by the 93 language experts, 54 of whom work in NLP or AI, who were ranked in the top 100,000 most-cited scientists in Elsevier BV’s updated author-citation dataset. Here are some of the best NLP papers from the Association for Computational Linguistics 2022 conference.
NLU can understand and process the meaning of speech or text in a natural language. To do so, NLU systems need a lexicon of the language, grammar rules, a semantic theory, and a parser, which is a software component that takes input data and builds a data structure from it.
NLU’s core functions are understanding unstructured data and converting text into a structured data set which a machine can more easily consume. Applications vary from relatively simple tasks like short commands for robots to MT, question-answering, news-gathering, and voice activation.
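To make the idea of converting text into structured data concrete, here is a minimal sketch of a rule-based parser for short robot commands. Everything in it is an assumption made for illustration: the vocabulary in `LEXICON`, the rules in `GRAMMAR`, and the `parse_command` function are hypothetical and far simpler than a production NLU system, which would typically rely on statistical or neural models.

```python
# Toy lexicon (hypothetical): maps surface words to a (slot, value) pair.
LEXICON = {
    "move": ("intent", "move"),
    "go": ("intent", "move"),
    "turn": ("intent", "turn"),
    "rotate": ("intent", "turn"),
    "forward": ("direction", "forward"),
    "back": ("direction", "backward"),
    "backward": ("direction", "backward"),
    "left": ("direction", "left"),
    "right": ("direction", "right"),
}

# Toy grammar (hypothetical): the directions each intent accepts.
GRAMMAR = {
    "move": {"forward", "backward"},
    "turn": {"left", "right"},
}


def parse_command(text):
    """Convert a short command into a structured dict, or raise ValueError."""
    frame = {"intent": None, "direction": None, "amount": None}
    for token in text.lower().split():
        token = token.strip(".,!?")
        if token.isdigit():
            frame["amount"] = int(token)
        elif token in LEXICON:
            slot, value = LEXICON[token]
            # Keep the first value seen for each slot.
            if frame[slot] is None:
                frame[slot] = value
        # Words outside the lexicon (e.g. "please", "steps") are ignored.
    if frame["intent"] is None:
        raise ValueError(f"no known intent in: {text!r}")
    if frame["direction"] not in GRAMMAR[frame["intent"]]:
        raise ValueError(f"direction does not fit intent in: {text!r}")
    return frame


if __name__ == "__main__":
    print(parse_command("Please move forward 3 steps."))
```

The lexicon supplies word meanings, the grammar rejects combinations that make no sense (such as "turn forward"), and the returned dictionary is the structured representation a machine can act on.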
Continuous research and funding in NLU can be found from international organizations, startups, and big tech alike.
- The European Union’s European Language Equality project (ELE) aims to achieve deep NLU by 2030.
- The European Commission announced a EUR 20m tender for NLU in December 2022.
- NLP software startup NuMind raised USD 3m in seed funding in March 2023.
- Amazon announced developments towards achieving massively multilingual NLU in May 2022, with their focus being on NLU as a component of spoken-language understanding whereby audio is converted into text before NLU is conducted.
- Google’s PRESTO is a dataset of over 550K multilingual conversations between humans and virtual assistants in six languages, including the same challenges faced in real-world NLU tasks, e.g., disfluencies, code-switching, and user revisions. Uniquely, it includes only conversations by native speakers.
NLG is a software process that turns structured data, a (generally) non-linguistic representation of information such as the output of NLU, into natural language that humans can understand, usually in text format.
NLU systems must disambiguate the input sentence to produce the machine representation, and may face ambiguous or erroneous input. NLG systems, by contrast, decide how to put a representation into words, and the ideas they need to express are usually known exactly.
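The reverse direction can be sketched just as briefly. The example below is a minimal, template-based illustration of turning a structured record into a sentence, in the spirit of an automated weather report. The record fields (`city`, `high_c`, `rain_chance`), the temperature thresholds, and both function names are assumptions chosen for illustration, not a description of how any particular NLG product works.

```python
def describe_temperature(celsius):
    """Pick a word for the temperature (thresholds are arbitrary assumptions)."""
    if celsius < 5:
        return "cold"
    if celsius < 18:
        return "mild"
    return "warm"


def generate_weather_report(record):
    """Turn a structured weather record into one English sentence."""
    high = record["high_c"]
    sentence = (
        f"{record['city']} will be {describe_temperature(high)} today, "
        f"with a high of {high} degrees Celsius"
    )
    # The input is known exactly; the only decision is how to word it.
    rain = record.get("rain_chance", 0.0)
    if rain >= 0.5:
        sentence += f" and a {round(rain * 100)}% chance of rain"
    else:
        sentence += " and little chance of rain"
    return sentence + "."


if __name__ == "__main__":
    print(generate_weather_report({"city": "Zurich", "high_c": 12, "rain_chance": 0.7}))
```

Nothing in the input needs disambiguating here. The system's work lies in word choice and sentence construction, which is the contrast with NLU drawn above.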
Common NLG applications include producing weather reports, patient reports, image captions, chatbots, and, more recently, AI writing tools. For these writing systems, users “brief” the tool with key inputs and a written text is generated. Copy AI is one NLG startup commercializing AI for writing; it claims its system can write blogs ten times faster, create higher-converting texts, and compose emails, product descriptions, sales and website copy, etc.
Slator explored whether AI writing tools are a threat to LSPs and translators. It’s possible that AI-written copy will simply be machine-translated and post-edited, or that the translation stage will be eliminated completely thanks to the tools’ multilingual capabilities.
However, these are products, not services, and are currently marketed not to replace writers but to assist them, provide inspiration, and enable the creation of multilingual copy.