Just How Big Is the Natural Language Processing Market?

Innovation in natural language processing (NLP) affects the language services sector in a number of important ways. Machine translation, computer-aided translation tools, predictive typing, and more all rely on advances in NLP.

NLP research dates back to the 1950s, with successive breakthroughs in the late 1980s and, more recently, with the arrival of truly big data. Ironically for a field that attempts to teach computers the meaning of language, there is no generally accepted definition for the term NLP itself, which, in turn, results in widely differing estimates of NLP’s market size.

Last year, MarketsandMarkets projected the global NLP market to reach USD 13.4bn by 2020. Early this year, Technavio said it expected the global NLP market to top USD 11bn by 2019.

But now along comes Tractica forecasting total revenues for the NLP sector to increase from USD 277m in 2015 to USD 2.1bn by 2024. When did the NLP sector become so tiny (or so big, gauging by the two earlier reports)?

The gap in figures appears to stem from how loosely (or narrowly) each firm defines what constitutes NLP, and which businesses actually drive it.

The copiously named MarketsandMarkets appears to have the widest, most segmented definition of what drives NLP growth. In their study, they divide the NLP market by type (rule-based, statistical, and hybrid NLP), by technology (recognition, operational, analytics), by service (professional, support, maintenance), by deployment (on-premise, on-demand), by application (MT, data extraction, report generation, question answering, etc.), by vertical (8 specific + 1 catchall “others”), and by region. Basically, they look at everything to make their projection.

MarketsandMarkets projected the NLP market to “grow to USD 13.4bn by 2020 at a CAGR of 18.4% for the forecast period 2015–2020.” Computing backward over that five-year period, the forecast implies a 2015 base of about USD 5.76bn in NLP revenues.

Technavio segments the market by end-user, but includes only five broad categories: healthcare, e-commerce, IT and telecom, BFSI (Banking, Financial Services, and Insurance), and others. Their NLP market forecast: USD 11bn by 2019 at a CAGR of over 16%. Compute back and that means the NLP market generates revenues today of around USD 6bn per Technavio, not too far from where MarketsandMarkets’ projection places current numbers, despite Technavio’s narrower definition of NLP.
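
For readers who want to check the "computing backward" step, here is a minimal sketch in Python. It simply discounts each forecast by its compound annual growth rate. The function name is illustrative, and two inputs are assumptions rather than figures from the reports: Technavio's forecast is taken to start in 2015 (four compounding years to 2019), and its "over 16%" CAGR is entered as exactly 16%.

```python
def implied_base(target_bn: float, cagr: float, years: int) -> float:
    """Back out the starting revenue implied by a forecast.

    A forecast of `target_bn` after `years` of growth at `cagr`
    implies a base of target / (1 + cagr) ** years.
    """
    return target_bn / (1.0 + cagr) ** years


# MarketsandMarkets: USD 13.4bn by 2020 at 18.4% over 2015-2020 (five years).
mm_base = implied_base(13.4, 0.184, 5)

# Technavio: USD 11bn by 2019 at "over 16%".
# Assumption: base year 2015 (four years) and a CAGR of exactly 16%.
technavio_base = implied_base(11.0, 0.16, 4)

print(f"MarketsandMarkets implied base: USD {mm_base:.2f}bn")
print(f"Technavio implied base:         USD {technavio_base:.2f}bn")
```

Because Technavio's rate is stated only as "over 16%", any higher rate would push its implied base slightly below the value computed here.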

Now, with Tractica pegging 2015 NLP revenues at a mere USD 277m, what they include as NLP drivers would probably be much narrower, right? Not so. According to the summary provided by founder Clint Wheelock, to whom Slator reached out for comment, Tractica defines NLP as “an umbrella term that can be applied to a diverse set of computer applications.”

In their forecast of a USD 2.1bn NLP market by 2024, Tractica considered several industry verticals (three where NLP is already a competitive advantage, six where it likely will be, and six not so much) then added total revenues from software, hardware, and services.

Wheelock disclosed that, although the new Tractica report “touches on machine translation several times as a key use case for NLP,” they have “not sized or forecast that portion of the market as a discrete segment.”

We asked NLP industry analyst and consultant Seth Grimes to comment on the wide disparity in market estimates.

According to Grimes, “Calculating the contribution of NLP, within the value or revenue of a larger product or service that applies NLP…is a challenge.”

Grimes explained that broadly applied tech, which includes NLP, may be “too new or too narrow to have drawn the full attention of the big-firm analysts.”

He also mentioned the issue of market maturity. Grimes said that while the business intelligence market, for instance, has been around for 30 years, the first significant commercial use of text analytics (which applies NLP) only began around 2004. He further pointed out that “newer, more specialized analytical technologies aren’t widely built into everyday business operations despite their ability to transform business-stakeholders’ interactions.”

So what should NLP growth projections exclude? Grimes said he would exclude “academic, government, and industrial research,” explaining that “work doesn’t contribute to a market valuation unless something, a product or service, is sold.”

He would also leave out “the sometimes-substantial value of work done for in-house use, for instance text analytics done by companies such as Thomson Reuters or Reed Elsevier in the course of creating information products.”

Grimes thinks that exclusion has gone a step too far at Tractica, however. He noted that the Tractica report includes personal assistants like Apple’s Siri, but pointed out: “There were 231 million iPhones sold in 2015. They sell for several hundred dollars each. If the value of the NLP within Siri is worth USD 0.25 for each unit sold, then you’ve already generated USD 56m in NLP value. But consider also that Siri and other systems use voice technology from Nuance, which makes speech, text, and imaging tech. NLP is at the heart of much of Nuance’s product set. Nuance’s 2015 revenues were USD 1.93bn.” Those numbers alone would exceed Tractica’s 2015 total of USD 277m by a factor of about seven.
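
The comparison can be reproduced with a few lines of Python. This is only a back-of-the-envelope sketch using the figures quoted above; the variable names are illustrative, and the USD 0.25 per-unit value is Grimes’ hypothetical, not a measured number. Note that the straight multiplication comes out slightly above the USD 56m quoted.

```python
# Figures as quoted in the article (all amounts in USD).
IPHONES_SOLD_2015 = 231e6        # units
NLP_VALUE_PER_UNIT = 0.25        # Grimes' hypothetical per-unit NLP value
NUANCE_2015_REVENUE = 1.93e9
TRACTICA_2015_NLP_TOTAL = 277e6

# Hypothetical NLP value embedded in Siri across all iPhones sold.
siri_nlp_value = IPHONES_SOLD_2015 * NLP_VALUE_PER_UNIT

# How many times larger than Tractica's 2015 total these figures are.
ratio_nuance_only = NUANCE_2015_REVENUE / TRACTICA_2015_NLP_TOTAL
ratio_combined = (siri_nlp_value + NUANCE_2015_REVENUE) / TRACTICA_2015_NLP_TOTAL

print(f"Siri NLP value:      USD {siri_nlp_value / 1e6:.2f}m")
print(f"Nuance vs. Tractica: {ratio_nuance_only:.2f}x")
print(f"Combined vs. Tractica: {ratio_combined:.2f}x")
```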

Grimes also questions Tractica’s selection of 20 key industry players, wondering “how Tractica came up with an estimate for companies such as Google (NLP is essential for search) and IBM (since NLP is a key Watson component).”

He described the list as “woefully incomplete,” highlighting that it includes “at least two companies that are among the smallest in the field, Aylien and Genee, as well as several smallish companies.”

He added that while Tractica’s list includes companies like BirdEye, which “are small and seem only peripherally involved in NLP,” it is missing companies like HP, Facebook, SAP, and more than 100 other companies that Grimes said he is tracking.

In a Slator article earlier this month, Grimes pointed out that “NLP is at the core of just-about everything Google and Baidu do and much of Facebook’s, IBM’s, Amazon’s, and Microsoft’s businesses.”

In the end, widely divergent market figures indicate a space that is far from mature. This, in turn, means innovation will continue at a fast pace.