A large database of multilingual speakers, all plugged into a cloud-based platform. Sound familiar? LSPs long ago mastered the art of managing a huge, globally distributed workforce and leveraging independent workers to deliver content to end-customers across multiple sectors.
The platform bit is more recent, granted. But many of today’s LSPs have developed or embedded platform-based workflows to automate their operations, at least partially.
Therein lies most of the mystery of the data-for-AI market: crowd management and platform-based delivery. Of course, that may be an oversimplification. For one thing, customer contacts for data services are not the same people as those who run translation and localization procurement. But the expertise that LSPs have acquired in managing thousands of distributed freelance workers puts them in an extremely competitive position to pivot to the provision of data-for-AI services.
Why would one bother to do so? Data-for-AI is a fledgling but thriving market, and one which is projected to grow to USD 3.5bn by 2024 from ca. USD 1.5bn in 2019. The most striking LSP success story in the data-for-AI market is Lionbridge, which just managed to sell its AI division for a cool USD 935m. What is more, many companies in the data-for-AI space have held up better than the rest in the face of disruption from the Covid-19 pandemic.
Slator spoke to half a dozen experts from LSPs that have already made the transition. From these interviews, we compiled insights into why workforce management is so essential to data-for-AI services and might just be the golden ticket for LSPs to enter the high-growth niche. Here are some snippets from what they told us.
The premise is this: The initial training of AI models is dependent on many hours of human labor. Underpinning data-for-AI services are massive distributed workforces of humans performing myriad tasks, such as:
- recording their own conversations
- ranking search queries
- labeling images
- attaching sentiments to tweets
- transcribing hours of recorded speech
- Creating a large and capable workforce is one of the key challenges in the industry.
TransPerfect VP Global Marketing Ryan Simper told Slator, “Scalability is one of the most important elements in this business, and recruitment is critical to success. It’s probably the single biggest barrier to new players joining the space.”
- Ineffective vetting, recruitment, and people management can yield low-quality data.
While self-service options, such as Amazon Mechanical Turk, offer a massive crowd, building a trained, managed, and high-performing workforce is a different matter. Many AI initiatives fail to result in the desired outcomes due to poor model performance, with one of the underlying culprits being poor quality data. Major buyers seek providers that deliver both volume and quality.
- Automation in the recruitment process is one of the keys to meeting this challenge.
According to Welocalize VP Corporate Development Tuyen Ho, “The challenge is that we are recruiting at a super high scale. How can you recruit, test and analyze throughout the recruitment process? Our solution is to apply a lot of automation, so that for a typical recruitment process, if you needed to identify, say 5,000 vetted candidates, you will initially recruit anywhere from 100,000 or more and move those candidates through the recruitment funnel, including automated testing, to find your 5,000 suitable candidates.”
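The funnel arithmetic in Ho's example can be sketched in a few lines of Python. This is only an illustration: the 100,000 applicants and 5,000 vetted candidates come from the quote, while the stage names and per-stage pass rates are hypothetical values chosen so that the numbers line up, not figures from Welocalize.

```python
def funnel_counts(applicants: int, pass_rates_pct: list[int]) -> list[int]:
    """Return how many candidates remain after each funnel stage.

    pass_rates_pct holds whole-number percentages (e.g. 50 means 50%),
    so the arithmetic stays in integers and avoids rounding surprises.
    """
    remaining = applicants
    counts = []
    for pct in pass_rates_pct:
        if not 0 <= pct <= 100:
            raise ValueError("pass rate must be between 0 and 100")
        remaining = remaining * pct // 100  # candidates who pass this stage
        counts.append(remaining)
    return counts


def overall_yield(applicants: int, vetted: int) -> float:
    """Share of initial applicants who end up as vetted candidates."""
    return vetted / applicants


if __name__ == "__main__":
    # Hypothetical stages and pass rates (assumptions, not source data):
    # screening 50%, automated testing 40%, final vetting 25%.
    stages = ["screening", "automated testing", "final vetting"]
    rates = [50, 40, 25]
    for stage, count in zip(stages, funnel_counts(100_000, rates)):
        print(f"after {stage}: {count} candidates")
    print(f"overall yield: {overall_yield(100_000, 5_000):.0%}")
```

Whatever the real stage-by-stage rates are, the figures in the quote imply an overall yield of about 5%, or roughly 20 applicants recruited for every vetted candidate, which is why automating the testing steps matters at this scale.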
- Location of workers is a big consideration, influenced by pay rates and client needs.
In terms of the geographical location of workers, Summa Linguae CEO Krzysztof Zdanowski told Slator that Asia is the company's preferred location because some tasks are low-paid. However, geography also depends on the nature of the task. For speech collection, for example, it will be determined by where native speakers are located. “These are low-paid tasks. You’re paying micropayments to people, so probably people in Europe or in the US will not be very likely to go out on the streets and take pictures for a few bucks each or a few bucks for a picture set. Whereas, for people in India, that would be a significant income for them sometimes,” Zdanowski said.
He added: “However, variation is very important for the clients. Some tasks require a variety of disciplines from different geographies. So, there’s a general preference for Asia because it’s cheaper, but client-driven too.”
- Retention strategies help reduce churn and manage the performance of workers.
Following recruitment, companies put in place various training and retention strategies to reduce churn and manage performance. These may include the following:
- ensuring development and career path options are available to workers
- ensuring pay rates are in line with liveable wages
- curating and maintaining teams dedicated to specific long-term clients
- building and maintaining relationships across the workforce to ensure connection across the remote global workforce
TransPerfect’s Simper told Slator, “Our aggressive resource-building strategy is based on providing a compelling career option for people interested in work that affords them the flexibility to choose their hours and work from home.”
- Connecting worker tasks with end-goal AI use-cases can aid workforce performance.
CloudFactory builds into its management philosophy the link between worker performance and a connection to the end goal of the AI system. In an interview with Infinia ML, CloudFactory’s Vice President of Client Success said, “We see how important the role humans take in the overall solutioning is, and when you’ve got humans involved in a process you need to actually think about human psychology and behavior. So we’ve seen success the more you connect the individual people doing the work to the bigger picture. For example, if you’re drawing a bounding box, it’s really important to draw that tightly, because if you’re training an autonomous vehicle, that is going to make the car safer.”
Welocalize’s Ho told Slator that data service companies are still actively working on these types of logistical and management challenges, which characterize the industry. “It’s still a challenge. How do we deliver data services at a business speed, scale, efficiency, and consistency, to harness the power of humans across 50-plus languages, humans who could be handling tens of thousands of tasks on any given day? That’s a complex logistical problem to solve at scale,” she said.
For a deep-dive into this highly attractive niche, download Slator’s full, 44-page report on the data-for-AI market. The report guides LSPs on how to enter and scale in the fast-growing market of creating, collecting, and annotating data for artificial intelligence applications. It features five case studies of LSPs that serve data-for-AI customers, a description of relevant services, and more.