On June 20, 2023, voice-generating platform ElevenLabs announced it had raised USD 19m in a Series A round. The startup, which launched in beta in late January 2023, offers text-to-speech (TTS) and voice cloning software. Co-founder and CEO Mati Staniszewski told TechCrunch that the funds will be used to continue building a research hub for voice AI and to launch a range of additional, vertical-specific products for the publishing, gaming, and entertainment spheres, among others.
Vertical-specific, synthetic, and multilingual voice generation is becoming a crowded field in language AI, with companies like Voiseed building products for very specific use cases such as gaming.
Nat Friedman, Daniel Gross, and Andreessen Horowitz co-led the funding round, with participation from Credo Ventures, Concept Ventures, Creator Ventures, and SV Angel.
The roster of individual participants reads like a who’s-who in tech: Instagram co-founder Mike Krieger, Oculus VR co-founder Brendan Iribe, Ubiquity6 co-founder Anjney Midha, DeepMind and Inflection co-founder Mustafa Suleyman, Runway co-founder Siqi Chen, Inkitt co-founder Ali Albazaz, Reface co-founder Dima Shvets, Perplexity AI co-founder Aravind Srinivas, Vercel founder Guillermo Rauch, and O’Reilly Media founder Tim O’Reilly.
The scale of the funds raised is all the more impressive considering Staniszewski and now-CTO Piotr Dabkowski founded ElevenLabs a little over a year ago, in April 2022. Strategic investments from audiobook publisher Storytel and media platform TheSoul Publishing speak to ElevenLabs’ potential use cases.
TechCrunch reported that the company was valued at USD 99m post-money. In addition to the Series A round, ElevenLabs raised USD 2m in a pre-seed funding round led by Credo Ventures, with Concept Ventures and other individual investors also participating, bringing the total raised to date, in less than six months, to USD 21m.
According to ElevenLabs’ origin story, childhood friends Staniszewski, formerly of Palantir, and ex-Googler Dabkowski founded the company to create better voiceovers than those that accompanied the American films they grew up watching in Poland.
Logic and Emotions
“Our Text to Speech model is built to grasp the logic and emotions behind words. Instead of generating sentences one-by-one, it stays mindful of how each fragment ties to the surrounding context,” reads ElevenLabs’ beta webpage. “This zoomed-out perspective allows it to intonate longer fragments convincingly – and it works with any voice.”
On top of its “proprietary deep learning model,” ElevenLabs has built a speech synthesis tool, with which users can convert writing to professional-sounding audio. Eleven Multilingual currently supports English, Spanish, French, Hindi, Italian, German, Polish, and Portuguese, and the website states, somewhat mysteriously, that “new languages are to be added sequentially.”
ElevenLabs describes its other product, VoiceLab, as an “AI creative toolkit.” It includes a generative AI model to create synthetic speech as well as a voice cloning model that can learn “any speech profile from just a minute of audio.”
Later in 2023, ElevenLabs will introduce Publishers Projects, a workstation where users can create and edit long-form spoken content, from dialogue segments to full audiobooks.
Of course, ElevenLabs is far from the only company looking to capitalize on AI-powered voice tech. Competitors include Meta, whose AI branch introduced Voicebox, a new generative AI model for speech, on June 16, 2023.