AI Video Startup Captions Valued at USD 500M in USD 60M Series C

AI video editing startup Captions announced on July 9, 2024, that it had raised USD 60m in a Series C funding round, bringing its total funds raised thus far to USD 100m. The company’s valuation was also pegged at USD 500m.

The lead investor was Index Ventures, which also backed DeepL’s most recent funding round; that round raised USD 300m and valued the company at USD 2bn.

In a tweet announcing the news, Captions thanked returning investors Kleiner Perkins, Andreessen Horowitz, and Sequoia Capital, and new investors Adobe Ventures, HubSpot Ventures, and actor Jared Leto.

According to Captions, over the past year, the company has seen its New York-headquartered team grow from 15 members to 60; customers now create over 3m videos monthly; and the app currently has more than 10m global creators.

The Captions platform allows users to create, edit, and distribute videos; more specifically, it lowers the barriers to video creation for non-experts. The app is available in desktop, web, and mobile formats, and has two pricing levels, Pro and Max.

The startup’s specialty is videos featuring people speaking. Users can type text, which can then be spoken by a photorealistic avatar; the software can even create a script if a user just suggests a topic. Here, Captions competes directly with UK-based Synthesia, which in mid-2023 raised a USD 90m Series C at a unicorn valuation and has deployed the funds to engineer a full platform relaunch announced in June 2024.

Captions provides translations for subtitles and dubs in 28 languages and uses AI to make translated voices sound like the original. Among the supported languages is Hinglish, a mix of Hindi and English often used by young people in India, an indication of the rising importance of India’s creator scene.

Captions openly integrates third-party AI models into its workflow. Large language models (LLMs) from OpenAI and Anthropic, for example, are used for text generation; software from voice cloning startup ElevenLabs brings voices to life.

Co-founders Gaurav Misra and Dwight Churchill established Captions in 2021. Now-CEO Misra previously spent five years as Head of Design Engineering for Snap Inc., the tech company formerly known as Snapchat, and was a software development engineer at Microsoft during the launch of Azure.

Churchill, Captions’ COO, is also a developer. His experience includes roles at Goldman Sachs and, more recently, Klaviyo.

While a number of the prominent investors in this funding round are based in Silicon Valley, Captions has publicized its intentions to stay in New York.

“Looking to the future, we’re excited to share our plans to invest $100 million into advancing generative video research, here in New York City,” Misra wrote in a blog post. “We believe that New York is emerging as the epicenter for AI research and look forward to building our world-class team here.”

At the time of publication, Captions has openings for two product designers; seven software engineers; two performance marketers; two “technical sourcers”; plus an applied scientist, an Android engineer, and a machine learning scientist.