Are Groq Language Processing Units the Fastest Option for Large Language Models?

Users of large language models (LLMs) want accurate output — and they want it fast. Low latency (i.e., minimal wait times for responses) is a major factor in user satisfaction and adoption. 

Naturally, it follows that the companies whose technology powers LLMs will aim for a reputation as the fastest around. 

And as of late, Groq seems to be winning the race for this title. The company bills itself as building “the world’s fastest AI inference technology.” What is behind the claim?

Groq has been around since 2016 but burst onto the scene in 2024 thanks to its language processing units (LPUs). 

LPUs are the latest in a succession of increasingly specialized chips that power LLMs. Central processing units (CPUs), the so-called “workhorse of computing,” excel at general-purpose sequential work but struggle with the massively parallel math that neural networks demand. 

Graphics processing units (GPUs), popularized by Groq competitor NVIDIA, are better suited to those parallel workloads, whether producing images for gaming or training and running neural networks. 

Groq’s LPUs narrow the focus further, targeting the sequential, token-by-token work of language inference. They are also smaller and less expensive than GPUs, so Groq promotes them as optimal for powering LLMs.

In fact, in November 2023, Groq reported that it had set a new performance standard of more than 300 tokens per second per user on Meta AI’s LLM, Llama 2 70B, running on Groq’s LPU system. 
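What does a figure like that mean in practice? Here is a minimal sketch of how one might verify a tokens-per-second claim against a streaming, OpenAI-compatible chat endpoint such as the one Groq offers. The base URL, model ID, and the one-token-per-chunk approximation are all illustrative assumptions, not details from Groq’s benchmark.

```python
import os
import time

from openai import OpenAI  # pip install openai

# Assumption: an OpenAI-compatible endpoint. Groq exposes one, but treat
# this base URL as illustrative and check the provider's current docs.
client = OpenAI(
    base_url="https://api.groq.com/openai/v1",
    api_key=os.environ["GROQ_API_KEY"],
)

MODEL = "llama2-70b-4096"  # hypothetical model ID; substitute what the endpoint serves

def tokens_per_second(prompt: str) -> float:
    """Rough throughput estimate: stream a completion and count chunks.

    Each streamed chunk usually carries about one token, so the chunk
    count is only an approximation of the true token count, and the
    elapsed time includes time-to-first-token.
    """
    start = time.perf_counter()
    chunks = 0
    stream = client.chat.completions.create(
        model=MODEL,
        messages=[{"role": "user", "content": prompt}],
        stream=True,
    )
    for chunk in stream:
        if chunk.choices and chunk.choices[0].delta.content:
            chunks += 1
    return chunks / (time.perf_counter() - start)

print(f"{tokens_per_second('Explain LPUs in one paragraph.'):.0f} tokens/sec")
```

At the reported rate of 300 tokens per second, a 500-token answer would stream in well under two seconds, which is the kind of wait time users actually notice.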

Beyond the marketing campaigns, the question remains: Is Groq really the fastest option for LLMs? Even though the November 2023 test was internal, it set a concrete benchmark for competitors to meet or surpass. Industry professionals outside of Groq have also chimed in.

Corporate communications expert Lulu Cheng Meservey observed on X, “Some people will judge Groq on the quality of the output, so the team should keep reminding people that the point isn’t to compare [Mistral] to Llama or whatever, the point is that each model runs faster on Groq than on other chips.”

Robert Scoble, a Silicon Valley veteran and former strategist for Microsoft, chose this headline for his May 2024 interview with Groq’s CEO: “The AI company that makes your AI run faster than OpenAI’s best.”

For now, Groq’s claim to fame as the fastest option for LLMs seems secure — but AI technology moves fast (pun intended), and Groq may soon need to defend its title against competitors, predecessors, and newcomers alike.