Groq’s New Type of AI Provides Answers Almost Instantly

Written by Mike Kaput | Feb 27, 2024 3:01:09 PM

An AI startup named Groq (with a Q) is going viral after showing off an AI system with breathtaking speed.

It all started when Matt Shumer of AI company HyperWrite posted about the tool on X:

Wild tech you have to try: https://t.co/IddQqtQnvV

They are serving Mixtral at nearly 500 tok/s.

Answers are pretty much instantaneous.

Opens up new use-cases, and completely changes the UX possibilities of existing ones.
— Matt Shumer (@mattshumer_) February 18, 2024

Our tests confirm:

It's insanely fast. Like “near-instantaneous answers” fast. In fact, Groq massively outpaces ChatGPT on speed, and one report suggests it's 13 times faster than OpenAI’s popular chatbot.
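To put those numbers in perspective, here is a rough back-of-the-envelope sketch of how generation speed turns into waiting time. It takes the "nearly 500 tok/s" figure from the post above, and it assumes "13 times faster" means 13 times the token throughput, which the report may not mean exactly. The 300-token answer length is a hypothetical value chosen only for illustration.

```python
def seconds_to_generate(num_tokens: int, tokens_per_second: float) -> float:
    """Return how long it takes to stream num_tokens at a steady rate."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return num_tokens / tokens_per_second


ANSWER_TOKENS = 300          # hypothetical answer length, for illustration only
GROQ_TOKENS_PER_SEC = 500.0  # "nearly 500 tok/s" from the post quoted above
SPEEDUP = 13.0               # "13 times faster", assumed to be a throughput ratio

fast = seconds_to_generate(ANSWER_TOKENS, GROQ_TOKENS_PER_SEC)
slow = seconds_to_generate(ANSWER_TOKENS, GROQ_TOKENS_PER_SEC / SPEEDUP)

print(f"{fast:.1f} s vs {slow:.1f} s")  # prints "0.6 s vs 7.8 s"
```

Under these assumptions, the same answer arrives in well under a second instead of taking several seconds, which is the difference between a tool that feels instant and one that makes you wait.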

What's going on here? How on Earth did a little-known startup just shake up what we thought was possible in AI?

I got the answer on Episode 85 of The Artificial Intelligence Show from Marketing AI Institute founder/CEO Paul Roetzer.

A new type of AI chip

Groq is so fast because it uses a new kind of chip the company developed.

These chips are called Language Processing Units (LPUs). LPUs are built from the ground up for AI, unlike the GPUs that NVIDIA sells, which were initially designed for graphics-heavy applications like video games.

These chips run popular models like Meta's Llama 2 or Mistral AI's Mixtral, and their unique design delivers almost instantaneous results.

The reason everyone's so excited?

That type of speed opens up a whole new world of AI use cases in businesses and consumer-facing services.

Speed is everything when building consumer-facing AI applications. Even the smallest delay in answers from a large language model (LLM) can impact the usability of AI tools in commercial applications.

We already know we can deliver solid results with LLMs if they're tuned properly. But, until now, we haven't been able to deliver those results at the speed required to use LLMs as widely as possible.

Now, that might be changing.

Groq vs. NVIDIA

Make no mistake: Groq's chips matter. But the company isn't knocking NVIDIA off its perch as the dominant chip-maker in AI anytime soon, says Roetzer.

While customers may prefer the speed of Groq's chips over NVIDIA's, Groq is still a minnow compared to NVIDIA's whale.

Groq is on track to deploy 42,000 chips this year and claims it will deploy 1 million by 2025. NVIDIA, in contrast, aims to produce 2 million chips in 2024 alone.
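A quick sketch of the arithmetic behind that gap, using only the figures quoted above. Note that "deploy" and "produce" are not quite the same measure, so treat the ratios as rough orders of magnitude and not a precise comparison.

```python
# Figures quoted in the article; variable names are illustrative.
GROQ_CHIPS_2024 = 42_000          # chips Groq is on track to deploy this year
GROQ_CHIPS_2025_CLAIM = 1_000_000 # chips Groq claims it will deploy by 2025
NVIDIA_CHIPS_2024 = 2_000_000     # chips NVIDIA aims to produce in 2024

# How many NVIDIA chips per Groq chip in 2024.
nvidia_to_groq = NVIDIA_CHIPS_2024 / GROQ_CHIPS_2024

# How much Groq would need to scale up to hit its own claim.
groq_scale_up = GROQ_CHIPS_2025_CLAIM / GROQ_CHIPS_2024

print(f"NVIDIA vs. Groq in 2024: about {nvidia_to_groq:.0f}x")
print(f"Groq's claimed scale-up: about {groq_scale_up:.0f}x")
```

By these numbers, NVIDIA's 2024 output is roughly 48 times Groq's, and Groq would have to grow its deployments about 24-fold to reach its stated target.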

"It's not like all of a sudden they're going to show up and just take all the market share," says Roetzer. "But it is very much an amazing phase of innovation in AI, where no business seems safe."

Even the minnows are giving the whales a run for their money.

The future of business is AI, or obsolete

We’ll see how Groq ends up shaping the future of AI. But the real takeaway here is bigger than just faster LLMs, says Roetzer.

You need to understand that the future of every business is AI, or obsolete.

Moving forward, every single business on the planet will fall into one of three categories:

AI Native. Companies built from scratch with AI at the core of the product / service, and likely deeply integrated into marketing, sales, service, and operations.
AI Emergent. Established organizations that move quickly to adopt and scale AI across all areas of the organization.
Obsolete. Companies that wait for the business world to get smarter around them, and resist AI-driven change. These companies eventually lose relevance and fade away.

Becoming AI Native or AI Emergent first requires understanding that every business in every industry has the opportunity to disrupt and faces the risk of being disrupted, says Roetzer.

"I don't care what company you're in and what industry it is. You have to assume someone is going to build a smarter version of your company. And it's way better to be the one that does that yourself."
