Nvidia (NASDAQ:NVDA) may be about to remind the AI world who still runs the show. Reports circulating this week suggest the company is preparing a new chip specifically targeting the inference market, and Jim Cramer took notice on his program.
Here’s what Cramer said last night:
“Nvidia is about to announce a new chip that can compete with pesky offerings from competitors. Several companies, including long-standing Nvidia customers, have been bragging about how they make a particular kind of AI semiconductor: ones for inference. And Nvidia, well, it looks like they may have something against that that could be better.”
Nvidia shares rallied nearly 3% on the news. They’ve given up part of those gains today, but that’s amid a broad market sell-off that saw the Dow down more than 1,000 points in early trading. Let’s look at what NVIDIA’s new inference chip is and how it could change the battle between the company and rivals like Broadcom (Nasdaq: AVGO) and Alphabet (Nasdaq: GOOG)(Nasdaq: GOOGL).
Why Inference Is the New Battleground
Training is the expensive, power-hungry process of building an AI model. Inference is what happens every time someone uses that model: every query, every response, every decision. As AI moves from labs into products, inference volume explodes.
Nvidia has long dominated AI training hardware, but inference favors efficiency over raw throughput, which is exactly why competitors have found an opening. Companies like Broadcom have argued that NVIDIA’s GPUs aren’t specialized enough for inference and will soon prove to be too expensive.
Recently, NVIDIA purchased the IP and most of the employees from startup Groq.
The specifics of Groq’s technology are fairly technical, but here’s the key idea: Groq has been working on an entirely different architecture from NVIDIA’s GPUs.
In short, Groq relies on a compiler that pre-plans every operation. Instead of coordinating expensive high-bandwidth memory at runtime, Groq’s chips simply execute a fixed schedule out of fast on-chip SRAM.
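To make that compile-then-execute idea concrete, here’s a toy sketch in Python. This is purely illustrative: the function names and the schedule format are invented for this example and have nothing to do with Groq’s actual toolchain. The point is that all scheduling decisions happen in the “compiler” step, so the “chip” step just replays a fixed list with no runtime coordination.

```python
# Toy illustration of the compile-then-execute pattern (not Groq's real stack).

def compile_schedule(ops):
    """'Compiler' pass: fix the exact order of operations ahead of time.
    On real hardware this is where data movement and timing would be
    pre-planned, so nothing is decided at runtime."""
    return list(ops)  # here the "schedule" is just a frozen op list

def execute(schedule, x):
    """'Chip' side: blindly replay the fixed schedule step by step.
    A local dict stands in for on-chip SRAM; no external memory is consulted."""
    sram = {"acc": x}
    for op, operand in schedule:
        if op == "mul":
            sram["acc"] *= operand
        elif op == "add":
            sram["acc"] += operand
    return sram["acc"]

schedule = compile_schedule([("mul", 3), ("add", 1)])
print(execute(schedule, 4))  # 4 * 3 + 1 = 13
```

The trade-off the article describes falls out of this structure: the executor is simple and fast precisely because the schedule is rigid, which is why keeping many chips in lockstep becomes the hard part.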
The downside to this architecture is that pre-compiling is hard. The chips must be perfectly synchronized, which is an enormous engineering challenge. NVIDIA has presented innovations at recent conferences that strongly hint the company has found a way to synchronize Groq’s chips.
So the market is putting two and two together. NVIDIA shelled out $20 billion for Groq and plans to use its ‘clock-forwarded die-to-die links’ to commercialize Groq’s architecture. NVIDIA is expected to announce this new chip built for inference workloads at GTC, which takes place later this month.
What This Means for NVIDIA’s Competitors
I know the above portion was fairly technical, but here are the key takeaways.
- This isn’t for training: Groq’s architecture is built for inference, not for training models. That gives NVIDIA a potential strategic ‘masterstroke’ because it would let the company bypass supply-constrained markets like high-bandwidth memory. In short, if NVIDIA releases a new inference system based on Groq’s technology, it could surpass Wall Street targets in the coming years by sidestepping those supply constraints.
- Blunts competitive threats: As noted earlier, the argument from companies like Broadcom and Marvell has been that NVIDIA’s chips carry too many ‘jack of all trades’ features to stay competitive in inference over time, and that custom chips will have better economics. A Groq-based inference chip would let NVIDIA attack those complaints head-on.
We’ll have to see exactly what NVIDIA announces at GTC, but it does seem the company may have pulled off another brilliant move in purchasing Groq. Right now Jim Cramer is talking about the company’s chips, but after GTC, expect talk of NVIDIA’s new chips to be all over the financial world.