Intel Gaudi Performance – Beats NVIDIA?

The Six Five team discusses Intel Gaudi Performance – Beats NVIDIA?

If you are interested in watching the full episode you can check it out here.

Disclaimer: The Six Five Webcast is for information and entertainment purposes only. Over the course of this webcast, we may talk about companies that are publicly traded and we may even reference that fact and their equity share price, but please do not take anything that we say as a recommendation about what you should do with your investment dollars. We are not investment advisors and we ask that you do not treat us as such.

Transcript:

Daniel Newman: All right, man. Listen, I tweeted something out the other night. I think this is probably where this topic came from, and I kind of said something along the lines of, “We don’t talk about Gaudi enough.” The last several months there’s been kind of this weird gap that’s been created. We talk about NVIDIA, the H100, now the B Series and the Grace Blackwell, and then we talk about homegrown silicon being provided by the cloud providers, and it’s NVIDIA and AMD. It’s like NVIDIA has AMD; they’re looking at them and that’s the competition. And then over here we’re looking at… But we do talk a lot about accelerators. You actually had a great tweet this week about ASICs and the need to create standards so that we can scale the development in that particular area, Pat. But one of the things that we haven’t talked a lot about is Intel and whether or not… I know we talk about 2025 and their potential GPU, but Pat, we’ve talked a lot on the show about how ASICs and even the XPU can be very competitive in certain cases to NVIDIA.

And this week Intel put out a newsroom post. This probably isn’t like a 20 minute discussion, but it’s a few minutes here. And they basically talked about MLCommons putting out new results for the industry standard MLPerf benchmark for inference. And it basically noted that the Gaudi2 and fifth gen Intel with their AMX, which is the accelerated extensions essentially, can be a very good alternative to H100s for generative performance as it relates to inference, Pat. And I guess I just was thinking to myself when I’m looking at this, “Gosh, why does nobody talk about Intel? Why is Intel being written off?”

Now, of course, I can give you a quick argument of that because they haven’t talked about enough big cloud wins yet. And I think the fact that we’ve heard about Gaudi, we’re seeing its performance. By the way, this is a really strong performance with their Gaudi2 and guess what’s coming?

Patrick Moorhead: Oh gosh, Gaudi3.

Daniel Newman: Gaudi3.5. No, I’m kidding. That’s GPT. Gaudi3. So the point is, with their almost last generation, you know how we love to do the generations thing, Pat, we love to talk about, “Well gosh, NVIDIA’s chip that isn’t even shipping yet is kicking AMD’s butt.” Well, hold on a second. H100s were outperformed in many ways by the new AMDs. And now yes, NVIDIA’s answered that with a product that’s going to ship in the future, but same thing here. So now we have an Intel product that’s coming that’s more performant in certain inference cases than the NVIDIA chip. Now, that said, Pat, you and I, I think, have to be very, very clear because we know a lot of people in the chip space listen to us: this is not a GPU, it does not have flexibility and programmability like a GPU, but in cases where inference in language is super important, this is a really efficient performance alternative with strong specs, strong metrics. And they talked about it on LLaMA, on Stable Diffusion, on Hugging Face text generation, so on a number of different workloads, this particular chip performed.

So the moral of my story is the world loves to write off Intel, and I’m sure Pat Gelsinger loves what he calls the permabears. I just think between now, the Gaudi3, and then ’25 when they start to deliver their GPUs, if there really is a 250 and upwards of potentially $400 billion TAM for GPUs over the next four years, five years, is what we’re hearing, I think there’s a real shot Intel is going to get a piece of that business. And I know I’m a little too positive on Intel, I hear it sometimes from people. But people like to always tell me why they’re right and I like to mark it as this date, 3/29/2024, when I told them I think they’re wrong.

Patrick Moorhead: Wow, you left me a little oxygen. Let me take a little bit of a difference. So first off, the claim was not that it was better performance with Gaudi2, it was that it was best price performance and it’s 40% more. And when I stand back and say, “Hey, would I shift for 40%?” I probably wouldn’t if I needed three years of different types of models, but if it’s a steady state workload, 40% is a ton. The one thing that got a little bit buried in the lede was that Intel Xeon was the only processor tested, or SOC tested, with, like you said, AMX extensions, and think of AMX as a little accelerator that sits on the Xeon SOC. And I think that’s a major accomplishment in that we didn’t see anything from AMD. Now, AMD does not have acceleration capability like AMX. It does have a massive FPU, and then a massive matrix engine that’s leveraged by SSE2, but that’s very different and less efficient for many workloads compared to AMX.

Dan, we have debated on this show that if only two people showed up for a gunfight, was there really a gunfight? And one thing I did appreciate from MLCommons, this is David Kanter, we’ve all been on briefing calls together, and he said, “Submitting to MLPerf is quite challenging and a real accomplishment. Due to the complex nature of ML workloads, each submitter must ensure that both their hardware and software stacks are capable, stable, and performant for running these types of ML workloads.” And that message was directed at Dell, Fujitsu, NVIDIA and Qualcomm that submitted data center focused power numbers, but those power numbers had to be run while you’re running the ML inference out there.

So I think first of all, it’s good to acknowledge why others weren’t on there, but I still kind of question that if you only have two people show up for a certain benchmark, what’s the value of that? So, I mean we’ve already debated that I think on these MLCommons benchmarks, but I think it is a reflection of the difficulty of AI in totality. So Dan, let’s move to the next topic.

Daniel Newman: Can I say one thing?

Patrick Moorhead: Please.

Daniel Newman: I’m glad you called it out. I want to make sure I’m correct when I said it, I said on par, not equal, but I said that, and I believe it’s A100s, that it actually outperformed H100s that it was near par. So I should say, “Near par” not, “Outperform.” If I said outperform, I was wrong. I’m correcting myself.

Author Information

Daniel is the CEO of The Futurum Group. Living his life at the intersection of people and technology, Daniel works with the world’s largest technology brands exploring Digital Transformation and how it is influencing the enterprise.

From the leading edge of AI to global technology policy, Daniel makes the connections between business, people and tech that are required for companies to benefit most from their technology investments. Daniel is a top 5 globally ranked industry analyst and his ideas are regularly cited or shared in television appearances by CNBC, Bloomberg, Wall Street Journal and hundreds of other sites around the world.

A 7x best-selling author, including his most recent book “Human/Machine,” Daniel is also a Forbes and MarketWatch (Dow Jones) contributor.

An MBA and Former Graduate Adjunct Faculty, Daniel is an Austin Texas transplant after 40 years in Chicago. His speaking takes him around the world each year as he shares his vision of the role technology will play in our future.


