
Intel Gaudi2: A CPU Alternative to GPUs in the AI War?

The News: On September 11, Intel announced the results of its Gaudi2 chip on the MLCommons MLPerf Inference performance benchmark for GPT-J. GPT-J is an open source AI model from EleutherAI, developed as an alternative to OpenAI’s GPT-3. The MLPerf tests are the most widely used and recognized machine learning (ML) benchmark tests.

Here are some of the pertinent details of Gaudi2’s performance:

  • Gaudi2 inference performance on GPT-J-99 and GPT-J-99.9 is 78.58 queries per second in the server scenario and 84.08 samples per second in the offline scenario.
  • Gaudi2 delivers compelling performance versus NVIDIA’s H100, with the H100 showing only a slight advantage of 1.09x (server) and 1.28x (offline) over Gaudi2 (a quick arithmetic sketch follows this list).
  • Gaudi2 outperforms NVIDIA’s A100 by 2.4x (server) and 2x (offline).
  • The Gaudi2 submission employed FP8 and reached 99.9% accuracy on this new data type.
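
As a quick sanity check on how those comparisons fit together, here is a short Python sketch that back-calculates the H100 and A100 throughputs implied by the published Gaudi2 numbers and the stated speedup ratios. The Gaudi2 figures and ratios come from the bullets above; the derived H100 and A100 numbers are rough estimates, not MLPerf submissions.

```python
# Back-of-the-envelope check of the MLPerf GPT-J numbers cited above.
# Gaudi2 throughputs are Intel's published results; the H100 and A100
# figures derived here are only estimates implied by the stated ratios,
# not official MLPerf submissions.

GAUDI2 = {"server": 78.58, "offline": 84.08}  # queries/s, samples/s

# H100 is reported as 1.09x (server) and 1.28x (offline) faster than Gaudi2.
H100_ADVANTAGE = {"server": 1.09, "offline": 1.28}
# Gaudi2 is reported as 2.4x (server) and 2x (offline) faster than A100.
GAUDI2_OVER_A100 = {"server": 2.4, "offline": 2.0}

for scenario, gaudi2_tput in GAUDI2.items():
    implied_h100 = gaudi2_tput * H100_ADVANTAGE[scenario]
    implied_a100 = gaudi2_tput / GAUDI2_OVER_A100[scenario]
    print(f"{scenario:>7}: Gaudi2 {gaudi2_tput:6.2f} | "
          f"~H100 {implied_h100:6.2f} | ~A100 {implied_a100:6.2f}")
```

On the offline figures, for example, the implied H100 throughput works out to roughly 108 samples per second against Gaudi2’s 84.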

Read the full blog post on the Gaudi2 MLPerf GPT-J performance test on the Intel website.


Analyst Take: Current AI workloads are big and expensive, putting pressure on chipmakers and cloud providers to find cheaper, better, faster ways to enable AI. A market ecosystem is emerging to address this challenge, ranging from purpose-built AI chips such as tensor processing units (TPUs), language processing units (LPUs), and neural processing units (NPUs) to reimagined central processing units (CPUs).

Business leaders understandably worry about AI compute costs and struggle with ROI, profit margins, and similar considerations. AI compute costs are largely nebulous right now for two reasons: the technology is still experimental and being refined so that it will scale, and a slew of players want to handle enterprise AI compute. Determining where this market goes and how it ends up is tricky.

Intel is the best-known chipmaker in the world. However, CPUs, the chips Intel has long mastered, have not so far been the chips that run AI workloads. Intel has been investing heavily to develop chips that will run AI workloads more efficiently, and the Gaudi line is its cornerstone play. Intel will have an impact on AI compute economics. Here is how.

More Efficient, Cheaper Compute

While the world marvels at what generative AI can do, CIOs and other IT leaders are scrambling to find reasonable ways to run the massive compute workloads that AI output requires. Most experts believe AI workloads will shift over time from mostly AI training, which to date has required graphics processing units (GPUs), to mostly AI inference, which is handled more efficiently by CPUs and other non-GPU chips. If that shift comes to fruition, it bodes well for the Intel Gaudi line: the performance of these early Gaudi chips is very close to NVIDIA’s H100 and A100 options, and, perhaps more importantly, the Gaudi chips are less expensive and run more efficiently than NVIDIA’s GPUs. This price-performance balance is the Intel strategy.
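
Since price-performance rather than raw throughput is the axis Intel is competing on, one way to frame the comparison is throughput per dollar. The sketch below does this with the offline GPT-J figures above; the dollar amounts are deliberately hypothetical placeholders, since actual accelerator pricing is negotiated, varies by volume and OEM, and is not disclosed here.

```python
# Hypothetical price-performance framing. Offline throughputs follow the
# GPT-J figures discussed above (H100 and A100 implied from the stated
# ratios); the unit prices are made-up placeholders for illustration only.

accelerators = {
    # name:   (samples/s,     assumed unit price in USD -- hypothetical)
    "Gaudi2": (84.08,          10_000),
    "H100":   (84.08 * 1.28,   30_000),
    "A100":   (84.08 / 2.0,    15_000),
}

for name, (throughput, price) in accelerators.items():
    per_dollar = throughput / price
    print(f"{name:>6}: {throughput:7.2f} samples/s at ${price:,} "
          f"-> {per_dollar:.4f} samples/s per dollar")
```

The point is not the specific numbers but the shape of the comparison: a chip that is modestly slower can still win on throughput per dollar if it is sufficiently cheaper, which is the argument Intel is making with Gaudi2.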

However, Intel, like all chipmakers these days, is challenged to meet demand. Intel executives told The Futurum Group and a few other analysts that Gaudi2 chips are available through its OEM partners and that production is ramping up, but the company does feel the crunch of accelerating demand.

Perhaps more intriguing is the potential of the next-generation Gaudi chip, Gaudi3, which is slated to debut sometime next year. Intel believes this chip will further improve AI inference performance and make Intel even more competitive in price-performance comparisons with GPU options.

In the near term, enterprises will likely invest in a range of AI compute options because there is not a proven path. AI compute must become more efficient and cheaper for AI applications to flourish.

Disclosure: The Futurum Group is a research and advisory firm that engages or has engaged in research, analysis, and advisory services with many technology companies, including those mentioned in this article. The author does not hold any equity positions with any company mentioned in this article.

Analysis and opinions expressed herein are specific to the analyst individually and data and other information that might have been provided for validation, not those of The Futurum Group as a whole.

Other insights from The Futurum Group:

Google Cloud’s TPU v5e Accelerates the AI Compute War

Groq Ushers In a New AI Compute Paradigm: The Language Processing Unit

AMD and Hugging Face Team Up to Democratize AI Compute – Shrewd Alliance Could Lead to AI Compute Competition, Lower AI Cost

Author Information

Based in Tampa, Florida, Mark is a veteran market research analyst with 25 years of experience interpreting the technology business and holds a Bachelor of Science from the University of Florida.
