Menu

Intel Gaudi2: A CPU Alternative to GPUs in the AI War?

Intel Gaudi2: A CPU Alternative to GPUs in the AI War?

The News: On September 11, Intel announced the results of the Gaudi2 Intel chip tested in the MLCommons MLPerf Inference performance benchmark for GPT-J. GPT-J is an open source AI model from Eleuther AI, developed as an alternative to OpenAI’s GPT-3. The MLPerf tests are the most widely used and recognized machine learning (ML) benchmark tests.

Here are some of the pertinent details of Gaudi2’s performance:

  • Gaudi2 inference performance on GPT-J-99 and GPT-J-99.9 for server queries and offline samples are 78.58 per second and 84.08 per second, respectively.
  • Gaudi2 delivers compelling performance versus NVIDIA’s H100, with H100 showing a slight advantage of 1.09x (server) and 1.28x (offline) performance relative to Gaudi2.
  • Gaudi2 outperforms NVIDIA’s A100 by 2.4x (server) and 2x (offline).
  • The Gaudi2 submission employed FP8 and reached 99.9% accuracy on this new data type.

Read the full blog post on the Gaudi2 MLPerf GPT-J performance test on the Intel website.

Intel Gaudi2: A CPU Alternative to GPUs in the AI War?

Analyst Take: Current AI workloads are big and expensive, putting pressure on chipmakers and cloud providers to find cheaper, better faster ways to enable AI. A market ecosystem is emerging to address this challenge, from AI-specific-designed chips such as Tensor Processing Units (TPUs), Language Processing Units (LPUs), Neural Processing Units (NPUs), and reimagined central processing units (CPUs).

Business leaders understandably worry about AI compute costs and struggle with ROI, profit margins, and similar considerations. AI compute costs are completely nebulous right now for two reasons: the technology is still experimental and being refined (so that it will scale) and a slew of players want to handle enterprise AI compute. Determining where this market goes and how it ends up is tricky.

Intel is the best-known chipmaker in the world. However, CPUs, which are the chips Intel is so masterful with, have not been the chips that run AI workloads, so far. Intel has been investing heavily to develop chips that will run AI workloads more efficiently, and the Gaudi line is Intel’s cornerstone play. Intel will have an impact on AI compute economics. Here is how.

More Efficient, Cheaper Compute

While the world marvels at what generative AI can do, CIOs and other IT leaders are scrambling to find reasonable ways to run the massive AI compute workloads required for AI output. Most experts believe AI workloads will shift over time from the majority being AI training, which to date has required graphics processing units (GPUs), to the majority being AI inference, which is more efficiently handled by CPUs and other non-GPU chips. If the shift comes to fruition, it bodes well for the Intel Gaudi line – the performance level of these early Gaudi chips is very close to the NVIDIA H100 and A100 options – but maybe more importantly, the Gaudi chips are less expensive and run more efficiently than do the NVIDIA GPUs. This price-performance balance is the Intel strategy.

However, Intel is challenged as are all chipmakers these days in terms of meeting demand. Intel executives told The Futurum Group and a few other analysts that they have availability of Gaudi2 chips through their OEM partners and they are ramping up production for Gaudi2 chips, but they do feel the crunch of accelerated demand.

Perhaps what might be more intriguing is the potential of the next-generation Gaudi chip, Gaudi3, which is slated to debut some time next year. Intel believes this latest chip will improve further on AI inference performance and enable Intel to be even more competitive in price-performance comparisons to GPU options.

In the near term, enterprises will likely invest in a range of AI compute options because there is not a proven path. AI compute must become more efficient and cheaper for AI applications to flourish.

Disclosure: The Futurum Group is a research and advisory firm that engages or has engaged in research, analysis, and advisory services with many technology companies, including those mentioned in this article. The author does not hold any equity positions with any company mentioned in this article.

Analysis and opinions expressed herein are specific to the analyst individually and data and other information that might have been provided for validation, not those of The Futurum Group as a whole.

Other insights from The Futurum Group:

Google Cloud’s TPU v5e Accelerates the AI Compute War

Groq Ushers In a New AI Compute Paradigm: The Language Processing Unit

AMD and Hugging Face Team Up to Democratize AI Compute – Shrewd Alliance Could Lead to AI Compute Competition, Lower AI Cost

Author Information

Based in Tampa, Florida, Mark is a veteran market research analyst with 25 years of experience interpreting technology business and holds a Bachelor of Science from the University of Florida.

Related Insights
CrowdStrike Q4 FY 2026 Earnings Extend ARR Scale and AI Security Focus
March 6, 2026

CrowdStrike Q4 FY 2026 Earnings Extend ARR Scale and AI Security Focus

Fernando Montenegro, VP Cybersecurity at Futurum, highlights CrowdStrike’s Q4 FY26 earnings: Falcon expands into AI security, identity, and browser runtime, underscoring consolidation-driven cybersecurity strategies....
S3NS & Sovereignty Can Thales-Google Venture Make AI Sovereignty Work at Scale
March 5, 2026

S3NS & Sovereignty: Can Thales-Google Venture Make AI Sovereignty Work at Scale?

Nick Patience, VP & Practice Lead for AI Platforms at Futurum Research, assesses S3NS’s progress following its SecNumCloud qualification, evaluates the sovereign AI roadmap, and examines what the Thales-Google Cloud...
Could Apple’s New $599 MacBook Neo Decimate The Mid-Range Windows Laptop Market
March 5, 2026

Could Apple’s New $599 MacBook Neo Decimate The Mid-Range Windows Laptop Market?

Olivier Blanchard, Analyst at Futurum, shares his insights on Apple's new $599 MacBook Neo. This breakthrough price point is set to disrupt the entire budget PC market and could be...
Elastic Q3 FY 2026 Strong Quarter, but Reacceleration Thesis Unproven
March 3, 2026

Elastic Q3 FY 2026: Strong Quarter, but Reacceleration Thesis Unproven

Nick Patience, VP and Practice Lead for AI Platforms at Futurum reviews Elastic Q3 FY 2026 earnings, highlighting sales-led subscription momentum, AI context engineering adoption, and agentic workflow expansion across...
CoreWeave Q4 FY 2025 Results Highlight Backlog Growth And Capacity Expansion
March 3, 2026

CoreWeave Q4 FY 2025 Results Highlight Backlog Growth And Capacity Expansion

Futurum Research reviews CoreWeave’s Q4 FY 2025 earnings, focusing on backlog-driven capacity expansion, platform monetization beyond GPUs, and execution cadence shaping AI infrastructure supply....
Snowflake Q4 FY 2026 Results Highlight AI-Led Consumption and Platform Expansion
March 2, 2026

Snowflake Q4 FY 2026 Results Highlight AI-Led Consumption and Platform Expansion

Brad Shimmin, Vice President & Practice Lead at Futurum analyzes Snowflake’s Q4 FY 2026 earnings, highlighting AI-driven consumption growth, expanding platform scope, and guidance shaping expectations for FY 2027....

Book a Demo

Newsletter Sign-up Form

Get important insights straight to your inbox, receive first looks at eBooks, exclusive event invitations, custom content, and more. We promise not to spam you or sell your name to anyone. You can always unsubscribe at any time.

All fields are required






Thank you, we received your request, a member of our team will be in contact with you.