Intel Gaudi2: A CPU Alternative to GPUs in the AI War?

The News: On September 11, Intel announced results for its Gaudi2 chip in the MLCommons MLPerf Inference benchmark for GPT-J. GPT-J is an open source AI model from EleutherAI, developed as an alternative to OpenAI’s GPT-3. The MLPerf tests are the most widely used and recognized machine learning (ML) benchmarks.

Here are some of the pertinent details of Gaudi2’s performance:

  • Gaudi2 inference performance on GPT-J-99 and GPT-J-99.9 is 78.58 server queries per second and 84.08 offline samples per second.
  • Gaudi2 delivers compelling performance versus NVIDIA’s H100, with H100 showing a slight advantage of 1.09x (server) and 1.28x (offline) performance relative to Gaudi2.
  • Gaudi2 outperforms NVIDIA’s A100 by 2.4x (server) and 2x (offline).
  • The Gaudi2 submission employed FP8 and reached 99.9% accuracy on this new data type.
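To put the ratios above on the same scale as the Gaudi2 figures, the short Python sketch below backs out the throughput they imply for the H100 and A100. The function and variable names are illustrative, and the derived values are approximations computed from the rounded ratios quoted in this article, not figures taken from the MLPerf results tables.

```python
# Gaudi2 GPT-J throughput as reported in the article (results per second).
GAUDI2 = {"server": 78.58, "offline": 84.08}

# Ratios quoted in the article.
H100_ADVANTAGE = {"server": 1.09, "offline": 1.28}  # H100 relative to Gaudi2
GAUDI2_OVER_A100 = {"server": 2.4, "offline": 2.0}  # Gaudi2 relative to A100


def implied_throughput(scenario: str) -> dict:
    """Back out approximate H100 and A100 throughput from the quoted ratios."""
    gaudi2 = GAUDI2[scenario]
    return {
        "Gaudi2": gaudi2,
        # Assumption: the quoted ratios apply directly to the Gaudi2 figures.
        "H100 (implied)": round(gaudi2 * H100_ADVANTAGE[scenario], 2),
        "A100 (implied)": round(gaudi2 / GAUDI2_OVER_A100[scenario], 2),
    }


for scenario in ("server", "offline"):
    print(scenario, implied_throughput(scenario))
```

Under these assumptions, the implied H100 throughput is roughly 85.7 (server) and 107.6 (offline) per second, and the implied A100 throughput is roughly 32.7 and 42.0 per second.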

Read the full blog post on the Gaudi2 MLPerf GPT-J performance test on the Intel website.

Analyst Take: Current AI workloads are big and expensive, putting pressure on chipmakers and cloud providers to find cheaper, better, faster ways to enable AI. A market ecosystem is emerging to address this challenge, ranging from purpose-built AI chips such as Tensor Processing Units (TPUs), Language Processing Units (LPUs), and Neural Processing Units (NPUs) to reimagined central processing units (CPUs).

Business leaders understandably worry about AI compute costs and struggle with ROI, profit margins, and similar considerations. AI compute costs are completely nebulous right now for two reasons: the technology is still experimental and being refined (so that it will scale) and a slew of players want to handle enterprise AI compute. Determining where this market goes and how it ends up is tricky.

Intel is the best-known chipmaker in the world. However, CPUs, the chips Intel has mastered, have not so far been the chips that run AI workloads. Intel has been investing heavily to develop chips that will run AI workloads more efficiently, and the Gaudi line is Intel’s cornerstone play. Intel will have an impact on AI compute economics. Here is how.

More Efficient, Cheaper Compute

While the world marvels at what generative AI can do, CIOs and other IT leaders are scrambling to find reasonable ways to run the massive compute workloads that AI output requires. Most experts believe AI workloads will shift over time from the majority being AI training, which to date has required graphics processing units (GPUs), to the majority being AI inference, which is more efficiently handled by CPUs and other non-GPU chips. If the shift comes to fruition, it bodes well for the Intel Gaudi line. On the GPT-J results above, Gaudi2 comes close to the NVIDIA H100 and outperforms the A100. Perhaps more importantly, the Gaudi chips are less expensive and run more efficiently than the NVIDIA GPUs. This price-performance balance is the Intel strategy.
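The price-performance argument can be made concrete with a little arithmetic. The Python sketch below is a simplification: it treats price-performance as throughput divided by chip price and uses only the H100-to-Gaudi2 throughput ratios quoted earlier in this article. It asks how much cheaper Gaudi2 must be than the H100 to come out ahead on that measure. No actual prices appear in this article, so none are assumed, and the function name is illustrative. Power, system, and software costs are ignored.

```python
# H100 throughput advantage over Gaudi2 on GPT-J, as quoted in the article.
H100_ADVANTAGE = {"server": 1.09, "offline": 1.28}


def break_even_price_ratio(h100_advantage: float) -> float:
    """Return the Gaudi2/H100 price ratio at which throughput per dollar is equal.

    Throughput per dollar is throughput / price, so the two chips tie when
    price_gaudi2 / price_h100 equals throughput_gaudi2 / throughput_h100,
    which is 1 / h100_advantage.
    """
    if h100_advantage <= 0:
        raise ValueError("advantage ratio must be positive")
    return 1.0 / h100_advantage


for scenario, advantage in H100_ADVANTAGE.items():
    ratio = break_even_price_ratio(advantage)
    print(f"{scenario}: Gaudi2 leads on throughput per dollar "
          f"below {ratio:.1%} of the H100 price")
```

By this rough measure, Gaudi2 would lead on GPT-J throughput per dollar if priced below about 92% of the H100 for server queries, or about 78% for offline samples.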

However, Intel, like all chipmakers these days, is challenged to meet demand. Intel executives told The Futurum Group and a few other analysts that Gaudi2 chips are available through the company’s OEM partners and that production is ramping up, but they do feel the crunch of accelerated demand.

Perhaps more intriguing is the potential of the next-generation Gaudi chip, Gaudi3, which is slated to debut some time next year. Intel believes this next chip will further improve AI inference performance and make Intel even more competitive in price-performance comparisons with GPU options.

In the near term, enterprises will likely invest in a range of AI compute options because there is no proven path yet. AI compute must become more efficient and cheaper for AI applications to flourish.

Disclosure: The Futurum Group is a research and advisory firm that engages or has engaged in research, analysis, and advisory services with many technology companies, including those mentioned in this article. The author does not hold any equity positions with any company mentioned in this article.

Analysis and opinions expressed herein are specific to the analyst individually and data and other information that might have been provided for validation, not those of The Futurum Group as a whole.

Other insights from The Futurum Group:

Google Cloud’s TPU v5e Accelerates the AI Compute War

Groq Ushers In a New AI Compute Paradigm: The Language Processing Unit

AMD and Hugging Face Team Up to Democratize AI Compute – Shrewd Alliance Could Lead to AI Compute Competition, Lower AI Cost

Author Information

Based in Tampa, Florida, Mark is a veteran market research analyst with 25 years of experience interpreting technology business and holds a Bachelor of Science from the University of Florida.
