The News: IBM and Intel have unveiled a partnership to deploy Intel Gaudi 3 AI accelerators as a service on IBM Cloud. The offering, expected to be available in early 2025, aims to scale enterprise AI more cost-effectively and drive innovation underpinned by security and resiliency. The collaboration is also intended to enable support for Gaudi 3 within IBM’s watsonx AI and data platform. IBM Cloud positions itself as the first cloud service provider to adopt Gaudi 3, and the offering will be available for both hybrid and on-premises environments.
Justin Hotard, Executive Vice President and General Manager, Intel Data Center and AI, observed that unlocking the full potential of AI requires an open and collaborative ecosystem that provides customers with choice and accessible solutions. By integrating Gaudi 3 AI Accelerators and Xeon CPUs with IBM Cloud, Intel and IBM aim to create new AI capabilities and meet the demand for affordable, secure, and innovative AI computing solutions.
Intel and IBM Show Joint AI Moxie in the Cloud
Analyst Take: Intel securing IBM as a significant cloud partner marks a crucial turning point for the company. We’ve emphasized that Intel needed a public cloud win, particularly following its recent Q2 2024 earnings release, and this partnership is a promising beginning and a positive outcome for the company.
In 2024, Intel’s partners and customers already include equipment manufacturers, database providers, systems integrators, software suppliers, service providers, and other specialists including IBM, NAVER, Bosch, Ola/Krutrim, NielsenIQ, Seekr, IFF, CtrlS Group, Bharti Airtel, Landing AI, Roboflow, and Infosys. IBM is using 5th Gen Intel Xeon processors for its watsonx.data data store and coordinated with Intel to validate the watsonx platform for Intel Gaudi accelerators.
Intel has made ecosystem-wide progress promoting its Gaudi 3 accelerator, emphasizing a competitive price/performance ratio that makes it a strong contender against NVIDIA, AMD, and other AI chipset rivals for specific AI use cases across compute-intensive environments.
The Intel Gaudi 3 AI accelerator is built to power AI systems with up to tens of thousands of accelerators connected through the common standard of Ethernet. Intel Gaudi 3 promises four times the BF16 AI compute and a 1.5x increase in memory bandwidth over its predecessor, Gaudi 2. Intel Gaudi 3 is designed to offer open, community-based software and industry-standard Ethernet networking.
Intel Gaudi 3: Competitive Edge Factors
Compared with the NVIDIA H100, Intel Gaudi 3 is projected to deliver 50% faster time-to-train on average across Meta’s Llama 2 models with 7B (billion) and 13B parameters and OpenAI’s GPT-3 175B parameter model. In addition, Intel Gaudi 3 inference throughput is projected to outperform the H100 by 50% on average, with 40% better inference power efficiency on average, across Llama 2 7B and 70B and TII’s Falcon 180B parameter models.
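As a back-of-the-envelope illustration of these projections (Intel's cited figures, not measured data), a "50% faster time-to-train" claim can be read as roughly 1.5x throughput, so a given training job would be projected to finish in about two-thirds of the baseline wall-clock time. A minimal sketch, with a hypothetical 100-hour baseline run:

```python
# Illustrative reading of Intel's projected "50% faster" claim (not measured data):
# 50% faster time-to-train is interpreted here as 1.5x throughput, i.e. the same
# job completing in 1/1.5 of the baseline wall-clock time.

def projected_time(baseline_hours: float, speedup: float = 1.5) -> float:
    """Projected wall-clock time for a job given a relative speedup over the baseline."""
    return baseline_hours / speedup

# Hypothetical 100-hour H100 training run under the projected 1.5x speedup:
print(projected_time(100.0))  # roughly 66.7 hours
```

Actual results will depend on model, batch size, software stack, and cluster scale, which is why real-world benchmarking matters more than vendor projections.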
Moreover, I find that Intel Gaudi 3 is positioned to compete against NVIDIA’s Blackwell B200 AI accelerator chip, although real-world testing of both offerings is pending. Key to differentiation is that Gaudi 3 consists of two identical silicon dies joined by a high-bandwidth connection. Each die has a central region of 48 MB of cache memory, surrounded by four matrix multiplication engines and 32 programmable tensor processor cores, which are in turn encircled by memory connections and capped with media processing and network infrastructure at one end. For memory, Gaudi 3 uses the less costly HBM2E high-bandwidth memory, in contrast to the HBM3/HBM3E implementations used by rivals.
Where I expect Intel can make the most gains is in pricing and related price/performance considerations. Across baseboard list prices and configured system prices, Intel can prove attractive to enterprises that are less keen on paying the NVIDIA premium and have calculated that their AI requirements call for right-sized LLMs and training models in the small-to-medium range. For example, the Intel Gaudi 3 UBB baseboard is listed at $125,000, whereas the NVIDIA HGX H100 baseboard is listed at $375,000 and the NVIDIA HGX B100 at $470,000, giving Intel a clear price advantage on this comparison.
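Taking the list prices cited above at face value (baseboard list price only, not a total-cost-of-ownership comparison), the ratios work out to roughly 3x for the HGX H100 and 3.76x for the HGX B100 relative to the Gaudi 3 UBB. A quick sketch of that arithmetic:

```python
# Baseboard list prices cited in the text (USD); ratios are simple list-price
# arithmetic, not configured-system or total-cost-of-ownership figures.
prices = {
    "Intel Gaudi 3 UBB": 125_000,
    "NVIDIA HGX H100": 375_000,
    "NVIDIA HGX B100": 470_000,
}

baseline = prices["Intel Gaudi 3 UBB"]
for name, price in prices.items():
    # Express each baseboard as a multiple of the Gaudi 3 list price.
    print(f"{name}: ${price:,} ({price / baseline:.2f}x Gaudi 3 list price)")
```

Configured system prices, software, networking, and power costs will shift the picture, but the raw list-price gap frames why price-sensitive buyers are looking at Gaudi 3.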
Also, the Gaudi 3 power efficiency advantages are critical in meeting customer power challenges throughout data center environments, including scenarios where the LLMs are tasked with producing longer outputs. To achieve a power efficiency edge, the Gaudi architecture uses large matrix math engines that are 512 bits across and require less memory bandwidth than competing architectures, such as NVIDIA H100/H200/B200 and AMD Instinct MI300, which rely on smaller engines to execute the same calculations.
The accelerator can deliver sizable improvements in AI training and inference for global enterprises looking to deploy GenAI at scale. This includes enabling enterprises to scale flexibly from a single node to clusters, super-clusters, and mega-clusters with thousands of nodes, supporting inference, fine-tuning, and training at the largest scales.
Intel and IBM Ready to Ride Ethernet Bandwagon
Through the Ultra Ethernet Consortium (UEC), Intel is cultivating open Ethernet networking for AI fabrics, introducing an array of AI-optimized Ethernet solutions. Designed to transform large scale-up and scale-out AI fabrics, these offerings can enable training and inferencing for increasingly vast models, with sizes expanding by an order of magnitude in each generation. The lineup includes the Intel AI NIC, AI connectivity chiplets for integration into XPUs, Gaudi-based systems, and a range of soft and hard reference AI interconnect designs for Intel Foundry.
I anticipate that upcoming UEC specifications can help ensure Ethernet is ready to host and scale massive AI workloads, providing enduring competitive advantages over proprietary, NVIDIA-backed InfiniBand implementations. Combined with Ethernet-related advances, such as RDMA over Converged Ethernet (RoCE) progress including RoCEv2, lossless Ethernet advanced flow control, improved congestion handling, hashing improvements, buffering, and advanced flow telemetry that strengthen switch capabilities, UEC’s openness and flexibility can deliver cost and reduced-complexity advantages over InfiniBand.
IBM Brings Major Portfolio Credentials to the Intel Alliance and GenAI/AI Table
From my perspective, IBM has been a leader in the GenAI movement, seeing its GenAI business run rate grow twofold into the billions over the past six quarters. As such, the new Intel alliance is a strong vote of confidence and confirms that Intel and IBM together have game and influence across enterprise GenAI/AI strategic decision-making.
Bolstering IBM’s watsonx platform and GenAI credentials is its significant progress across AI in developing and deploying its Granite models. These models, ranging from 3 billion to 34 billion parameters, offer cost-effective AI solutions that are easier to customize and tune. IBM’s decision to open-source the Granite models under Apache 2.0 licenses allows developers and enterprises to use these tools while maintaining control over their data and customizations.
The Granite models have been optimized for various tasks, including coding and natural language processing, demonstrating top performance in numerous benchmarks. By making these models available on platforms such as Hugging Face and GitHub, IBM fosters a collaborative environment that encourages innovation and accelerates AI adoption. The flexibility and cost-efficiency of these models are expected to drive widespread adoption and provide IBM with a competitive edge in the AI market. IBM’s strategy of combining both large and smaller models, and open-sourcing them, is starting to pay off.
IBM’s approach to open-sourcing these models under Apache 2.0 licenses is a strategic move to democratize AI. By providing the source code and allowing modifications, IBM ensures that organizations can adapt the models to their unique needs, fostering greater innovation within the AI community. This openness builds trust and encourages widespread experimentation and customization, leading to new and improved AI applications. It is too early to tell whether InstructLab is material to earnings, but I am bullish on the approach.
What to Watch
Software and compatibility factors are integral to the long-term success of the Intel-IBM collaboration. From my view, such factors have proven a top challenge for all NVIDIA competitors, and the lack of oneAPI integration with Gaudi is a rate-limiting factor. On the other hand, as noted, there are fast-expanding market opportunities based primarily on robust overall demand, specific customer needs, and use case requirements.
From my view, the Intel-IBM alliance does not pose a short-term competitive threat to NVIDIA, AMD, and other AI chip specialists. However, an estimated pre-IBM-deal run rate of under $1 billion for Gaudi 3, alongside fast-growing ecosystem-wide demand and competitor supply chain issues, gives Intel room to pick up more business for its Data Center and AI (DCAI) unit.
Futurum Intelligence’s AI chipset data indicates robust growth in the XPU market segment, which is poised to grow faster than the GPU space in the coming years. While this growth surge will originate from a variety of homegrown cloud offerings, Intel Gaudi has the potential to be an alternative for some cloud providers and enterprises prioritizing lower cost, higher availability, and a different cost-volume-profit approach. These evolving competitive dynamics bode well, especially across on-prem and second-tier cloud use cases.
To fully realize the power of ecosystem-wide AI, I find that Intel and IBM are strategically committed to enabling an open and collaborative ecosystem that offers customers a variety of choices and accessible solutions. By combining Gaudi 3 AI accelerators and Xeon CPUs with IBM Cloud, the alliance is developing new AI capabilities and making inroads in fulfilling the rapidly expanding need for cost-effective, secure, and innovative AI computing solutions.
See the complete press release on the Intel website.
Disclosure: The Futurum Group is a research and advisory firm that engages or has engaged in research, analysis, and advisory services with many technology companies, including those mentioned in this article. The author does not hold any equity positions with any company mentioned in this article.
Analysis and opinions expressed herein are specific to the analyst individually and data and other information that might have been provided for validation, not those of The Futurum Group as a whole.
Other insights from The Futurum Group:
Intel’s Q2 2024 Earnings Release: Navigating Challenges and Strategic Shifts
Intel Vision 2024: Intel Unleashes Gaudi 3 Led Enterprise AI Strategy
IBM’s Strategic Shift: Embracing AWS to Enhance Software Deployment
Author Information
Ron is an experienced, customer-focused research expert and analyst, with over 20 years of experience in the digital and IT transformation markets, working with businesses to drive consistent revenue and sales growth.
He is a recognized authority at tracking the evolution of and identifying the key disruptive trends within the service enablement ecosystem, including a wide range of topics across software and services, infrastructure, 5G communications, Internet of Things (IoT), Artificial Intelligence (AI), analytics, security, cloud computing, revenue management, and regulatory issues.
Prior to his work with The Futurum Group, Ron worked with GlobalData Technology creating syndicated and custom research across a wide variety of technical fields. His work with Current Analysis focused on the broadband and service provider infrastructure markets.
Ron holds a Master of Arts in Public Policy from University of Nevada — Las Vegas and a Bachelor of Arts in political science/government from William and Mary.