The News: Intel announced Intel Xeon 6 processors, Gaudi accelerators, and Lunar Lake architecture to accelerate the AI ecosystem across cloud, networking, client, and edge solutions. Read the full press release on the Intel website.
Computex 2024: Intel Catalyzes AI Everywhere with Key AI Portfolio Innovations
Analyst Take: Intel unveiled new solutions and architecture developments aimed at accelerating the AI ecosystem across the continuum of data center, cloud, network, edge, and PC environments. Through more processing power, power efficiency advances, and low total cost of ownership (TCO), Intel is aiming to enable customers to capitalize on the full AI ecosystem opportunity.
Key to Intel advancing its AI Everywhere vision and portfolio development strategy is the unveiling of three key portfolio initiatives at Computex 2024. First, the company launched Intel Xeon 6 processors with Efficient-cores (E-cores), designed to deliver performance and power efficiency for high-density, scale-out workloads in the data center. Xeon 6 can enable 3:1 rack consolidation, rack-level performance gains of up to 4.2x, and performance-per-watt gains of up to 2.6x.
Second, Intel announced pricing for Intel Gaudi 2 and Intel Gaudi 3 AI accelerator kits, built to deliver high performance with up to one-third lower cost compared to competitive platforms. Intel is highlighting that the combination of Xeon processors with Gaudi 3 accelerators in a system can offer a solution for making AI faster, more cost effective, and more accessible.
Third, Intel unveiled its Lunar Lake client processor with the objective of continuing to grow the swiftly emerging AI PC category. Intel is emphasizing that the next generation of AI PCs, leveraging x86 power efficiency gains and application compatibility, can deliver up to 40% lower system-on-chip (SoC) power compared to the prior generation.
Intel Xeon 6 Platform: Massive New Customer Advantages and Mass Appeal
The Intel Xeon 6 platform launch is accompanied by a new branding approach that may generate some short-term confusion relative to Intel’s previous naming scheme, although I anticipate that will quickly recede as customers fully understand the benefits of the offering. Specifically, the entire Xeon 6 platform and family of processors is purpose-built to address fast-evolving data center challenges, with both E-core (Efficient-core) and P-core (Performance-core) SKUs covering a vast array of use cases and workloads: AI above all, but also other high-performance computing (HPC) requirements and scalable cloud-native applications.
Both E-cores and P-cores are constructed using a compatible architecture, sharing a common software stack and an open ecosystem of hardware and software vendors. Intel Xeon 6 processors come in two families: Sierra Forest and Granite Rapids.
Key Sierra Forest (E-Core) details include the following:
- Available now
- Up to 144 cores
- Uses the new Intel 3 process node for power and performance gains
- Designed for AI workloads, web scale-out containerized microservices, networking, content delivery networks, and cloud services
- Extreme focus on power efficiency
- Ready to face off against AMD’s density-focused EPYC Turin models
- Slated to expand in Q1 calendar year 2025 with network and edge-optimized variants
Key Granite Rapids (P-core) details include the following:
- Expected to launch in Q3 2024
- Up to 86 P-cores initially, expanding to 128 cores next year
- Designed for AI, latency-sensitive work, high single-core performance, HPC, and general workloads
- Also aimed at AMD’s performance-focused Turin models
Both processors fit into the Birch Stream platform, with different swimlanes based on core counts and power requirements. Intel Xeon 6 aims to reduce power use and rack space, freeing compute capacity and infrastructure to stimulate ecosystem-wide AI innovation and use case fulfillment. Plus, the new offerings can provide tremendous value in supporting non-AI workloads, especially since a massive swath of mainstream applications requires cores that deliver substantial cost savings and value specific to the customer’s needs.
Specifically, by using less power and rack space, Xeon 6 processors can liberate compute capacity and infrastructure to help catalyze AI project productivity and innovation. The E-core (Sierra Forest) processors feature up to 144 cores, support for up to 8-channel DDR5-6400, and 88 PCIe Gen 5.0 lanes. From my view, a key breakthrough is that they operate at a 250W TDP and forgo simultaneous multithreading (SMT), neutralizing a selling point of Arm-based rivals, which likewise omit SMT.
As such, Intel ensures x86 cores are ready to transition existing workloads without costly, complex architectural changes. This applies to users of 1st/2nd-generation Xeon Scalable virtualization or container hosts as well as Xeon E5 servers, particularly as Xeon 6 (Sierra Forest) provides substantial consolidation benefits.
This includes offering up to 288 E-cores per dual-socket server, resulting in 2.7x higher performance per rack for 5G core workloads while consuming up to 30% less power when used with Intel Infrastructure Power Manager (IPM) software. Such power savings give customers the flexibility to invest in additional AI servers while still scaling traditional (non-AI) computing demands.
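To make the consolidation math concrete, here is a minimal back-of-envelope sketch. The 288 E-cores per dual-socket server figure comes from the announcement; the legacy server core count, rack density, and fleet-wide core demand are hypothetical assumptions, and real consolidation ratios depend on per-core performance rather than raw core counts (which is why Intel’s own figure is 3:1).

```python
# Illustrative rack-consolidation arithmetic for Xeon 6 E-core (Sierra Forest).
# Only the 288-core dual-socket figure comes from Intel's announcement; the
# legacy server size, rack density, and workload size are assumed for the sketch.

LEGACY_CORES_PER_SERVER = 56   # assumed dual-socket 2nd-gen Xeon Scalable host
XEON6_CORES_PER_SERVER = 288   # dual-socket Sierra Forest (2 x 144 E-cores)
SERVERS_PER_RACK = 20          # assumed rack density

def racks_needed(total_cores: int, cores_per_server: int, servers_per_rack: int) -> float:
    """Racks required to supply total_cores, using ceiling division for servers."""
    servers = -(-total_cores // cores_per_server)  # ceiling division
    return servers / servers_per_rack

workload_cores = 10_000  # hypothetical fleet-wide core demand

legacy_racks = racks_needed(workload_cores, LEGACY_CORES_PER_SERVER, SERVERS_PER_RACK)
xeon6_racks = racks_needed(workload_cores, XEON6_CORES_PER_SERVER, SERVERS_PER_RACK)

print(f"Legacy racks: {legacy_racks:.2f}, Xeon 6 racks: {xeon6_racks:.2f}")
print(f"Core-count consolidation ratio: {legacy_racks / xeon6_racks:.1f}:1")
```

On raw core counts alone the sketch yields an even steeper ratio than Intel’s quoted 3:1, underscoring that the vendor figure is a performance-normalized claim, not a simple core tally.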
Intel Gaudi AI Accelerators: Lowering the TCO for High Performance GenAI
Through the company’s portfolio development execution, Intel Xeon processors are better positioned to serve as the CPU head node for AI workloads and operate in a system with Intel Gaudi AI accelerators, which are purpose-built for AI workloads. Together, they can deliver a solution that eases adoption and integration into extensive x86 installed bases.
The Gaudi architecture, as the sole MLPerf-benchmarked alternative to NVIDIA’s H100 for training and inference of large language models (LLMs), offers GenAI performance with a clear price-performance advantage, allowing for swift deployment at a lower TCO. Specifically, the Gaudi 3 AI accelerator is purpose-built for both training and inference tasks, offering 1.8 PFLOPS of FP8 and BF16 compute, 128 GB of HBM2e memory capacity, and 3.7 TB/s of HBM bandwidth. Compared to the Intel Gaudi 2 accelerator, it is projected to be 50% faster in time-to-train across models such as Llama 2 (7B and 13B parameters) and GPT-3 (175B parameters).
Plus, Intel’s Gaudi 3 accelerator, when deployed in an 8,192-accelerator cluster, is projected to provide up to 40% faster time-to-train compared to an equivalent-sized NVIDIA H100 GPU cluster. For a 64-accelerator cluster, Gaudi 3 offers up to 15% faster training throughput compared to NVIDIA H100. Moreover, the Intel Gaudi 3 is projected to offer an average of up to twice as fast inferencing versus NVIDIA H100, running well-known LLMs such as Mistral-7B and Llama-70B.
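It is worth pinning down what these “X% faster” claims mean for wall-clock training time. A minimal sketch, under the conventional assumption that “X% faster” denotes (1 + X/100) times the baseline throughput (vendors occasionally define such claims differently, so treat this as one reading):

```python
# Converting relative-performance claims into time-to-train fractions,
# assuming "X% faster" means throughput is (1 + X/100) times the baseline.

def relative_time(pct_faster: float) -> float:
    """Time-to-train as a fraction of the baseline, given X% faster throughput."""
    return 1.0 / (1.0 + pct_faster / 100.0)

print(f"Gaudi 3 vs Gaudi 2 (50% faster): {relative_time(50):.0%} of baseline time")
print(f"Gaudi 3 cluster vs H100 cluster (40% faster): {relative_time(40):.0%} of baseline time")
print(f"2x inference throughput: {relative_time(100):.0%} of baseline time")
```

Under this reading, a “40% faster” cluster finishes the same job in roughly 71% of the baseline time, and a 2x inference claim halves the per-query serving time.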
On price advantages, the Gaudi 2 accelerator kits, which include eight Gaudi 2 accelerators alongside a universal baseboard (UBB), are offered to system providers at a list price of US$65K, estimated to be approximately one-third the cost of comparable competitive platforms (i.e., NVIDIA).
Moreover, an AI kit comprising eight Intel Gaudi 3 accelerators with a UBB is priced at US$125K. This cost is estimated to be approximately two-thirds of what comparable competitive platforms would charge (i.e., NVIDIA). From my view, Intel demonstrating sharp price-performance advantages is a refreshing sales and marketing strategy, which will oblige competitors to adjust their product development and marketing approach, including potentially significant price reductions. Supermicro is already selling a full Gaudi 2 server for US$90K. This is all good news for customers as well as the overall AI ecosystem.
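The implied competitor pricing follows directly from Intel’s fractional-cost claims. A quick sketch of that arithmetic (the derived figures are implied by Intel’s “one-third” and “two-thirds” statements, not published NVIDIA list prices):

```python
# Back-of-envelope pricing implied by Intel's disclosed kit list prices and
# its fractional-cost claims. Derived competitor figures are illustrative only.

GAUDI2_KIT_USD = 65_000   # 8x Gaudi 2 + universal baseboard (UBB), list price
GAUDI3_KIT_USD = 125_000  # 8x Gaudi 3 + UBB, list price

# "One-third the cost" implies the comparable platform costs ~3x the kit;
# "two-thirds the cost" implies ~1.5x the kit.
implied_competitor_g2 = GAUDI2_KIT_USD * 3
implied_competitor_g3 = GAUDI3_KIT_USD * 3 / 2

print(f"Implied comparable platform price vs Gaudi 2 kit: ${implied_competitor_g2:,.0f}")
print(f"Implied comparable platform price vs Gaudi 3 kit: ${implied_competitor_g3:,.0f}")
```

In other words, Intel’s claims put a comparable eight-accelerator competitive platform in the roughly US$190K range in both comparisons.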
Intel Lunar Lake: Ushering in the AI PC Era
Intel’s Lunar Lake debut offers AI PC breakthroughs with exceptional timing. Lunar Lake is designed to deliver up to 40% lower SoC power and more than 3x the AI compute of the prior generation, and it is expected to ship in Q3 2024, in time to help stimulate the Q4 holiday buying season.
Lunar Lake’s all-new architecture can enable the following:
- New P-cores and E-cores can deliver significant performance and energy efficiency improvements.
- A 4th-generation Intel neural processing unit (NPU) with up to 48 tera-operations per second (TOPS) of AI performance. This NPU delivers up to 4x AI compute over the previous generation, enabling corresponding improvements in GenAI.
- The all-new Xe2-powered GPU design combines new innovations: 2nd-generation Xe cores with Xe Matrix Extension (XMX) for AI, enhanced ray tracing units, low-power hardware decode for the new VVC video codec technology and support for the latest eDP 1.5 panels. The Xe2 GPU cores improve gaming and graphics performance by 1.5x over the previous generation, while the new XMX arrays enable a second AI accelerator with up to 67 TOPS of performance for throughput gains in AI content creation.
- An advanced low-power island, a novel Intel compute cluster that handles background and productivity tasks with extreme efficiency, enabling improved laptop battery life.
In my view, the Lunar Lake debut can deliver essential advances across AI optimization, battery life, security, and a wide array of AI PC applications, driven by new capabilities such as the 40+ TOPS NPU that is key to delivering AI experiences at scale. Intel provides a comprehensive AI Everywhere vision and portfolio development strategy that is key to optimizing cohesive software and hardware innovation across the AI PC realm. Intel is already collaborating with more than 100 ISVs to augment AI PC experiences for fast-evolving AI capabilities such as personal assistants, content creation, gaming, security, video collaboration, and more.
Key Takeaway: Intel Uplifts AI Everywhere at Computex
I believe that the Intel AI Everywhere proposition is fully prepared to drive AI innovation throughout the entire continuum of the AI ecosystem and market, encompassing edge and data center systems, network, AI PCs, and semiconductor manufacturing. Intel’s latest Xeon, Gaudi, and Lunar Lake platform innovations, reinforced by its field-proven extensive software and hardware ecosystem, can fulfill the unique challenges of the AI era with cost-advantageous, secure, and agile solutions for all its customers and partners.
Disclosure: The Futurum Group is a research and advisory firm that engages or has engaged in research, analysis, and advisory services with many technology companies, including those mentioned in this article. The author does not hold any equity positions with any company mentioned in this article.
Analysis and opinions expressed herein are specific to the analyst individually and data and other information that might have been provided for validation, not those of The Futurum Group as a whole.
Other Insights from The Futurum Group:
Intel Vision 2024: Intel Unleashes Gaudi 3 Led Enterprise AI Strategy
Intel Q1 2024 Results: New Reporting Structure with Top Bottom Beat
Author Information
Ron is an experienced, customer-focused research expert and analyst, with over 20 years of experience in the digital and IT transformation markets, working with businesses to drive consistent revenue and sales growth.
He is a recognized authority at tracking the evolution of and identifying the key disruptive trends within the service enablement ecosystem, including a wide range of topics across software and services, infrastructure, 5G communications, Internet of Things (IoT), Artificial Intelligence (AI), analytics, security, cloud computing, revenue management, and regulatory issues.
Prior to his work with The Futurum Group, Ron worked with GlobalData Technology creating syndicated and custom research across a wide variety of technical fields. His work with Current Analysis focused on the broadband and service provider infrastructure markets.
Ron holds a Master of Arts in Public Policy from the University of Nevada, Las Vegas and a Bachelor of Arts in political science/government from William and Mary.