The News: On September 19, as part of Intel Innovation 2023, Intel unveiled a series of announcements that define its ambition within the AI market ecosystem. Here are the details.
Intel Developer Cloud Reaches General Availability
This solution gives developers a path to test and deploy AI across the latest Intel central processing units (CPUs), graphics processing units (GPUs), and AI accelerators. Developers can also take advantage of software tools for optimizing AI performance.
Built on a foundation of AI-purpose-built CPUs, GPUs, and Intel Gaudi2 deep learning processors, Intel Developer Cloud also provides access to the latest Intel hardware platforms: 5th Gen Intel Xeon Scalable processors (code-named Emerald Rapids) and the Intel Data Center GPU Max Series 1100 and 1550.
Developers can run small- to large-scale AI training, model optimization, and inference workloads. Based on an open software foundation with oneAPI, this solution provides hardware choice and freedom from proprietary programming models to support accelerated computing, code reuse, and portability. Available AI foundation models include Falcon, Mosaic MPT, Bloom, Stable Diffusion, Llama 2, and Databricks' Dolly. Intel supplies the toolkits, libraries, and AI frameworks. Tiered pricing is available, and there is a free tier for individual developers.
Data Center Compute Advances
A large AI supercomputer will be built entirely on Intel Xeon processors and 4,000 Intel Gaudi2 AI hardware accelerators, with Stability AI as the anchor customer. Dell Technologies and Intel are collaborating to offer AI solutions. PowerEdge systems with Xeon and Gaudi will support AI workloads ranging from large-scale training to base-level inferencing.
Alibaba Cloud has reported that 4th Gen Xeon is a viable solution for real-time large language model (LLM) inference in its model-serving platform DashScope, achieving a 3x acceleration in response time. And Granite Rapids will include Performance-cores (P-cores), which Intel says will offer better AI performance than any other CPU and a 2x to 3x boost over 4th Gen Xeon for AI workloads.
AI for PCs
Intel will usher in the age of the AI PC with the upcoming Intel Core Ultra processors, code-named Meteor Lake, featuring Intel’s first integrated neural processing unit (NPU) for power-efficient AI acceleration and local inference on the PC. Core Ultra will launch December 14.
Core Ultra delivers low-latency AI compute that is connectivity-independent and offers stronger data privacy. It integrates an NPU into client silicon for the first time. The NPU is built for low-power, high-quality AI, making it ideal for workloads migrating from the CPU that need greater quality or efficiency, or for workloads that would otherwise run in the cloud for lack of efficient client compute.
OpenVINO Runtime for AI at the Edge
OpenVINO is Intel’s AI inferencing and deployment runtime for developers on client and edge platforms. OpenVINO 2023.1, powered by oneAPI, makes generative AI more accessible for real-world scenarios, enabling developers to write once and deploy across a broad range of devices and AI applications.
OpenVINO 2023.1 enables developers to optimize standard PyTorch, TensorFlow, or ONNX models and offers full support for the forthcoming Core Ultra processors. It also adds more model compression techniques, improved GPU support, and reduced memory consumption for dynamic shapes, as well as greater portability and performance across the entire compute continuum: cloud, client, and edge.
Read the full Press Release on Intel’s AI Everywhere announcements here.
Intel AI Everywhere: Ambitious Vision For Tech Titan
Analyst Take: There is a lot at stake for Intel as the newest AI era unfolds. No chip has yet been purpose-built for AI workloads at their current massive scale. Most experts believe AI workloads will have to shrink, and that models and chips must become more efficient, for solutions to scale economically. Extending that logic means moving AI workloads to the edge and to local devices. Edge proponents double down on the appeal of their arguments: greater privacy and security. But it remains to be seen whether AI use cases will be well suited to constrained compute or constrained data. Time, and developers, will tell.
Which brings us to Intel’s AI vision. Let us look at the impact of each announcement individually.
Intel Developer Cloud
The market will decide whether Intel's collection of CPUs, GPUs, and other AI workload compute chips is a good match for applications, and Intel's move to expand its developer platform can only drive the market forward. However, it appears Intel is treating the Developer Cloud as a viable revenue-generating product, as the pricing tiers indicate. Intel Developer Cloud will be competing with several other substantial AI developer platforms, including Hugging Face, GitHub, and NVIDIA.
Data Center and PC Compute Advances
In my opinion, the news around Intel's data center compute advances is the most promising sign of Intel playing a leading role in AI compute. Although every effort will be made to increase compute efficiency, there is little doubt that most AI compute will happen in data centers for a while. Intel's advances with data center compute workloads mean enterprises will have viable alternatives to GPUs, and Intel is in the race for purpose-built AI compute. For further details, see Intel Gaudi2: A CPU Alternative to GPUs in the AI War? The NPU advances for PCs show the progress AI can make on the world's second-most popular compute devices.
OpenVINO Runtime for AI at the Edge
The OpenVINO runtime is a shrewd strategy for promoting AI at the edge. It is wise of Intel to settle on an open source approach and let the chips fall where they will. That said, friction and competition in this space seem likely, because other chip manufacturers, particularly those serving smartphones, will offer competing plays.
Disclosure: The Futurum Group is a research and advisory firm that engages or has engaged in research, analysis, and advisory services with many technology companies, including those mentioned in this article. The author does not hold any equity positions with any company mentioned in this article.
Analysis and opinions expressed herein are specific to the analyst individually and data and other information that might have been provided for validation, not those of The Futurum Group as a whole.
Other insights from The Futurum Group:
Intel Built-in Acceleration Keys Optimize Data Center CPU Value
Intel Gaudi2: A CPU Alternative to GPUs in the AI War?
Intel 4th Gen Xeon Scalable Processors Primed to Accelerate Data Center Performance and Capabilities
Author Information
Mark comes to The Futurum Group from Omdia’s Artificial Intelligence practice, where his focus was on natural language and AI use cases.
Previously, Mark worked as a consultant and analyst providing custom and syndicated qualitative market analysis with an emphasis on mobile technology and identifying trends and opportunities for companies like Syniverse and ABI Research. He has been cited by international media outlets including CNBC, The Wall Street Journal, Bloomberg Businessweek, and CNET. Based in Tampa, Florida, Mark is a veteran market research analyst with 25 years of experience interpreting technology business and holds a Bachelor of Science from the University of Florida.