Qualcomm, AMD and Gigabyte Break the PetaOperations Per Second Barrier for AI Inferencing

The News: For the first time ever, a Qualcomm AI-based solution has broken the PetaOperations (PetaOps) Per Second Barrier for AI Inferencing. The record was achieved through a partnership between Qualcomm, AMD, and Gigabyte, by pairing AMD’s EPYC 7003 processor and Qualcomm’s Cloud AI 100 solution in a Gigabyte G292-Z43 server. Read the full release from Qualcomm here.

Analyst Take: The news that Qualcomm, AMD, and Gigabyte have broken the PetaOperations per second barrier for AI inferencing is big, and with good reason. AI is increasingly the driving force behind the next generation of consumer experiences. In fact, almost all mobile experiences involve some type of AI intervention, from the way shopping apps deliver customized recommendations based on tens of thousands of AI inferences to how streaming apps curate categories and titles to best match each user’s tastes and mood.

These types of platforms serve millions of users every single day. Achieving that kind of workflow at scale requires rack upon rack of powerful servers that can deliver the AI inferencing performance needed to keep these platforms running smoothly. U.S. chipmaker Qualcomm is addressing this need by enabling a server rack that meets these high-performance requirements, pairing its Cloud AI 100 solution with the latest AMD EPYC 7003 Series processors and Gigabyte’s latest G292-Z43 server solutions. This combination of hardware raises the bar for inferencing performance in the modern data center.

The Gigabyte G292-Z43 server supports two 3rd generation AMD EPYC 7003 series processors for its processing power, alongside 16 Qualcomm Cloud AI 100 cards for computationally intensive applications supporting inferencing workloads. Qualcomm’s Cloud AI 100 fits perfectly into Gigabyte’s server system and is capable of driving a wide range of AI use cases, from high-speed data analysis and personalized recommendations, to smart cities, 5G communications and more.

To gauge the kind of performance that this setup can deliver, consider that a single Qualcomm Cloud AI 100 card can push 400 TOPS (trillion operations per second) at a pretty slick 75 watts. Since each Gigabyte server can host up to 16 Qualcomm Cloud AI 100 inferencing cards, each server can deliver up to 6.4 PetaOPS (POPS), or 6,400 trillion operations per second. One PetaOPS is one thousand trillion operations per second, so a single server clears that barrier several times over.

And since a single server rack can host 19 or more of these server units, a rack can exceed 100 PetaOPS (19 servers at 6.4 PetaOPS each comes to about 121.6 PetaOPS). That is a lot of AI muscle. To put this into a clearer use case context, a single 400 TOPS HHHL (half-height, half-length) Qualcomm Cloud AI 100 inference card can, for example, drive around 19,000 ResNet-50 images/sec.
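For readers who want to check the math, here is a minimal Python sketch of the arithmetic behind those figures. The constants are the numbers quoted above; the function names are my own, and the server-level ResNet-50 estimate assumes throughput scales linearly with the number of cards, which is an idealization and not a figure from Qualcomm.

```python
# Figures quoted in the article.
TOPS_PER_CARD = 400                      # peak TOPS per Cloud AI 100 card
WATTS_PER_CARD = 75                      # quoted power per card
CARDS_PER_SERVER = 16                    # cards per Gigabyte G292-Z43 server
SERVERS_PER_RACK = 19                    # servers per rack ("19 or more")
RESNET50_IMAGES_PER_SEC_PER_CARD = 19_000

TOPS_PER_POPS = 1_000                    # 1 PetaOPS = 1,000 TOPS


def server_pops(cards=CARDS_PER_SERVER):
    """Peak PetaOPS for one server populated with `cards` cards."""
    return cards * TOPS_PER_CARD / TOPS_PER_POPS


def rack_pops(servers=SERVERS_PER_RACK, cards=CARDS_PER_SERVER):
    """Peak PetaOPS for a rack of fully populated servers."""
    return servers * cards * TOPS_PER_CARD / TOPS_PER_POPS


def card_tops_per_watt():
    """Peak efficiency of a single card, in TOPS per watt."""
    return TOPS_PER_CARD / WATTS_PER_CARD


def server_card_watts(cards=CARDS_PER_SERVER):
    """Power drawn by the accelerator cards alone (CPUs, fans, etc. excluded)."""
    return cards * WATTS_PER_CARD


def server_resnet50_images_per_sec(cards=CARDS_PER_SERVER):
    """ASSUMPTION: linear scaling across cards; real systems may fall short."""
    return cards * RESNET50_IMAGES_PER_SEC_PER_CARD


if __name__ == "__main__":
    print(f"Per server: {server_pops():.1f} PetaOPS")
    print(f"Per rack:   {rack_pops():.1f} PetaOPS")
    print(f"Per card:   {card_tops_per_watt():.2f} TOPS/W")
    print(f"Card power per server: {server_card_watts()} W")
```

Run as written, this gives 6.4 PetaOPS per server and 121.6 PetaOPS per rack, with the 16 accelerator cards drawing 1,200 watts between them. These are peak theoretical numbers; sustained throughput on real workloads will depend on the model, batch size, and the rest of the system.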

AMD Goes After Intel Xeon with 3rd Generation EPYC 7003 CPU

For its part, AMD’s new EPYC 7003 processor improves per-cycle performance over its predecessor by 19% and delivers 2x performance for 8-bit AI inference processing operations. This, AMD claims, means that the EPYC 7003 delivers “the world’s fastest performance per chip and per core,” which I see as a clear shot across Intel’s bow when it comes to cloud, enterprise, and HPC workloads. That is because cloud service providers (CSPs) have typically been able to use AMD processors for pretty much everything except AI inferencing. For that, the solution of choice tended to be Intel’s Xeon processors. Evidently, AMD is looking to give Intel a run for its money on that front starting this year.

Speaking of money, AMD claims that its new EPYC processors will lower total cost of ownership (TCO) by 35% compared to Intel. If true, this could not only speed up adoption of AMD’s EPYC 7003 processors but also help AMD gain market share among CSPs, who remain heavy buyers of CPUs.

Caveat: This is not the first time that AMD has boasted of lower TCOs based on high core-to-server ratios, despite per-core absolute performance disadvantages vs Intel equivalents. Having said that, that per-core performance disadvantage no longer appears to be an issue.

Futurum Research provides industry research and analysis. These columns are for educational purposes only and should not be considered in any way investment advice.

Read more analysis from Futurum Research:

Qualcomm’s Snapdragon Insiders Program Launch Puts Snapdragon Brand Front-And-Center 

AMD Outperforms For Q4 And Year Delivering Strong Growth

Shifting Into High Gear: Exploring Qualcomm’s Automotive Announcements With Nakul Duggal – Futurum Tech Webcast Interview Series

Author Information

Olivier Blanchard has extensive experience managing product innovation, technology adoption, digital integration, and change management for industry leaders in the B2B, B2C, B2G sectors, and the IT channel. His passion is helping decision-makers and their organizations understand the many risks and opportunities of technology-driven disruption, and leverage innovation to build stronger, better, more competitive companies.

