Qualcomm, AMD and Gigabyte Break the PetaOperations Per Second Barrier for AI Inferencing

The News: For the first time, a Qualcomm AI-based solution has broken the PetaOperations per second (PetaOPS) barrier for AI inferencing. The record was achieved through a partnership between Qualcomm, AMD, and Gigabyte, by pairing AMD’s EPYC 7003 processor and Qualcomm’s Cloud AI 100 solution in a Gigabyte G292-Z43 server. Read the full release from Qualcomm here.

Analyst Take: The news that Qualcomm, AMD, and Gigabyte have broken the PetaOperations per second barrier for AI inferencing is big, and with good reason. AI is increasingly the driving force behind the next generation of consumer experiences. In fact, almost all mobile experiences involve some type of AI intervention, from the way that shopping apps deliver customized recommendations based on tens of thousands of AI inferences, to how streaming apps curate categories and titles to best match every individual user’s tastes and mood.

These types of platforms serve millions of users every single day, and running that kind of workload at scale requires rack upon rack of powerful servers that can deliver enough AI inferencing performance to keep the platforms running smoothly. U.S. chipmaker Qualcomm is addressing this need by enabling a server rack that meets these high-performance requirements, pairing its Cloud AI 100 solution with the latest AMD EPYC 7003 Series processors and Gigabyte’s latest G292-Z43 server solutions. This combination of hardware expertise offers incredible performance and raises the bar for the modern data center.

The Gigabyte G292-Z43 server supports two 3rd generation AMD EPYC 7003 series processors for its processing power, alongside 16 Qualcomm Cloud AI 100 cards for computationally intensive applications supporting inferencing workloads. Qualcomm’s Cloud AI 100 fits perfectly into Gigabyte’s server system and is capable of driving a wide range of AI use cases, from high-speed data analysis and personalized recommendations, to smart cities, 5G communications and more.

To gauge the kind of performance that this setup can deliver, a single Qualcomm Cloud AI 100 card can push 400 TOPS (trillion operations per second) at a pretty slick 75 watts. Since each Gigabyte server can host up to 16 Qualcomm Cloud AI 100 inferencing cards, each server can deliver up to 6.4 PetaOPS (POPS), where one PetaOPS equals one thousand trillion operations per second.
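The per-server figure is simple multiplication, and it can be sketched in a few lines of Python. This is a minimal sketch that uses only the numbers quoted above; the constant and function names are illustrative (not from Qualcomm or Gigabyte), and it assumes peak throughput adds up perfectly across cards.

```python
# Figures quoted in the article; all names here are illustrative only.
TOPS_PER_CARD = 400       # trillion operations per second per Cloud AI 100 card
WATTS_PER_CARD = 75       # card power quoted alongside the 400 TOPS figure
CARDS_PER_SERVER = 16     # cards hosted by one Gigabyte G292-Z43 server
TOPS_PER_PETAOPS = 1_000  # 1 PetaOPS = 1,000 TOPS = 10**15 operations per second


def server_petaops(cards: int = CARDS_PER_SERVER,
                   tops_per_card: float = TOPS_PER_CARD) -> float:
    """Peak aggregate throughput of one server in PetaOPS.

    Assumption: peak figures scale linearly with the number of cards.
    """
    return cards * tops_per_card / TOPS_PER_PETAOPS


def tops_per_watt(tops: float = TOPS_PER_CARD,
                  watts: float = WATTS_PER_CARD) -> float:
    """Peak efficiency of a single card in TOPS per watt."""
    return tops / watts


if __name__ == "__main__":
    print(f"Per server: {server_petaops():.1f} PetaOPS")
    print(f"Per card:   {tops_per_watt():.2f} TOPS per watt")
```

These are peak, vendor-quoted numbers; sustained throughput on a real workload would depend on the model, batch size, and host configuration.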

And since a single server rack can host 19 or more of these server units, a rack can exceed 100 PetaOPS (19 servers at 6.4 PetaOPS each comes to roughly 121 PetaOPS). That is a lot of AI muscle. To put this into a concrete use case, a single 400 TOPS HHHL Qualcomm Cloud AI 100 inference card can, for example, drive around 19,000 ResNet-50 images/sec.
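The rack-level claim can be checked the same way. The sketch below is illustrative only: it takes the 6.4 PetaOPS per server, 19 servers per rack, and 19,000 images/sec per card quoted above, and assumes ideal linear scaling, which real deployments will not fully reach. The function names are hypothetical.

```python
import math

# Figures quoted in the article; names are illustrative only.
PETAOPS_PER_SERVER = 6.4          # 16 cards x 400 TOPS
SERVERS_PER_RACK = 19             # "19 or more" servers per rack
IMAGES_PER_SEC_PER_CARD = 19_000  # approximate ResNet-50 rate for one card
CARDS_PER_SERVER = 16


def rack_petaops(servers: int = SERVERS_PER_RACK) -> float:
    """Peak rack throughput in PetaOPS, assuming linear scaling."""
    return servers * PETAOPS_PER_SERVER


def servers_needed(target_petaops: float) -> int:
    """Smallest number of servers whose combined peak reaches the target."""
    return math.ceil(target_petaops / PETAOPS_PER_SERVER)


def ideal_images_per_sec(cards: int) -> int:
    """Upper-bound ResNet-50 rate if every card matched the quoted figure."""
    return cards * IMAGES_PER_SEC_PER_CARD


if __name__ == "__main__":
    print(f"Rack peak: {rack_petaops():.1f} PetaOPS")
    print(f"Servers needed for 100 PetaOPS: {servers_needed(100)}")
    per_server = ideal_images_per_sec(CARDS_PER_SERVER)
    print(f"Ideal ResNet-50 rate per server: {per_server:,} images/sec")
```

The image-rate figure is an upper bound under the linear-scaling assumption, not a measured server result.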

AMD Goes After Intel Xeon with 3rd Generation EPYC 7003 CPU

For its part, AMD’s new EPYC 7003 processor improves per-cycle performance over its predecessor by 19% and delivers 2x performance for 8-bit AI inference processing operations. This, AMD claims, means that the EPYC 7003 delivers “the world’s fastest performance per chip and per core,” which I see as a clear shot across Intel’s bow when it comes to cloud, enterprise, and HPC workloads. That is because cloud service providers have typically been able to use AMD processors for pretty much everything except AI inferencing. For that, the solution of choice tended to be Intel’s Xeon processors. Evidently, AMD is looking to give Intel a run for its money on that front starting this year.

Speaking of money, AMD claims that its new EPYC processors will lower total cost of ownership (TCO) by 35% compared to Intel. If true, this could help not only speed up adoption of AMD’s EPYC 7003 processors but help AMD gain market share among CSPs, who remain heavy buyers of CPUs.

Caveat: This is not the first time that AMD has boasted of lower TCO based on high core-to-server ratios, despite absolute per-core performance disadvantages versus Intel equivalents. That said, the per-core performance disadvantage no longer appears to be an issue.

Futurum Research provides industry research and analysis. These columns are for educational purposes only and should not be considered in any way investment advice.

Read more analysis from Futurum Research:

Qualcomm’s Snapdragon Insiders Program Launch Puts Snapdragon Brand Front-And-Center 

AMD Outperforms For Q4 And Year Delivering Strong Growth

Shifting Into High Gear: Exploring Qualcomm’s Automotive Announcements With Nakul Duggal – Futurum Tech Webcast Interview Series

Author Information

Olivier Blanchard

Olivier Blanchard is Research Director, Intelligent Devices. He covers edge semiconductors and intelligent AI-capable devices for Futurum. In addition to having co-authored several books about digital transformation and AI with Futurum Group CEO Daniel Newman, Blanchard brings considerable experience demystifying new and emerging technologies, advising clients on how best to future-proof their organizations, and helping maximize the positive impacts of technology disruption while mitigating their potentially negative effects. Follow his extended analysis on X and LinkedIn.
