Qualcomm, AMD and Gigabyte Break the PetaOperations Per Second Barrier for AI Inferencing

The News: For the first time ever, a Qualcomm AI-based solution has broken the PetaOperations (PetaOps) Per Second Barrier for AI Inferencing. The record was achieved through a partnership between Qualcomm, AMD, and Gigabyte, by pairing AMD’s EPYC 7003 processor and Qualcomm’s Cloud AI 100 solution in a Gigabyte G292-Z43 server. Read the full release from Qualcomm here.

Analyst Take: The news that Qualcomm, AMD, and Gigabyte have broken the PetaOperations per second barrier for AI inferencing is big, and with good reason. AI is increasingly the driving force behind the next generation of consumer experiences. In fact, almost all mobile experiences involve some type of AI intervention, from the way shopping apps deliver customized recommendations based on tens of thousands of AI inferences, to how streaming apps curate categories and titles to match each individual user’s tastes and mood.

These types of platforms serve millions of users every single day. Achieving that kind of workflow at scale requires rack upon rack of powerful servers that can deliver the AI inferencing performance needed to keep these platforms running smoothly. U.S. chipmaker Qualcomm is addressing this need by pairing its Cloud AI 100 solution with the latest AMD EPYC 7003 Series processors in Gigabyte’s latest G292-Z43 server. This combination of hardware expertise offers very high inferencing performance and raises the bar for the modern data center.

The Gigabyte G292-Z43 server supports two 3rd generation AMD EPYC 7003 series processors for its processing power, alongside 16 Qualcomm Cloud AI 100 cards for computationally intensive applications supporting inferencing workloads. Qualcomm’s Cloud AI 100 fits perfectly into Gigabyte’s server system and is capable of driving a wide range of AI use cases, from high-speed data analysis and personalized recommendations, to smart cities, 5G communications and more.

To gauge the kind of performance that this setup can deliver, consider that a single Qualcomm Cloud AI 100 card can push 400 TOPS (trillion operations per second) at a pretty slick 75 watts. Since each Gigabyte server can host up to 16 Cloud AI 100 inferencing cards, each server can deliver up to 6.4 PetaOPS (POPS), where one PetaOPS equals one thousand trillion operations per second.
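The per-server figure follows directly from the per-card numbers quoted above. Here is a minimal sketch of that arithmetic in Python; the constant and function names are illustrative, and the efficiency figure assumes the 75 watts applies to the card alone, excluding the CPUs and the rest of the server.

```python
# Figures quoted in the article; names are illustrative only.
TOPS_PER_CARD = 400      # peak trillion operations per second per card
WATTS_PER_CARD = 75      # quoted card power (card only, an assumption)
CARDS_PER_SERVER = 16    # Cloud AI 100 cards in one G292-Z43 server
TOPS_PER_POPS = 1000     # 1 PetaOPS = 1,000 TOPS


def server_peak_pops(cards: int = CARDS_PER_SERVER,
                     tops_per_card: int = TOPS_PER_CARD) -> float:
    """Peak PetaOPS for one server, assuming throughput adds up linearly."""
    return cards * tops_per_card / TOPS_PER_POPS


def card_tops_per_watt(tops: int = TOPS_PER_CARD,
                       watts: int = WATTS_PER_CARD) -> float:
    """Peak TOPS delivered per watt of card power."""
    return tops / watts


if __name__ == "__main__":
    print(f"Per server: {server_peak_pops():.1f} POPS")
    print(f"Per card:   {card_tops_per_watt():.2f} TOPS/W")
```

These are peak numbers. Sustained throughput on a real model depends on the workload, batch size, and precision, none of which the article specifies.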

And since a single server rack can host 19 or more of these server units, a rack can exceed 100 PetaOPS. That is a lot of AI muscle. To put this into a clearer use case context, a single 400 TOPS HHHL (half-height, half-length) Qualcomm Cloud AI 100 inference card can, for example, drive around 19,000 ResNet-50 images/sec.
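The rack-level claim can be checked the same way. The sketch below is a back-of-the-envelope calculation, not a benchmark: the names are illustrative, and the images/sec projection assumes perfectly linear scaling from one card to many, which real deployments will not fully achieve.

```python
# Figures quoted in the article; names are illustrative only.
TOPS_PER_CARD = 400
CARDS_PER_SERVER = 16
SERVERS_PER_RACK = 19
RESNET50_IMAGES_PER_SEC_PER_CARD = 19_000  # approximate, per the article


def rack_peak_tops(servers: int = SERVERS_PER_RACK) -> int:
    """Peak TOPS for a rack, assuming throughput adds up linearly."""
    return servers * CARDS_PER_SERVER * TOPS_PER_CARD


def servers_to_exceed(target_tops: int) -> int:
    """Smallest number of servers whose peak TOPS strictly exceeds the target."""
    per_server = CARDS_PER_SERVER * TOPS_PER_CARD
    return target_tops // per_server + 1


def rack_resnet50_upper_bound(servers: int = SERVERS_PER_RACK) -> int:
    """Upper bound on ResNet-50 images/sec, assuming ideal linear scaling."""
    return servers * CARDS_PER_SERVER * RESNET50_IMAGES_PER_SEC_PER_CARD


if __name__ == "__main__":
    print(f"Rack peak: {rack_peak_tops() / 1000:.1f} POPS")
    print(f"Servers needed to pass 100 POPS: {servers_to_exceed(100_000)}")
    print(f"ResNet-50 upper bound: {rack_resnet50_upper_bound():,} images/sec")
```

Under these assumptions a 19-server rack peaks at about 121.6 PetaOPS, and 16 servers would already be enough to pass the 100 PetaOPS mark.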

AMD Goes After Intel Xeon with 3rd Generation EPYC 7003 CPU

For its part, AMD’s new EPYC 7003 processor improves per-cycle performance over its predecessor by 19% and delivers 2x performance for 8-bit AI inference processing operations. This, AMD claims, means that the EPYC 7003 delivers “the world’s fastest performance per chip and per core,” which I see as a clear shot across Intel’s bow when it comes to cloud, enterprise, and HPC workloads. That is because cloud service providers have typically been able to use AMD processors for pretty much everything except AI inferencing. For that, the solution of choice tended to be Intel’s Xeon processors. Evidently, AMD is looking to give Intel a run for its money on that front starting this year.

Speaking of money, AMD claims that its new EPYC processors will lower total cost of ownership (TCO) by 35% compared to Intel. If true, this could help not only speed up adoption of AMD’s EPYC 7003 processors but help AMD gain market share among CSPs, who remain heavy buyers of CPUs.

Caveat: This is not the first time that AMD has boasted of lower TCOs based on high core-to-server ratios, despite per-core absolute performance disadvantages vs Intel equivalents. Having said that, that per-core performance disadvantage no longer appears to be an issue.

Futurum Research provides industry research and analysis. These columns are for educational purposes only and should not be considered in any way investment advice.

Author Information

Olivier Blanchard

Olivier Blanchard is Research Director, Intelligent Devices. He covers edge semiconductors and intelligent AI-capable devices for Futurum. In addition to having co-authored several books about digital transformation and AI with Futurum Group CEO Daniel Newman, Blanchard brings considerable experience demystifying new and emerging technologies, advising clients on how best to future-proof their organizations, and helping maximize the positive impacts of technology disruption while mitigating their potentially negative effects. Follow his extended analysis on X and LinkedIn.
