Qualcomm, AMD and Gigabyte Break the PetaOperations Per Second Barrier for AI Inferencing

The News: For the first time ever, a Qualcomm AI-based solution has broken the PetaOperations (PetaOps) Per Second Barrier for AI Inferencing. The record was achieved through a partnership between Qualcomm, AMD, and Gigabyte, by pairing AMD’s EPYC 7003 processor and Qualcomm’s Cloud AI 100 solution in a Gigabyte G292-Z43 server. Read the full release from Qualcomm here.

Qualcomm, AMD, and Gigabyte Break the PetaOperations Per Second Barrier for AI Inferencing

Analyst Take: The news that Qualcomm, AMD, and Gigabyte break the PetaOperations per second barrier for AI inferencing is big, and with good reason. AI is increasingly becoming the driving force behind the next generation of consumer experiences. In fact, almost all mobile experiences involve some type of AI intervention, from the way that shopping apps deliver customized recommendations based on tens of thousands of AI inferences, to how streaming apps curate categories and titles to best match every individual user’s tastes and mood.

These types of platforms serve millions of users every single day and achieving that kind of workflow at scale requires rack upon rack of powerful servers that can deliver the kind of AI inferencing performance that will keep these platforms running smoothly. U.S. chipmaker Qualcomm is addressing this need by enabling a server rack that can meet these high-performance requirements by pairing its Cloud AI 100 solution with the latest AMD EPYC 7003 Series processors and Gigabyte’s latest G292-Z43 server solutions. This amalgamation of hardware expertise offers incredible performance and raises the bar for the modern data center.

The Gigabyte G292-Z43 server supports two 3rd generation AMD EPYC 7003 series processors for its processing power, alongside 16 Qualcomm Cloud AI 100 cards for computationally intensive applications supporting inferencing workloads. Qualcomm’s Cloud AI 100 fits perfectly into Gigabyte’s server system and is capable of driving a wide range of AI use cases, from high-speed data analysis and personalized recommendations, to smart cities, 5G communications and more.

To gauge the kind of performance that this setup can deliver, Qualcomm’s Cloud AI 100 inference accelerator in just one Cloud AI 100 card can push 400 TOPS (trillion operations per second) at a pretty slick 75 watts. Since each Gigabyte server can host up to 16 Qualcomm Cloud AI 100 inferencing cards, each server can deliver up to 6.4 Peta OPS (POPS), which adds up to one thousand trillion operations per second.

And since a single server rack can host 19 or more of these server units, a rack can exceed 100 PetaOPS. That is a lot of AI muscle. To put this into a clearer use case context, a single 400 TOPS HHHL Qualcomm Cloud AI 100 inference card can, for example, drive around 19,000 Resnet50 images/sec.

AMD Goes After Intel Xeon with 3rd Generation EPYC 7003 CPU

For its part, AMD’s new EPYC 7003 processor improves per-cycle performance over its predecessor by 19% and delivers 2x performance for 8-bit AI inference processing operations. This, AMD claims, means that the EPYC 7003 delivers “the world’s fastest performance per chip and per core,” which I see as a clear shot across Intel’s bow when it comes to cloud, enterprise, and HPC workloads. That is because cloud service providers have typically been able to use AMD processors for pretty much everything except AI inferencing. For that, the solution of choice tended to be Intel’s Xeon processors. Evidently, AMD is looking to give Intel a run for its money on that front starting this year.
Speaking of money, AMD claims that its new EPYC processors will lower total cost of ownership (TCO) by 35% compared to Intel. If true, this could help not only speed up adoption of AMD’s EPYC 7003 processors but help AMD gain market share among CSPs, who remain heavy buyers of CPUs.

Caveat: This is not the first time that AMD has boasted of lower TCOs based on high core-to-server ratios, despite per-core absolute performance disadvantages vs Intel equivalents. Having said that, that per-core performance disadvantage no longer appears to be an issue.

Futurum Research provides industry research and analysis. These columns are for educational purposes only and should not be considered in any way investment advice.

Read more analysis from Futurum Research:

Qualcomm’s Snapdragon Insiders Program Launch Puts Snapdragon Brand Front-And-Center 

AMD Outperforms For Q4 And Year Delivering Strong Growth

Shifting Into High Gear: Exploring Qualcomm’s Automotive Announcements With Nakul Duggal – Futurum Tech Webcast Interview Series

Author Information

Olivier Blanchard

Olivier Blanchard is Research Director, Intelligent Devices. He covers edge semiconductors and intelligent AI-capable devices for Futurum. In addition to having co-authored several books about digital transformation and AI with Futurum Group CEO Daniel Newman, Blanchard brings considerable experience demystifying new and emerging technologies, advising clients on how best to future-proof their organizations, and helping maximize the positive impacts of technology disruption while mitigating their potentially negative effects. Follow his extended analysis on X and LinkedIn.

Related Insights
Oracle Makes the Case for AI Inside Everyday Leadership Workflows
July 2, 2026

Oracle Makes the Case for AI Inside Everyday Leadership Workflows

Keith Kirkpatrick, Research Director at The Futurum Group, examines how Oracle Manager Edge embeds AI-powered coaching into Oracle Cloud HCM, bringing real-time guidance into managers' daily workflows and strengthening Oracle's...
Domino Data Lab From MLOps Platform to Governed AI Application Factory
July 2, 2026

Domino Data Lab: From MLOps Platform to Governed AI Application Factory

Nick Patience, VP and Practice Lead, AI Platforms at Futurum, examines Domino Data Lab's pivot to governed AI application delivery, its agentic AI governance framework, and what the strategy means...
Siemens and IFS Announce Alliance to Advance Industrial AI
July 2, 2026

Siemens and IFS Announce Alliance to Advance Industrial AI

Siemens and IFS have partnered to advance Industrial AI solutions, merging Siemens' industrial automation depth with IFS's AI-embedded ERP platform. The alliance targets asset-intensive industries as enterprise software demand accelerates....
Lakebase and LTAP Challenge Database Orthodoxy, Are Monoliths Finally Obsolete?
July 2, 2026

Lakebase and LTAP Challenge Database Orthodoxy, Are Monoliths Finally Obsolete?

Databricks revolutionizes analytical platforms through Lakebase and LTAP, unifying transactional and analytical workloads. Research shows 73.6% of organizations are increasing spend, signaling a major shift from legacy databases....
Shopify’s PyTorch Foundation Move Signals a Power Shift in Open Source AI for Commerce
July 2, 2026

Shopify’s PyTorch Foundation Move Signals a Power Shift in Open Source AI for Commerce

Shopify's Platinum membership in the PyTorch Foundation signals a shift toward community-governed AI frameworks, avoiding vendor lock-in as enterprises increasingly deploy generative AI in production....
How Anthropic and OpenAI Are Building Everywhere Ecosystems
July 1, 2026

How Anthropic and OpenAI Are Building “Everywhere Ecosystems”

Alex Smith, VP & Practice Lead, Ecosystems, Channels & Marketplaces at Futurum, shares insights on how Anthropic and OpenAI are building 'Everywhere Ecosystems' and the multidimensional go-to-market strategies designed to...

Book a Demo

Welcome

The vision behind everything in Futurum’s Custom Research practice is this: research should show you what is happening, what comes next, and what to do about it. It should be personal to each audience, easy for people to grasp, and structured so LLMs can reason over it accurately. And it should be fast and turnkey; you want answers now, not another project to carry for quarters.

Whether you are defining business, channel, or go-to-market strategy; evaluating vendors or justifying ROI; or commissioning research to fill an emerging market need, we have your back, with a program that answers your questions with the objectivity and credibility to drive real decisions.

To do it, we bring unmatched data to bear: Futurum research, surveys, and market projections; validated market feeds; ETR’s 15 years of insight from 10,000 technology decision-makers; G2’s buyer and user data; and what our analysts hear every day. Add leading primary collection, from AI-moderated voice interviews to surveys and analyst-led interviews, all turnkey, and every project comes out credible, nuanced, and actionable.

And we don’t just drop the results in your lap. For internal work, we provide analyst-led sessions, interactive dashboards, and a range of formats. For market-facing work, Futurum delivers turnkey activation and amplification that actually gets seen, by people and by LLMs, through our media and share of voice. This is research that moves decisions and markets.

We will meet you wherever you are, from a fast-turn brief to a multi-year program, and shape the work to your goals, timeline, and budget. The right program for your moment.

If any of this is useful, I would love to talk.

Benjamin Brown, VP Custom Research, Futurum Research

Benjamin Brown

VP, Custom Research · The Futurum Group

Newsletter Sign-up Form

Get important insights straight to your inbox, receive first looks at eBooks, exclusive event invitations, custom content, and more. We promise not to spam you or sell your name to anyone. You can always unsubscribe at any time.

All fields are required






Thank you, we received your request, a member of our team will be in contact with you.