Qualcomm, AMD and Gigabyte Break the PetaOperations Per Second Barrier for AI Inferencing

The News: For the first time, a Qualcomm AI-based solution has broken the PetaOperations (PetaOps) per second barrier for AI inferencing. The record was achieved through a partnership between Qualcomm, AMD, and Gigabyte by pairing AMD’s EPYC 7003 processors and Qualcomm’s Cloud AI 100 accelerators in a Gigabyte G292-Z43 server. Read the full release from Qualcomm here.

Analyst Take: The news that Qualcomm, AMD, and Gigabyte have broken the PetaOperations per second barrier for AI inferencing is big, and with good reason. AI is increasingly the driving force behind the next generation of consumer experiences. In fact, almost all mobile experiences now involve some type of AI, from the way shopping apps deliver customized recommendations based on tens of thousands of AI inferences, to how streaming apps curate categories and titles to best match each individual user’s tastes and mood.

These types of platforms serve millions of users every single day, and achieving that kind of workflow at scale requires rack upon rack of powerful servers capable of the AI inferencing performance that keeps these platforms running smoothly. U.S. chipmaker Qualcomm is addressing this need with a server design that meets these high-performance requirements by pairing its Cloud AI 100 solution with the latest AMD EPYC 7003 Series processors in Gigabyte’s latest G292-Z43 server. This amalgamation of hardware expertise delivers impressive performance and raises the bar for the modern data center.

The Gigabyte G292-Z43 server pairs two 3rd Generation AMD EPYC 7003 Series processors with 16 Qualcomm Cloud AI 100 cards for computationally intensive inferencing workloads. Qualcomm’s Cloud AI 100 fits neatly into Gigabyte’s server system and can drive a wide range of AI use cases, from high-speed data analysis and personalized recommendations to smart cities, 5G communications, and more.

To gauge the kind of performance this setup can deliver, consider that a single Cloud AI 100 card can push 400 TOPS (trillion operations per second) at a slim 75 watts. Since each Gigabyte server can host up to 16 Cloud AI 100 inferencing cards, each server can deliver up to 6.4 PetaOPS (POPS), a PetaOP being one thousand trillion operations per second.
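
That per-server figure follows directly from the per-card numbers. A minimal sanity-check sketch (the card count, TOPS rating, and wattage come from the paragraph above; the script is just illustrative multiplication):

```python
# Illustrative arithmetic only; per-card figures are from the announcement above.
TOPS_PER_CARD = 400       # trillion operations per second per Cloud AI 100 card
WATTS_PER_CARD = 75       # per-card power envelope
CARDS_PER_SERVER = 16     # cards in a fully populated G292-Z43

server_tops = TOPS_PER_CARD * CARDS_PER_SERVER         # 6,400 TOPS
server_pops = server_tops / 1_000                      # = 6.4 PetaOPS
accelerator_watts = WATTS_PER_CARD * CARDS_PER_SERVER  # 1,200 W for the cards alone

print(f"{server_pops} PetaOPS per server at ~{accelerator_watts} W of accelerator power")
```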

And since a single server rack can host 19 or more of these server units, a rack can exceed 100 PetaOPS. That is a lot of AI muscle. To put this into clearer use-case context, a single 400 TOPS half-height, half-length (HHHL) Qualcomm Cloud AI 100 inference card can, for example, drive around 19,000 ResNet-50 images per second.
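
Carrying the same back-of-the-envelope math up to the rack level (the 19-server density and the ~19,000 images/sec figure are cited above; the rack totals are simple multiplication, not measured results):

```python
# Illustrative arithmetic only, extending the per-server figures above.
SERVER_POPS = 6.4                  # PetaOPS per fully populated server
SERVERS_PER_RACK = 19              # minimum rack density cited
RESNET50_IMAGES_PER_CARD = 19_000  # ResNet-50 inferences/sec per 400-TOPS card
CARDS_PER_SERVER = 16

rack_pops = SERVER_POPS * SERVERS_PER_RACK    # ~121.6 PetaOPS
rack_images = RESNET50_IMAGES_PER_CARD * CARDS_PER_SERVER * SERVERS_PER_RACK

print(f"~{rack_pops:.1f} PetaOPS and ~{rack_images:,} ResNet-50 images/sec per rack")
```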

AMD Goes After Intel Xeon with 3rd Generation EPYC 7003 CPU

For its part, AMD’s new EPYC 7003 processor improves per-cycle performance over its predecessor by 19% and delivers 2x the performance for 8-bit (INT8) AI inference operations. This, AMD claims, means the EPYC 7003 delivers “the world’s fastest performance per chip and per core,” which I see as a clear shot across Intel’s bow when it comes to cloud, enterprise, and HPC workloads. Cloud service providers (CSPs) have typically been able to use AMD processors for pretty much everything except AI inferencing; for that, the solution of choice tended to be Intel’s Xeon processors. Evidently, AMD is looking to give Intel a run for its money on that front starting this year.

Speaking of money, AMD claims that its new EPYC processors will lower total cost of ownership (TCO) by 35% compared to Intel. If true, this could not only speed up adoption of AMD’s EPYC 7003 processors but also help AMD gain market share among CSPs, who remain heavy buyers of CPUs.
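
AMD’s 35% figure is a vendor claim, but the mechanics behind such claims are straightforward: higher per-server throughput means fewer servers to buy and power for the same workload. A minimal sketch with entirely hypothetical cost and performance inputs (nothing below comes from AMD’s methodology):

```python
from math import ceil

# Hypothetical inputs for illustration only; none of these numbers are AMD's.
def tco(total_work, perf_per_server, capex, opex_per_year, years=3):
    """Total cost of ownership to sustain a fixed throughput target."""
    servers = ceil(total_work / perf_per_server)
    return servers * (capex + opex_per_year * years)

work = 1_000  # arbitrary workload units
platform_a = tco(work, perf_per_server=15, capex=12_000, opex_per_year=3_000)
platform_b = tco(work, perf_per_server=10, capex=10_000, opex_per_year=3_000)

print(f"A: ${platform_a:,}  B: ${platform_b:,}  ({1 - platform_a / platform_b:.0%} lower)")
```

With these made-up inputs, the denser platform comes out roughly a quarter cheaper; whether a real deployment hits AMD’s 35% will depend on workload mix, utilization, and power costs.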

Caveat: This is not the first time that AMD has boasted of lower TCO based on high core-to-server ratios despite per-core performance disadvantages vs. Intel equivalents. That said, the per-core performance disadvantage no longer appears to be an issue.

Futurum Research provides industry research and analysis. These columns are for educational purposes only and should not be considered in any way investment advice.

Read more analysis from Futurum Research:

Qualcomm’s Snapdragon Insiders Program Launch Puts Snapdragon Brand Front-And-Center 

AMD Outperforms For Q4 And Year Delivering Strong Growth

Shifting Into High Gear: Exploring Qualcomm’s Automotive Announcements With Nakul Duggal – Futurum Tech Webcast Interview Series

Author Information

Olivier Blanchard

Olivier Blanchard is Research Director, Intelligent Devices. He covers edge semiconductors and intelligent AI-capable devices for Futurum. In addition to having co-authored several books about digital transformation and AI with Futurum Group CEO Daniel Newman, Blanchard brings considerable experience demystifying new and emerging technologies, advising clients on how best to future-proof their organizations, and helping maximize the positive impacts of technology disruption while mitigating their potentially negative effects. Follow his extended analysis on X and LinkedIn.
