Google Cloud’s TPU v5e Accelerates the AI Compute War

The News: On August 29, as part of Google Cloud Next ‘23, Google Cloud announced the preview of its next-generation AI chip, the Cloud TPU v5e. TPUs, or Tensor Processing Units, were invented by Google and specifically designed for AI workloads, both for AI training and inference. The first Google TPUs were made available outside of Google use in 2018.

Here are some of the other pertinent details:

  • Google Cloud says TPU v5e delivers up to 2x higher training performance per dollar and 2.5x higher inference performance per dollar for large language models (LLMs) and generative AI models when compared to Google Cloud TPU v4.
  • Google Cloud says TPU v5e will cost less than half of TPU v4.
  • TPU v5e is versatile, supporting eight different virtual machine configurations, from one chip to more than 250 chips. This versatility enables configurations for a wide range of LLM and generative AI model sizes.
  • TPU v5e chips are now powering (as opposed to being in preview) Google Cloud’s Kubernetes service, Cloud TPUs in GKE, and Google Cloud’s managed AI service, Vertex AI.

Read the full post on the introduction of Google Cloud’s TPU v5e chip on the Google Cloud blog.

Analyst Take: Current AI workloads are big and expensive, putting pressure on chipmakers and cloud providers to find cheaper, better, faster ways to enable AI. A market ecosystem is emerging to address this, from chips designed specifically for AI, such as TPUs, language processing units (LPUs), neural processing units (NPUs), and edge-focused silicon, to redesigned data centers and the possible resurrection of on-premises compute.

Business leaders understandably focus on cost because it drives ROI, profit margins, and similar measures. AI compute costs are hard to pin down right now for two reasons: the technology is still experimental and being refined so that it can scale, and a slew of players want to handle enterprise AI compute. Where this goes and how it ends up is hard to predict.

Google Cloud is a key player in the entire AI ecosystem stack, particularly in AI compute. The latest Google Cloud TPU will have an impact on AI compute economics. Here’s how:

More Efficient, Cheaper Compute

While the world marvels at what generative AI can do, CIOs and other IT leaders are scrambling to find reasonable ways to run the massive compute workloads that AI outputs require. Google’s TPUs continue to get faster and more power-efficient, and they are more affordable than both previous iterations and, importantly, NVIDIA’s GPUs. As noted, Google Cloud says TPU v5e delivers up to 2x higher training performance per dollar and 2.5x higher inference performance per dollar for LLMs and generative AI models when compared to Google Cloud TPU v4. In April of this year, Google said that TPU v4 outperformed TPU v3 by 2.1x on performance and by 2.7x on performance per watt. While these are not necessarily apples-to-apples performance comparisons, the point here is that Google Cloud’s TPUs are getting better and better. If Google has the inventory and makes TPUs readily available, enterprises running their own AI compute workloads might have a more efficient and economical path to lower AI workload costs. Additionally, some enterprises might be more interested in Google Cloud’s managed AI compute options, Cloud TPUs in GKE and Vertex AI, for the same economic reasons.
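To make the performance-per-dollar claims concrete, here is a minimal arithmetic sketch in Python. It assumes that performance per dollar means work completed per dollar spent, so a gain of N times implies the same workload costs 1/N as much. The $100,000 baseline spend and the function name are hypothetical, chosen only for illustration. Google Cloud's figures are stated as "up to," so the results are best-case estimates, not pricing.

```python
def cost_for_same_work(baseline_cost: float, perf_per_dollar_gain: float) -> float:
    """Estimate the cost of running the same workload on hardware that
    delivers `perf_per_dollar_gain` times the performance per dollar."""
    if perf_per_dollar_gain <= 0:
        raise ValueError("perf_per_dollar_gain must be positive")
    return baseline_cost / perf_per_dollar_gain


# Hypothetical spend on TPU v4 for a fixed workload (illustrative only).
baseline_v4_cost = 100_000.0

# Best-case gains claimed by Google Cloud for TPU v5e versus TPU v4.
training_cost = cost_for_same_work(baseline_v4_cost, 2.0)
inference_cost = cost_for_same_work(baseline_v4_cost, 2.5)

print(f"Training:  ${training_cost:,.0f} (best case)")
print(f"Inference: ${inference_cost:,.0f} (best case)")
```

Under these assumptions, a training job that cost $100,000 on TPU v4 would cost about $50,000 on TPU v5e at best, and an equivalent inference workload about $40,000. Actual savings depend on the model, the configuration, and how close a workload comes to the claimed upper bounds.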

Uneasy Partnership with NVIDIA

Ironically, in the TPU announcement, co-authors Amin Vahdat, VP/GM ML, Systems, and Cloud AI, and Mark Lohmeyer, VP/GM Compute and ML Infrastructure, go on to talk about Google Cloud and NVIDIA’s ongoing partnership, in this case the new A3 virtual machines:

“Today, we’re thrilled to announce that A3 VMs will be generally available next month. Powered by NVIDIA’s H100 Tensor Core GPUs, which feature the Transformer Engine to address trillion-parameter models, NVIDIA’s H100 GPU, A3 VMs are purpose-built to train and serve especially demanding gen AI workloads and LLMs. Combining NVIDIA GPUs with Google Cloud’s leading infrastructure technologies provides massive scale and performance and is a huge leap forward in supercomputing capabilities, with 3x faster training and 10x greater networking bandwidth compared to the prior generation. A3 is also able to operate at scale, enabling users to scale models to tens of thousands of NVIDIA H100 GPUs.”

It will be interesting to see how this partnership evolves. Already, it appears there are moves by NVIDIA to protect its AI compute dominance with GPUs, as noted in our previous research note on the emergence of CoreWeave. The company is gaining significant investment and customers because it is a cloud provider built specifically for AI, not generalized, workloads, and is proving to be highly efficient with NVIDIA hardware. NVIDIA exclusively provides GPU firepower for CoreWeave.

In the near term, enterprises will likely invest in a range of AI compute options because there is no proven path yet. AI compute must become more efficient and cheaper for AI applications to flourish.

Disclosure: The Futurum Group is a research and advisory firm that engages or has engaged in research, analysis, and advisory services with many technology companies, including those mentioned in this article. The author does not hold any equity positions with any company mentioned in this article.

Analysis and opinions expressed herein are specific to the analyst individually and data and other information that might have been provided for validation, not those of The Futurum Group as a whole.

Other insights from The Futurum Group:

Groq Ushers In a New AI Compute Paradigm: The Language Processing Unit

AMD and Hugging Face Team Up to Democratize AI Compute – Shrewd Alliance Could Lead to AI Compute Competition, Lower AI Cost

CoreWeave Secures $2.3 Billion in Debt Financing, Challenges for AI Compute

Author Information

Based in Tampa, Florida, Mark is a veteran market research analyst with 25 years of experience interpreting technology business and holds a Bachelor of Science from the University of Florida.
