Menu

Google Cloud’s TPU v5e Accelerates the AI Compute War

Google Cloud’s TPU v5e Accelerates the AI Compute War

The News: On August 29, as part of Google Cloud Next ‘23, Google Cloud announced the preview of its next-generation AI chip, the Cloud TPU v5e. TPUs, or Tensor Processing Units, were invented by Google and specifically designed for AI workloads, both for AI training and inference. The first Google TPUs were made available outside of Google use in 2018.

Here are some of the other pertinent details:

  • Google Cloud says TPU v5e delivers up to 2x higher training performance per dollar and 2.5x higher inference performance per dollar for large language models (LLMs) and generative AI models when compared to Google Cloud TPU v4.
  • Google Cloud says TPU v5e will cost less than half of TPU v4.
  • TPU v5e is versatile, supporting eight different virtual machine configurations, from one chip to more than 250 chips. This versatility enables configurations for a wide range of LLM and generative AI model sizes.
  • TPU v5e chips are now powering (as opposed to being in preview) Google Cloud’s Kubernetes service, Cloud TPUs in GKE, and Google Cloud’s managed AI service, Vertex AI.

Read the full post on the introduction of Google Cloud’s TPU v5e chip on the Google Cloud blog.

Google Cloud’s TPU v5e Accelerates the AI Compute War

Analyst Take: Current AI workloads are big and expensive, putting pressure on chipmakers and cloud providers to find cheaper, better faster ways to enable AI. A market ecosystem is emerging to address this, from AI-specific designed chips such as TPUs, language processing units (LPUs), neural processing units (NPUs), and edge-focused silicon to redesigned data centers and the possible resurrection of on-premises compute.

Business leaders understandably worry about cost to understand ROI, profit margins, etc. AI compute costs are completely nebulous right now for two reasons: the tech is still experimental and being refined (so it will scale), and a slew of players want to handle enterprise AI compute. Where this goes and how it ends up is tricky.

Google Cloud is a key player in the entire AI ecosystem stack, particularly in AI compute. The latest Google Cloud TPU will have an impact on AI compute economics. Here’s how:

More Efficient, Cheaper Compute

While the world marvels at what generative AI can do, CIOs and other IT leaders are scrambling to find reasonable ways to run massive AI compute workloads required for AI outputs. Google’s TPUs continue to get faster, use less power, are more affordable than previous iterations and, importantly, compared to NVIDIA’s GPUs. As noted, Google Cloud says TPU v5e delivers up to 2x higher training performance per dollar and 2.5x higher inference performance per dollar for LLMs and generative AI models when compared to Google Cloud TPU v4. In April of this year, Google said that TPU v4 outperformed the TPU v3 by 2.1x and 2.7x better performance by watt. While these are not necessarily apples-to-apples performance comparisons, the point here is Google Cloud’s TPUs are getting better and better. If Google has the inventory and makes TPUs readily available, enterprises running their own AI compute workloads might have a more efficient and economic path to lower AI workload costs. Additionally, some enterprises might be more interested in Google Cloud’s managed AI compute options, Cloud TPUs in GKE, and Vertex AI for the same economic reasons.

Uneasy Partnership with NVIDIA

Ironically in the TPU announcement, co-authors Amin Vahdat, VP/GM ML, Systems, and Cloud AI and Mark Lohmeyer, VP/GM Compute and ML Infrastructure, go on to talk about Google Cloud and NVIDIA’s ongoing partnership – in this case, the new A3 virtual machines:

“Today, we’re thrilled to announce that A3 VMs will be generally available next month. Powered by NVIDIA’s H100 Tensor Core GPUs, which feature the Transformer Engine to address trillion-parameter models, NVIDIA’s H100 GPU, A3 VMs are purpose-built to train and serve especially demanding gen AI workloads and LLMs. Combining NVIDIA GPUs with Google Cloud’s leading infrastructure technologies provides massive scale and performance and is a huge leap forward in supercomputing capabilities, with 3x faster training and 10x greater networking bandwidth compared to the prior generation. A3 is also able to operate at scale, enabling users to scale models to tens of thousands of NVIDIA H100 GPUs.”

It will be interesting to see how this partnership evolves. Already, it appears there are moves by NVIDIA to protect its AI compute dominance with GPUs, as noted in our previous research note on the emergence of CoreWeave. The company is gaining significant investment and customers because it is a cloud provider built specifically for AI, not generalized, workloads, and is proving to be highly efficient with NVIDIA hardware. NVIDIA exclusively provides GPU firepower for CoreWeave.

In the near term, enterprises will likely invest in a range of AI compute options because there is not a proven path. AI compute must become more efficient and cheaper for AI applications to flourish.

Disclosure: The Futurum Group is a research and advisory firm that engages or has engaged in research, analysis, and advisory services with many technology companies, including those mentioned in this article. The author does not hold any equity positions with any company mentioned in this article.

Analysis and opinions expressed herein are specific to the analyst individually and data and other information that might have been provided for validation, not those of The Futurum Group as a whole.

Other insights from The Futurum Group:

Groq Ushers In a New AI Compute Paradigm: The Language Processing Unit

AMD and Hugging Face Team Up to Democratize AI Compute – Shrewd Alliance Could Lead to AI Compute Competition, Lower AI Cost

CoreWeave Secures $2.3 Billion in Debt Financing, Challenges for AI Compute

Author Information

Based in Tampa, Florida, Mark is a veteran market research analyst with 25 years of experience interpreting technology business and holds a Bachelor of Science from the University of Florida.

Related Insights
CIO Take Smartsheet's Intelligent Work Management as a Strategic Execution Platform
December 22, 2025

CIO Take: Smartsheet’s Intelligent Work Management as a Strategic Execution Platform

Dion Hinchcliffe analyzes Smartsheet’s Intelligent Work Management announcements from a CIO lens—what’s real about agentic AI for execution at scale, what’s risky, and what to validate before standardizing....
Will Zoho’s Embedded AI Enterprise Spend and Billing Solutions Drive Growth
December 22, 2025

Will Zoho’s Embedded AI Enterprise Spend and Billing Solutions Drive Growth?

Keith Kirkpatrick, Research Director with Futurum, shares his insights on Zoho’s latest finance-focused releases, Zoho Spend and Zoho Billing Enterprise Edition, further underscoring Zoho’s drive to illustrate its enterprise-focused capabilities....
NVIDIA Bolsters AI/HPC Ecosystem with Nemotron 3 Models and SchedMD Buy
December 16, 2025

NVIDIA Bolsters AI/HPC Ecosystem with Nemotron 3 Models and SchedMD Buy

Nick Patience, AI Platforms Practice Lead at Futurum, shares his insights on NVIDIA's release of its Nemotron 3 family of open-source models and the acquisition of SchedMD, the developer of...
Will a Digital Adoption Platform Become a Must-Have App in 2026?
December 15, 2025

Will a DAP Become the Must-Have Software App in 2026?

Keith Kirkpatrick, Research Director with Futurum, covers WalkMe’s 2025 Analyst Day, and discusses the company’s key pillars for driving success with enterprise software in an AI- and agentic-dominated world heading...
Broadcom Q4 FY 2025 Earnings AI And Software Drive Beat
December 15, 2025

Broadcom Q4 FY 2025 Earnings: AI And Software Drive Beat

Futurum Research analyzes Broadcom’s Q4 FY 2025 results, highlighting accelerating AI semiconductor momentum, Ethernet AI switching backlog, and VMware Cloud Foundation gains, alongside system-level deliveries....
Oracle Q2 FY 2026 Cloud Grows; Capex Rises for AI Buildout
December 12, 2025

Oracle Q2 FY 2026: Cloud Grows; Capex Rises for AI Buildout

Futurum Research analyzes Oracle’s Q2 FY 2026 earnings, highlighting cloud infrastructure momentum, record RPO, rising AI-focused capex, and multicloud database traction driving workload growth across OCI and partner clouds....

Book a Demo

Newsletter Sign-up Form

Get important insights straight to your inbox, receive first looks at eBooks, exclusive event invitations, custom content, and more. We promise not to spam you or sell your name to anyone. You can always unsubscribe at any time.

All fields are required






Thank you, we received your request, a member of our team will be in contact with you.