
AWS ML Capacity Blocks

The Six Five team discusses AWS ML Capacity Blocks.

If you are interested in watching the full episode you can check it out here.

Disclaimer: The Six Five Webcast is for information and entertainment purposes only. Over the course of this webcast, we may talk about companies that are publicly traded and we may even reference that fact and their equity share price, but please do not take anything that we say as a recommendation about what you should do with your investment dollars. We are not investment advisors and we ask that you do not treat us as such.

Transcript:

Patrick Moorhead: Yeah, it’s interesting in all this world of generative AI and all this stuff, it’s like gosh-

Daniel Newman: Are you still ML?

Patrick Moorhead: Are we still doing ML? Yes, we absolutely are doing ML, and quite frankly, for narrower data sets than generative AI, which can run up to 100 petabytes at this point just for model training, it’s more efficient and less expensive. So I saw that, Dan. I saw that.

Daniel Newman: I bit it.

Patrick Moorhead: Okay. No, this is good. But one of the challenges for, let’s say, smaller businesses and even startups is: how do I come in there and reserve enough GPUs to do what I need to do? It’s not just one GPU. When you’re doing machine learning training, let’s say you’re trying to train a vision model or something like that, you need hundreds of GPUs that are interconnected in a logical way, with a singular memory plane, to be able to do all that work on. So what Amazon did is they brought out what’s called Capacity Blocks for ML, for machine learning workloads. What that lets you do is actually schedule, like a hotel, which says, “Hey, on January 5th, I need this much GPU capacity, maybe for this long.”

What they do is they reserve that, and these are EC2 UltraClusters, which means they’re of the highest performance, optimized for ML. Then they’re connected through what’s called EFA, the Elastic Fabric Adapter, because we all know that it’s not just about what you can do on that rack, but what you can do across multiple racks. EFA gives you what they’re calling a petabit-scale, non-blocking network, which essentially gets your workload done a lot quicker, again, so you have that single memory plane. So that’s it, baby, ML Capacity Blocks. Watch this space. I see no reason why they wouldn’t also do 1,000-node systems for foundational models, but that’s just a prediction.
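For readers who want to see what that “hotel reservation” flow looks like in practice, here is a minimal sketch using the AWS SDK for Python (boto3) and the EC2 Capacity Blocks APIs. The instance type, cluster size, start window, and duration below are illustrative assumptions, not a definitive recipe:

```python
# Minimal sketch: finding and reserving an EC2 Capacity Block for ML with boto3.
# The instance type, count, dates, and duration here are illustrative assumptions.
from datetime import datetime, timedelta, timezone

import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

start = datetime.now(timezone.utc) + timedelta(days=7)

# Search for Capacity Block offerings: a small GPU cluster, starting roughly
# a week out, reserved for two days (blocks are sold in 24-hour increments).
offerings = ec2.describe_capacity_block_offerings(
    InstanceType="p5.48xlarge",   # assumed GPU instance type
    InstanceCount=4,              # assumed cluster size
    StartDateRange=start,
    EndDateRange=start + timedelta(days=2),
    CapacityDurationHours=48,
)

# Purchase the first matching offering. The result is a scheduled capacity
# reservation you can launch instances into once it begins, much like
# checking into the hotel room you booked.
offering = offerings["CapacityBlockOfferings"][0]
purchase = ec2.purchase_capacity_block(
    CapacityBlockOfferingId=offering["CapacityBlockOfferingId"],
    InstancePlatform="Linux/UNIX",
)
print(purchase["CapacityReservation"]["CapacityReservationId"])
```

Launching into the block is then a matter of targeting that capacity reservation ID when starting EC2 instances, the same as with any other capacity reservation.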

Daniel Newman: Read my lips, this will get bigger. Pat, why I think this is also important is, I don’t know about you, but I’m really getting tired of having to always say, “I know everyone thinks AWS is behind on generative AI.” You know what I’m saying? I feel like if I don’t start…

Patrick Moorhead: I know it came up even in our interviews yesterday.

Daniel Newman: It came up, and I’m going cross-eyed. It’s like, look, AWS has more workloads on more compute across more geographies than, I think, pretty much all the other hyperscalers combined. Now, again, that’s funny math, so don’t hold me to it. But the reason I’m pointing that out is that compute and data and workloads are the impetus of AI. So AWS had a different strategy, and it wasn’t necessarily based upon completely locking it up with a single large language model in a closed architecture, which some others did early on, and they were able to get good out-of-the-gate marketing and narrative leads. But AWS has taken the position that we’re going to be open source. We’re going to be open in our approach, and we’ll offer some FMs, foundational models. We’ll offer Titan. We’ll offer Bedrock. We’ll offer API connectors. You can run multiple models at the same time, all that kind of stuff.

So Pat, I just chalk this up to AWS being AWS: offering lots of services, looking at how to take all of its capacity, add value, and make it easier for companies to do the things that are going to need to be done. As we know, there are going to be more and more smaller models, mid-sized models, foundational models, and the large language models are increasingly commoditized. We’re seeing a world where Hugging Face is the GitHub for AI, and that’s where this is going. There’s competition for that, but I think it’s a good use of resources.

Author Information

Daniel is the CEO of The Futurum Group. Living his life at the intersection of people and technology, Daniel works with the world’s largest technology brands exploring Digital Transformation and how it is influencing the enterprise.

From the leading edge of AI to global technology policy, Daniel makes the connections between business, people, and tech that are required for companies to benefit most from their technology investments. Daniel is a top-five globally ranked industry analyst, and his ideas are regularly cited or shared in television appearances on CNBC and Bloomberg, in the Wall Street Journal, and by hundreds of other outlets around the world.

A 7x best-selling author, most recently of “Human/Machine,” Daniel is also a Forbes and MarketWatch (Dow Jones) contributor.

An MBA and former graduate adjunct faculty member, Daniel is an Austin, Texas transplant after 40 years in Chicago. His speaking takes him around the world each year as he shares his vision of the role technology will play in our future.
