AWS ML Capacity Blocks
The Six Five team discusses AWS ML Capacity Blocks.

If you are interested in watching the full episode you can check it out here.

Disclaimer: The Six Five Webcast is for information and entertainment purposes only. Over the course of this webcast, we may talk about companies that are publicly traded and we may even reference that fact and their equity share price, but please do not take anything that we say as a recommendation about what you should do with your investment dollars. We are not investment advisors and we ask that you do not treat us as such.

Transcript:

Patrick Moorhead: Yeah, it’s interesting in all this world of generative AI and all this stuff, it’s like gosh-

Daniel Newman: Are you still ML?

Patrick Moorhead: Are we still doing ML? Yes, we absolutely are doing ML and quite frankly, for narrower data sets than generative AI, which can be up to 100 petabytes at this point just for the training model, it’s more efficient and less expensive. So I saw that, Dan. I saw that.

Daniel Newman: I bit it.

Patrick Moorhead: Okay. No, this is good. But one of the challenges for let’s say smaller businesses and even startups is, how do I come in there and reserve enough GPU to do what I need to do? It’s not just one GPU. When you’re doing machine learning training, let’s say you’re trying to train a vision model or something like that, you need hundreds of GPUs that are interconnected in a logical way that have a singular memory plane to be able to do all that work on. So what Amazon did is they brought out what’s called Capacity Blocks for ML for machine learning workloads. What that does is you can actually schedule like a hotel, which says, “Hey, on January 5th, I need this much capacity for my GPUs maybe for this long.”

What they do is they reserve that, and these are EC2 UltraClusters, which means that they're of the highest performance, optimized for ML. Then they're connected through what's called EFA, which is Elastic Fabric Adapter, because we all know that it's not just about what you can do on that rack, but what you can do across multiple racks. EFA gives you what they're calling a petabit-scale, non-blocking network, which essentially is to get your workload done a lot quicker, again, to have that planar memory. So that's it, baby, ML Capacity Blocks. Watch this space. I see no reason why they also wouldn't do 1,000-node systems for foundational models, but that's just a prediction.

Daniel Newman: Read my lips, this will get bigger. Pat, why I think this is also important is, I don’t know about you, but I’m really getting tired of having to always say, “I know everyone thinks AWS is behind on generative AI.” You know what I’m saying? I feel like if I don’t start…

Patrick Moorhead: I know it came up in even our interviews yesterday.

Daniel Newman: It came up, and I'm going cross-eyed. It's like, look, AWS has more workloads on more compute across more geographies than I think pretty much all the hyperscalers combined. Now, again, that's funny math, so don't hold me to it. But the reason I'm pointing that out is compute and data and workloads are the impetus of AI. So AWS had a different strategy, and it wasn't based upon necessarily completely locking it up with a single large language model in a closed architecture, which some others did early on and were able to get good out-of-the-gate marketing and narrative leads from. But AWS has taken the position that it's going to be open in its approach. It will offer some FMs, foundational models. It will offer Titan. It will offer Bedrock. It will offer API connectors. You can run multiple models at the same time, all that kind of stuff.

So Pat, I just chalked this up to AWS being AWS: offering lots of services, looking at how to take all of its capacity, add value, and make it easier for companies to do the things that are going to need to be done. As we know, there are going to be more and more small models, mid-sized models, and foundational models, and the large language models are increasingly commoditized. We're seeing a world where Hugging Face is the GitHub for AI, and that's where this is going. There's competition for that, but I think it's a good use of resources.

Author Information

Daniel is the CEO of The Futurum Group. Living his life at the intersection of people and technology, Daniel works with the world’s largest technology brands exploring Digital Transformation and how it is influencing the enterprise.

From the leading edge of AI to global technology policy, Daniel makes the connections between business, people and tech that are required for companies to benefit most from their technology investments. Daniel is a top 5 globally ranked industry analyst and his ideas are regularly cited or shared in television appearances by CNBC, Bloomberg, Wall Street Journal and hundreds of other sites around the world.

A 7x best-selling author, his most recent book is "Human/Machine." Daniel is also a Forbes and MarketWatch (Dow Jones) contributor.

An MBA and former graduate adjunct faculty member, Daniel is an Austin, Texas transplant after 40 years in Chicago. His speaking takes him around the world each year as he shares his vision of the role technology will play in our future.