AWS ML Capacity Blocks
The Six Five team discusses AWS ML Capacity Blocks.

If you are interested in watching the full episode you can check it out here.

Disclaimer: The Six Five Webcast is for information and entertainment purposes only. Over the course of this webcast, we may talk about companies that are publicly traded and we may even reference that fact and their equity share price, but please do not take anything that we say as a recommendation about what you should do with your investment dollars. We are not investment advisors and we ask that you do not treat us as such.

Transcript:

Patrick Moorhead: Yeah, it’s interesting in all this world of generative AI and all this stuff, it’s like gosh-

Daniel Newman: Are you still ML?

Patrick Moorhead: Are we still doing ML? Yes, we absolutely are doing ML and quite frankly, for narrower data sets than generative AI, which can be up to 100 petabytes at this point just for the training model, it’s more efficient and less expensive. So I saw that, Dan. I saw that.

Daniel Newman: I bit it.

Patrick Moorhead: Okay. No, this is good. But one of the challenges for let’s say smaller businesses and even startups is, how do I come in there and reserve enough GPU to do what I need to do? It’s not just one GPU. When you’re doing machine learning training, let’s say you’re trying to train a vision model or something like that, you need hundreds of GPUs that are interconnected in a logical way that have a singular memory plane to be able to do all that work on. So what Amazon did is they brought out what’s called Capacity Blocks for ML for machine learning workloads. What that does is you can actually schedule like a hotel, which says, “Hey, on January 5th, I need this much capacity for my GPUs maybe for this long.”

What they do is they reserve that, and these are EC2 UltraClusters, which means that they're of the highest performance, optimized for ML. Then they're connected through what's called EFA, which is Elastic Fabric Adapter, because we all know that it's not just about what you can do on that rack, but what you can do across multiple racks. EFA gives you what they're calling a petabit-scale, non-blocking network, which essentially is to get your workload done a lot quicker, again, to have that planar memory. So that's it, baby, ML Capacity Blocks. Watch this space. I see no reason why they also wouldn't do 1,000-node systems for foundational models, but that's just a prediction.

Daniel Newman: Read my lips, this will get bigger. Pat, why I think this is also important is, I don’t know about you, but I’m really getting tired of having to always say, “I know everyone thinks AWS is behind on generative AI.” You know what I’m saying? I feel like if I don’t start…

Patrick Moorhead: I know it came up in even our interviews yesterday.

Daniel Newman: It came up, and I'm going cross-eyed. It's like, look, AWS has more workloads on more compute across more geographies than I think pretty much all the hyperscalers combined. Now, again, that's funny math, so don't hold me to it. But the reason I'm pointing that out is compute and data and workloads are the impetus of AI. So AWS had a different strategy, and it wasn't based upon necessarily completely locking it up with a single large language model in a closed architecture, which some others did early on and were able to get good out-of-the-gate marketing and narrative leads from. But AWS has taken the position that it's going to be open in its approach. It will offer some FMs, foundational models. It will offer Titan. It will offer Bedrock. It will offer API connectors. You can run multiple models at the same time, all that kind of stuff.

So Pat, I just chalked this up to AWS being AWS: offering lots of services, looking at how to take all of its capacity, add value, and make it easier for companies to do the things that are going to need to be done. As we know, there are going to be more and more small models, mid-sized models, and foundational models, and the large language models are increasingly commoditized. We're seeing a world where Hugging Face is the GitHub for AI, and that's where this is going. There's competition for that, but I think it's a good use of resources.

Author Information

Daniel is the CEO of The Futurum Group. Living his life at the intersection of people and technology, Daniel works with the world’s largest technology brands exploring Digital Transformation and how it is influencing the enterprise.

From the leading edge of AI to global technology policy, Daniel makes the connections between business, people and tech that are required for companies to benefit most from their technology investments. Daniel is a top 5 globally ranked industry analyst and his ideas are regularly cited or shared in television appearances by CNBC, Bloomberg, Wall Street Journal and hundreds of other sites around the world.

A 7x best-selling author, his most recent book is "Human/Machine." Daniel is also a Forbes and MarketWatch (Dow Jones) contributor.

An MBA and former graduate adjunct faculty member, Daniel is an Austin, Texas transplant after 40 years in Chicago. His speaking takes him around the world each year as he shares his vision of the role technology will play in our future.