AWS ML Capacity Blocks

The Six Five team discusses AWS ML Capacity Blocks.

If you are interested in watching the full episode, you can check it out here.

Disclaimer: The Six Five Webcast is for information and entertainment purposes only. Over the course of this webcast, we may talk about companies that are publicly traded and we may even reference that fact and their equity share price, but please do not take anything that we say as a recommendation about what you should do with your investment dollars. We are not investment advisors and we ask that you do not treat us as such.

Transcript:

Patrick Moorhead: Yeah, it’s interesting in all this world of generative AI and all this stuff, it’s like gosh-

Daniel Newman: Are you still ML?

Patrick Moorhead: Are we still doing ML? Yes, we absolutely are doing ML, and quite frankly, for narrower data sets than generative AI, which can run up to 100 petabytes at this point just for training a model, it’s more efficient and less expensive. So I saw that, Dan. I saw that.

Daniel Newman: I bit it.

Patrick Moorhead: Okay. No, this is good. But one of the challenges for, let’s say, smaller businesses and even startups is, how do I come in there and reserve enough GPUs to do what I need to do? It’s not just one GPU. When you’re doing machine learning training, let’s say you’re trying to train a vision model or something like that, you need hundreds of GPUs that are interconnected in a logical way, with a singular memory plane, to be able to do all that work. So what Amazon did is they brought out what’s called Capacity Blocks for ML, for machine learning workloads. What that does is let you schedule capacity like a hotel reservation: “Hey, on January 5th, I need this much GPU capacity, maybe for this long.”

What they do is they reserve that, and these are EC2 UltraClusters, which means they’re of the highest performance, optimized for ML. Then they’re connected through what’s called EFA, the Elastic Fabric Adapter, because we all know that it’s not just about what you can do on that rack, but what you can do across multiple racks. EFA gives you what they’re calling a petabit-scale, non-blocking network, which essentially gets your workload done a lot quicker, again, to have that singular memory plane. So that’s it, baby, ML Capacity Blocks. Watch this space. I see no reason why they also wouldn’t do 1,000-node systems for foundational models, but that’s just a prediction.
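For readers who want to see what that hotel-style reservation looks like in practice, here is a minimal sketch using the EC2 Capacity Blocks API via boto3. The instance type, count, dates, and region below are illustrative assumptions, not details from the show; verify field names and supported instance types against the current AWS documentation.

```python
# A hedged sketch of reserving a GPU Capacity Block with boto3.
# Assumptions: us-east-1, p5.48xlarge instances, a 7-day block.
from datetime import datetime, timedelta, timezone

import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Search for available Capacity Block offerings: e.g. 4 GPU instances
# for 7 days (duration is given in hours), starting within 30 days.
now = datetime.now(timezone.utc)
offerings = ec2.describe_capacity_block_offerings(
    InstanceType="p5.48xlarge",
    InstanceCount=4,
    StartDateRange=now,
    EndDateRange=now + timedelta(days=30),
    CapacityDurationHours=7 * 24,
)

# Pick the cheapest offering (UpfrontFee is returned as a string).
best = min(
    offerings["CapacityBlockOfferings"],
    key=lambda o: float(o["UpfrontFee"]),
)
print(
    f"Reserving {best['InstanceCount']}x {best['InstanceType']} "
    f"from {best['StartDate']} for {best['UpfrontFee']} {best['CurrencyCode']}"
)

# Purchase the block; the reservation ID is what you launch instances into.
purchase = ec2.purchase_capacity_block(
    CapacityBlockOfferingId=best["CapacityBlockOfferingId"],
    InstancePlatform="Linux/UNIX",
)
print("Reservation:", purchase["CapacityReservation"]["CapacityReservationId"])
```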

Daniel Newman: Read my lips, this will get bigger. Pat, why I think this is also important is, I don’t know about you, but I’m really getting tired of having to always say, “I know everyone thinks AWS is behind on generative AI.” You know what I’m saying? I feel like if I don’t start…

Patrick Moorhead: I know, it came up even in our interviews yesterday.

Daniel Newman: It came up, and I’m going cross-eyed. It’s like, look, AWS has more workloads on more compute across more geographies than, I think, pretty much all the hyperscalers combined. Now, again, that’s funny math, so don’t hold me to it. But the reason I’m pointing that out is compute and data and workloads are the impetus of AI. So AWS had a different strategy, and it wasn’t necessarily based on completely locking it up with a single large language model in a closed architecture, which some others did early on and were able to get good out-of-the-gate marketing and narrative leads from. But AWS has taken the position that we’re going to be open. We’re going to be open in our approach, and we’ll offer some FMs, foundational models. We’ll offer Titan. We’ll offer Bedrock. We’ll offer API connectors. You can run multiple models at the same time, all that kind of stuff.

So, Pat, I just chalked this up to AWS being AWS: offering lots of services, looking at how to take all of its capacity, add value, and make it easier for companies to do the things that are going to need to be done. As we know, there are going to be more and more smaller models, mid-sized models, and foundational models, and the large language models are increasingly commoditized. We’re seeing a world where Hugging Face is the GitHub for AI, and that’s where this is going. There’s competition for that, but I think it’s a good use of resources.
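As a hedged illustration of Daniel’s point about Bedrock putting multiple foundation models behind one API, here is a short boto3 sketch that calls two different models through the same InvokeModel operation. The model IDs and request schemas reflect what Bedrock offered around the time of this episode; treat them as assumptions and check the current Bedrock documentation for the models available in your region.

```python
# A minimal sketch of "multiple models, one API" on Amazon Bedrock.
# Assumptions: us-east-1, and access granted to the two model IDs below.
import json

import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

prompt = "Summarize why reserved GPU capacity matters for ML training."

# Amazon's own Titan text model; Titan uses an "inputText" request body.
titan = bedrock.invoke_model(
    modelId="amazon.titan-text-express-v1",
    body=json.dumps({"inputText": prompt}),
)
print(json.loads(titan["body"].read())["results"][0]["outputText"])

# A third-party model via the same InvokeModel call; only the model ID
# and the payload schema change (Claude v2's text-completion format here).
claude = bedrock.invoke_model(
    modelId="anthropic.claude-v2",
    body=json.dumps({
        "prompt": f"\n\nHuman: {prompt}\n\nAssistant:",
        "max_tokens_to_sample": 300,
    }),
)
print(json.loads(claude["body"].read())["completion"])
```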

Author Information

Daniel is the CEO of The Futurum Group. Living his life at the intersection of people and technology, Daniel works with the world’s largest technology brands exploring Digital Transformation and how it is influencing the enterprise.

From the leading edge of AI to global technology policy, Daniel makes the connections between business, people, and tech that are required for companies to benefit most from their technology investments. Daniel is a top-five globally ranked industry analyst, and his ideas are regularly cited or shared on CNBC, Bloomberg, The Wall Street Journal, and hundreds of other outlets around the world.

A 7x best-selling author, most recently of “Human/Machine,” Daniel is also a Forbes and MarketWatch (Dow Jones) contributor.

An MBA and former graduate adjunct faculty member, Daniel is an Austin, Texas transplant after 40 years in Chicago. His speaking takes him around the world each year as he shares his vision of the role technology will play in our future.
