AMD and Solidigm Power the Open AI Ecosystem – Six Five On the Road

On this episode of Six Five On The Road, hosts Keith Townsend and Dave Nicholson are joined by AMD’s Robert Hormuth and Mark Orthodoxou for a conversation on how AMD and Solidigm are collaborating to power the open AI ecosystem.

Their discussion covers:

  • Innovations in data center solutions and Instinct Data Center GPUs by AMD.
  • The evolving landscape of AI technologies and their applications.
  • The role of AMD and Solidigm in fostering an open AI ecosystem.
  • Challenges and opportunities in the current AI market.

Learn more at AMD. Also, discover more about this collaboration at Solidigm.

Watch the video below, and be sure to subscribe to our YouTube channel, so you never miss an episode.

Or listen to the audio here:

Disclaimer: The Six Five Webcast is for information and entertainment purposes only. Over the course of this webcast, we may talk about companies that are publicly traded and we may even reference that fact and their equity share price, but please do not take anything that we say as a recommendation about what you should do with your investment dollars. We are not investment advisors and we ask that you do not treat us as such.

Transcript:

Keith Townsend: Welcome back to Six Five On the Road with my co-host, Dave Nicholson. Dave, it’s been a bunch of great conversations today around the Solidigm ecosystem. And no better guest than AMD, Robert, Mark. Great to meet you folks. And Robert, we are long time friends.

Robert Hormuth: Yes.

Keith Townsend: We go back to the Tech Field Day community, you brought me some pretty cool graphics cards. No, no, no, no. I’m going to call them AI chips. You brought me some AI-

Robert Hormuth: I even took you to dinner last time I was in Vegas.

Keith Townsend: You even took me to dinner last time we were in Vegas, but I’d be much more appreciative if you gave me a Genoma chip. That would be like-

Robert Hormuth: We make a Genoa.

Keith Townsend: Genoa. I’m sorry. Genoa chip.

Robert Hormuth: Or Genoma. You know, I happen to have one.

Keith Townsend: Oh!

Robert Hormuth: It’s not for sale, but this is our EPYC Genoa, 96 cores.

Keith Townsend: 96 core.

Robert Hormuth: I’ll let you, you can touch and feel it, but I’m going to keep an eye on you.

Dave Nicholson: Do you make a stainless steel version that somebody like me could afford?

Keith Townsend: I’m going to have Dave count the cores.

Robert Hormuth: So that one uses our advanced 2D packaging technology, with an I/O die in the center, and all the outside dies are the cache and core complexes.

Dave Nicholson: So walk us through the nomenclature, if you don’t mind. EPYC.

Robert Hormuth: EPYC is our CPU brand.

Dave Nicholson: Yeah.

Robert Hormuth: Genoa was the code name, not the branding. I think it’s the 9004 series that’s Genoa. The actual product name is the EPYC 9004 Family.

Dave Nicholson: And this is rolled out, this is like your latest iteration?

Robert Hormuth: That one’s close to-

Dave Nicholson: It’s been in the market for a while.

Robert Hormuth: It’s about a year and a half to market. We followed that with another version called Bergamo, which was our dense core. So we took the same kind of core, and this is our high-performance core, and did a dense core where we’re able to squeeze it down tighter and make some physical design trade-offs on cache size and such. But it’s ISA compatible, instruction set compatible. And we pack 128 of those onto this same package, so it drops into the same platform, and that one’s optimized for cloud native. This is more your general-purpose compute versus the cloud native.

Dave Nicholson: Got it.

Keith Townsend: So give us some of the hero numbers. How much RAM? What can I do with 96 cores?

Robert Hormuth: You can do a lot. Let’s just put it in terms of some numbers: SPECint, SPECpower, any kind of database benchmark. Between Genoa and Bergamo, we hold the number one spot on 300-plus standardized benchmarks that the world uses. So those are your two premium CPUs on the market today. We have about a 1.6 to 1.7X performance advantage over the competition and about a 2X performance-per-watt advantage, which comes into play mightily in the era of AI. Performance per watt, energy efficiency, is key. And I have another toy.

Dave Nicholson: What is this?

Robert Hormuth: Maybe I’ll let Mark talk about this one. This is actually the MI300X. So this is the bare 300X without the heat sink; the heat sink’s kind of tall. But this is our flagship GPU that we launched in December, which has been shipping and is starting to be available in cloud instances today. So I’ll let you look, and maybe Mark, you want to talk about the 300X?

Keith Townsend: It was pretty smart that you gave it to me first and then you have him talk about it, so he needs to hold it. I saw what you did. I saw, I understand the strategy there.

Mark Orthodoxou: Yeah, this is a pretty powerful piece of technology you’re holding in your hand there. And most people haven’t seen it without the heat sink on it, so it’s kind of a rare treat. This is a little different than Genoa in that it uses what we call 3.5D packaging technology. So it has 2D dies side by side on an interposer, but also does 3D stacking to get all that technology onto a single package like that. So there are 12 chiplets in there and eight stacks of HBM3 memory. It has 304 compute units, independent discrete compute instances. It has 192 gigabytes of HBM3 and delivers a ton of memory bandwidth to that HBM3. It has almost a terabyte per second of bandwidth coming on and off that GPU to connect to other GPUs. It’s a pretty remarkable piece of technology.

Keith Townsend: How would OEMs package something like this? What would be the end product?

Mark Orthodoxou: Yeah, great question. I mean, what you’re holding there doesn’t actually sell as a unit. Even with the heat sink, we sell eight of them in a standards-compliant form factor we call a UBB 8. It’s OCP standards compliant-

Dave Nicholson: At Costco?

Mark Orthodoxou: At Costco, yeah. Go grab it off Amazon.

Dave Nicholson: You can’t buy one? Got to buy eight.

Mark Orthodoxou: Grab it off Amazon if you can’t get to Costco.

Keith Townsend: Got to buy them in bulk.

Mark Orthodoxou: Yeah, so it ships eight instances at a time, all networked together, mesh-connected with what we call Infinity Fabric, which is a load-store coherent, low-latency, high-bandwidth fabric that allows all the GPUs to communicate. And it slides into existing infrastructure that folks are familiar with from OCP-compliant servers. So one of the focuses we had with MI300X was to make sure that it was frictionless to adopt from the hardware standpoint, as I mentioned, but also from the software standpoint.

Keith Townsend: Well, I was just about to ask you about the software standpoint. I think there’s a lot of uncertainty around software. When we hear about the dominant player in the space, one of the moats has been, or at least the perception of the moat has been, software. So help us think through, kind of, take us from the model that we might pull down from Hugging Face, and how do we get it onto a 300?

Mark Orthodoxou: Well, to the perception point, as I mentioned, the goal is frictionless portability of the software. And I think there is a bit of a perception issue here, because this is not a first-generation GPU from AMD. We’ve been doing this for five-plus years, and we’ve been working on ROCm, which is the foundational software that drives Instinct, for even longer than that. The strategy has always been functional compliance all the way down to the library level, if you’re doing very manual tuning on this GPU. But it has also been ease of adoption by ensuring that we have support for AI frameworks, which is commonly how AI programmers actually program today. They don’t program in CUDA.

CUDA is often used as a reference when we’re talking about the competition’s overall software ecosystem. The reality is, if you talk to average enterprises and CSPs, they’re mostly programming at an abstraction level that is entirely portable over to AMD. Literally no code changes. So frameworks like PyTorch, like TensorFlow, like JAX, 100% portable with no code changes. And so if you’re downloading a model off Hugging Face, it just works. In fact, as of today, I believe there are greater than 600,000 models on Hugging Face that are fully functionally portable as-is over to Instinct.

Keith Townsend: So let’s again talk about availability, getting access to this stuff. You made a recent announcement with Microsoft. How do people who just need to dip in and out of whether it’s training or some really large inference job, how do they get access to this without needing to order and have one of these come on site?

Mark Orthodoxou: Yeah, that’s a great question. There’s a number of mechanisms. Obviously as you pointed out, Microsoft just announced that their Azure instances are now available on MI300X. Obviously there’s CSP partners that are selling these capabilities for remote access and in that manner. We also, within AMD, have an AMD accelerator cloud. So we do allow for remote POC testing on our own infrastructure. That’s based on a prioritization that we help influence based on our strategic customer set that’s asking for access. But we also have a number of OEM partners that are launching products. Many of them have been announced already, and they also have their own remote access capabilities in many instances. So you don’t necessarily need to buy one. You can contact any number of different partners, including some of the major CSPs, and you can gain access in that way.

Robert Hormuth: Yeah. I mean, Dell has a solution center where you can remote in and do POC work, historically. One of the things that… The excitement around this is profound because it’s such a disruption in the market, bringing choice and competition. But a lot of people ask, “Well, what’s the intrinsic differentiator?” And I try to sum it up really simply: we went after really disrupting the amount of memory capacity and memory bandwidth. This has 192 gigabytes of memory. The competitor’s part that’s currently volume shipping has 80. So if you’re doing something that requires all eight GPUs and all the parameters across eight times 80 gigabytes, you can do it in three and a half of these.
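
Robert’s capacity math here can be checked in a few lines. The figures come straight from the conversation (192 GB of HBM3 per MI300X, 80 GB on the competing part); the “three and a half” is the exact 640/192 ≈ 3.3 rounded conversationally, and in practice you would round up to four whole GPUs:

```python
# Capacity math from the conversation: how many MI300X GPUs are needed
# to hold the same total HBM as an eight-GPU system built from 80 GB parts?
import math

COMPETITOR_HBM_GB = 80    # per-GPU HBM on the competing part, per the transcript
MI300X_HBM_GB = 192       # per-GPU HBM3 on MI300X, per the transcript

total_needed_gb = 8 * COMPETITOR_HBM_GB          # 640 GB across eight GPUs
gpus_exact = total_needed_gb / MI300X_HBM_GB     # fractional GPU count

print(f"Total HBM to match: {total_needed_gb} GB")
print(f"MI300X GPUs needed: {gpus_exact:.2f} "
      f"(round up to {math.ceil(gpus_exact)} whole GPUs)")
```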

Dave Nicholson: And when you’re referring to memory, that would be high bandwidth memory?

Robert Hormuth: That’s on this.

Dave Nicholson: On it.

Robert Hormuth: Yeah. These eight outside chips are the HBM stacks.

Dave Nicholson: Which, as I understand from our semiconductor friends, means that you have less data traversing from one place to another over time, which means less power consumption.

Robert Hormuth: Absolutely.

Dave Nicholson: And on the subject of power consumption, if you stick with me on the little analogy here, these two beasts that we have on the table need to be fed at least two things. One of them is power, of course. The other is data. And if you look at the requirements from a power and data perspective: our friends at Solidigm, we’ve been talking to a lot of folks in the ecosystem about the benefits of density derived through using QLC technology. You’ve got the data there, it’s solid state, so it’s not sucking in the power that mechanical devices suck in. I have seen the completely packaged MI300Xs with the cooling towers. How much power does each of these consume, roughly? Do you know off the top of your head?

Mark Orthodoxou: About 750 watts.

Dave Nicholson: About 750 watts.

Robert Hormuth: This is about 750, and the Genoa is around 380 to 400 watts today.

Dave Nicholson: A lot of times these numbers… we talk about a billion parameters, a gazillion parameters, and we all nod our heads like we know what that means. But everybody kind of knows what 750 watts is. It’s not quite, but almost, enough to power a hairdryer, right? Sorry, my friend. But seriously, with eight of these together, you start talking about significant power consumption, the power savings associated with having that memory resident instead of separate, and what very dense solid-state devices do for that whole equation. That enables this kind of open ecosystem. But what are the other challenges and considerations you’ve got to deal with when you’re engineering these things?

Robert Hormuth: Yeah, I mean, I think if we start at a high level with the challenges and work our way down to the components: if you think of the state of the industry today in terms of power and data center capacity, there’s a power challenge going on. I’ve seen estimates that with the rise of AI over the next couple of years, we’re short somewhere between 80 and 300 gigawatts of power. That’s a lot of power to be short. And there are a couple of things going on. If you look at the vacancy rates in data centers today, it’s less than 2%, a historic all-time low worldwide. It’s like 1.7. So there’s no room at the inn. And if you look at new construction, this is even more profound: 84% of the new data centers under construction are already leased, and they’re not done yet.

So we have this problem with the rise of AI. Every business is driving towards, “I’m going to go do AI to disrupt my business, but now how do I go do it?” And this is where the combination of Solidigm and EPYC comes in. We’ll park this one on the side and talk about him in a minute, but we have to go create space and capacity in the existing data centers. People have to drive that consolidation to make space and capacity, and this is one of the best tools in the industry right now to go drive it. If you have five- or six-year-old servers, you can do a five- or eight-to-one consolidation. That’s huge consolidation, which frees up space and power. And then you bring in the low-power Solidigm NVMe drives, which help shrink that further. Get rid of the old five-year-old rotating drives, shrink it all down.

Now we’re freeing up real space and capacity and power so that we can power Mark’s goodie over here. Which we need to, because at a high level, if you take about eight servers out today and consolidate them to one, you’ve basically created enough power capacity to go put in one AI server.
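
As a rough sanity check on this consolidation argument, here is a back-of-the-envelope sketch. Only the eight-to-one ratio and the roughly 750 W per GPU come from the conversation; the per-server wattages are illustrative assumptions for the sake of the arithmetic, not AMD figures:

```python
# Back-of-the-envelope version of the consolidation argument.
# Server wattages below are illustrative assumptions, not quoted figures.
OLD_SERVER_W = 1000       # assumed draw of a five-year-old 2U server under load
NEW_SERVER_W = 1000       # assumed draw of one modern EPYC server
CONSOLIDATION_RATIO = 8   # eight-to-one, per the conversation
MI300X_W = 750            # per-GPU draw, per the conversation

# Power freed by retiring eight old servers and adding one new one
freed_w = CONSOLIDATION_RATIO * OLD_SERVER_W - NEW_SERVER_W

# GPU-only power of one eight-way MI300X server
gpu_power_w = 8 * MI300X_W

print(f"Power freed by 8:1 consolidation: {freed_w} W")
print(f"GPU power of one 8x MI300X server: {gpu_power_w} W")
```

Under these assumed wattages, the freed power (7,000 W) is in the same ballpark as the GPU power of one eight-way MI300X server (6,000 W), which is the shape of the argument Robert is making.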

Keith Townsend: Let’s put some higher-level abstraction on this to make it real. So I have a couple of servers with this chip and a bunch of RAM. We haven’t talked about the RAM limits at all yet, and we’ll get those numbers from you. But in theory and in practice, I can replace a rack of ten 2U servers with two 2U servers with these in them, running about 1,000 or more VMs. That is some density. And now the next question goes to, well, what about local disk, et cetera? Again-

Dave Nicholson: By the way, that dynamic by itself, forget about AI, was actually killing the market for data centers. Because there was a period of time when guys like Keith and I would look at that, and we’d see this massive consolidation, and it’s like, people are needing less data center space. Who wants to invest in building data centers? Nobody except hyperscale clouds.

Robert Hormuth: But Keith, you’re dead on. I mean, when we consolidate down eight servers that had 12 or 24 drives per server, we’re going to move all those workloads, and all that storage has got to fit in the new box. Which means we need higher-performance, more efficient, denser drive technology like Solidigm is driving in the market. To keep pace with the cores that we’re pushing, we need the storage industry and the ecosystem to keep up. And they’re doing a great job.

Keith Townsend: You also need the I/O performance, because in virtualization, just like in AI, I/O is almost always the bottleneck. You’re not going to get that type of I/O out of spinning rust. And if you want to consolidate, you want to keep your power within the envelopes you need, inside these data centers that you cannot rent right now. Data centers this size take one to three years to build, and they are sold out. I have data center space from the CTO Advisor data center, and I’m not letting it go. It is going to be a hot commodity one day, so I’ll resell that. But I like numbers. I like practicality too. RAM, how much RAM can I get into these? We understand the graphics card capability. What about the traditional CPU? How much can we get into this?

Robert Hormuth: So on Genoa, we support 12 channels per socket. So 24 in a 2U server: two sockets, 24 DIMMs. So use the 96-gig or 128-gig DIMMs, and we’re pushing two to four terabytes quite easily.

Keith Townsend: And again, so you’re looking at a two node system with up to, what? 24? Was it 12 terabytes?

Robert Hormuth: No, we’re at a per socket-

Dave Nicholson: 24 DIMMs, you’re saying.

Robert Hormuth: 24 DIMMs at 96, or times 128, let’s just say two to four terabytes.

Keith Townsend: So two to four terabytes per-

Robert Hormuth: Per server.

Keith Townsend: So you’re looking at up to eight terabytes of RAM in two servers.

Robert Hormuth: Mm-hm.
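
The DIMM arithmetic in this exchange works out as follows, taking Robert’s figures of 12 memory channels per socket, two sockets, and one DIMM per channel:

```python
# DIMM-count arithmetic from the conversation: Genoa supports 12 memory
# channels per socket, so a two-socket server has 24 DIMM slots.
CHANNELS_PER_SOCKET = 12
SOCKETS = 2
DIMM_SLOTS = CHANNELS_PER_SOCKET * SOCKETS   # 24 DIMMs per server

# Per-server capacity at the two DIMM sizes mentioned (96 GB and 128 GB)
for dimm_gb in (96, 128):
    total_tb = DIMM_SLOTS * dimm_gb / 1024
    print(f"{DIMM_SLOTS} x {dimm_gb} GB DIMMs = {total_tb:.2f} TB per server")
```

That gives 2.25 TB with 96 GB DIMMs and 3 TB with 128 GB DIMMs, consistent with the “two to four terabytes per server” range quoted in the conversation (larger DIMMs push toward the upper end), and with Keith’s “up to eight terabytes in two servers” takeaway.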

Keith Townsend: The limitation around virtualization is almost always RAM. So even if the CPU cores are not being put to work, the RAM will be. And again, we see why we get this consolidation. Robert, Mark, we really appreciate you two coming down. I really appreciate you donating to the cause.

Dave Nicholson: Yes, we appreciate that.

Keith Townsend: Leaving the MI300X and the EPYC. We will put them to good work.

Mark Orthodoxou: Very expensive coffee coaster.

Dave Nicholson: We’re going to co-parent.

Keith Townsend: Yes, we won’t argue.

Robert Hormuth: You won’t argue.

Keith Townsend: Thank you from me and my co-host, David Nicholson. This is always fun, especially when guests bring props. Stay tuned for more coverage from Six Five On the Road. Solidigm, thank you again for bringing in awesome guests. Stay tuned.

Author Information

David Nicholson is Chief Research Officer at The Futurum Group, a host and contributor for Six Five Media, and an Instructor and Success Coach at Wharton’s CTO and Digital Transformation academies, out of the University of Pennsylvania’s Wharton School of Business’s Aresty Institute for Executive Education.

David interprets the world of Information Technology from the perspective of a Chief Technology Officer mindset, answering the question, “How is the latest technology best leveraged in service of an organization’s mission?” This is the subject of much of his advisory work with clients, as well as his academic focus.

Prior to joining The Futurum Group, David held technical leadership positions at EMC, Oracle, and Dell. He is also the founder of DNA Consulting, providing actionable insights to a wide variety of clients seeking to better understand the intersection of technology and business.

Keith Townsend is a technology management consultant with more than 20 years of related experience in designing, implementing, and managing data center technologies. His areas of expertise include virtualization, networking, and storage solutions for Fortune 500 organizations. He holds a BA in computing and an MS in information technology from DePaul University. He is the President of the CTO Advisor, part of The Futurum Group.
