The Six Five team discusses NVIDIA GTC 2024.

If you are interested in watching the full episode you can check it out here.

Disclaimer: The Six Five Webcast is for information and entertainment purposes only. Over the course of this webcast, we may talk about companies that are publicly traded and we may even reference that fact and their equity share price, but please do not take anything that we say as a recommendation about what you should do with your investment dollars. We are not investment advisors and we ask that you do not treat us as such.


Daniel Newman: Let’s start off and talk about GTC. Was there a bigger moment? It was the Woodstock of AI. And Pat, I got to be candid, man. Jensen was in his element in the middle of the SAP Center. We couldn’t even get onto the floor.

Patrick Moorhead: It was crazy.

Daniel Newman: There was so much demand. It was a rock concert for AI lovers and he didn’t disappoint. Now, I’m going to give you two things, and there’s a lot of oxygen here. No matter how much I talk, there will be a lot of oxygen here. But I want to give you the two things I really came into this looking for. One is, I felt like this was the moment that NVIDIA had to secure its place as the technological leader in AI. Meaning, what is the company going to put forth that clearly says nobody’s catching us? Sure, there are other options. Sure, there are other SoCs being developed, ASICs. There are other GPU players, there are software abstractions, but we still are the Apple of AI, ironic this week to say that.

The second thing was, I was really dying to understand how the company was going to advance its customer lock-in. Meaning, it’s been so clever with the developers and CUDA, building that abstraction layer that has basically made it so sticky, and it’s like, can they come up with what’s next? Can they come up with something else that’s going to be as sticky as CUDA has been? Especially because you’ll hear the competition talking about new compilers, you’ll hear them talking about ROCm and oneAPI and JAX and PyTorch, and you don’t need to. You can move the workloads from hardware to hardware.

Well, what if you come up with a solution that makes it even stickier to be running everything on NVIDIA’s hardware? So, of course, I will comment briefly on Blackwell. Now, again, Blackwell is a chip, but Blackwell is part of the GB, the Grace Blackwell, and basically Blackwell is going to be a system. Really, nobody’s going to buy a Blackwell chip. You’re going to buy a system. It’s going to be a large system and it’s going to be, what did someone on the competitive side say? An AI mainframe. We are in the era now where we’re going to stitch all this together.

Patrick Moorhead: That’s funny.

Daniel Newman: It’s going to be connectivity, it’s going to be compute, it’s going to be GPUs, CPUs, cores. It’s going to be a link, NVLink. It’s going to be InfiniBand. It’s going to be in a massive rack. And by the way, it’s going to be more economical to do it that way if you’re an NVIDIA shop, and it’s going to take up less real estate. And by the way, that’s really important, people. In data centers, there’s limited space, limited power, and this is something that he was very prudent to be speaking to. So, long and short of it, I’ll give a couple of specs. You’ll probably give some other ones too. They’re talking about a workload performance increase on the inference side with Grace of about 30 times, depending on the floating-point precision, and energy cuts by as much as 25 times. So, this was a big topic, because we’ve talked a lot about how much of a power hog GPUs can be.

So, they made some big advancements on inference, which has been something that AMD had made some big strides on, and then they made some advancements on lower power. Just to give you a relative data point: previously, training a model of 1.8 trillion parameters would’ve taken 8,000 Hopper GPUs and 15 megawatts of power. NVIDIA is saying now that it takes 2,000 Blackwell GPUs, so about a quarter as many, and it’ll do it at about a quarter of the power, about four megawatts. So, that was a really interesting data point. So, because we’re short on time, I’ll just tell you one other thing I wanted to talk about, which was the NIMs platform. One of the things I think NVIDIA really wanted to get stickier with is going to be the on-prem data and the industry-specific LLMs. He talked about a weather opportunity.
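As a quick check on the Hopper-versus-Blackwell figures quoted above, here is a minimal Python sketch of the arithmetic. The GPU counts and megawatt figures are the approximate ones mentioned in the conversation; the variable names and the per-GPU calculation are illustrative assumptions, and the per-GPU number is simply total power divided by GPU count, not a chip specification.

```python
# Approximate figures as quoted in the conversation (vendor-stated, not measured here).
HOPPER_GPUS = 8_000
HOPPER_MEGAWATTS = 15.0
BLACKWELL_GPUS = 2_000
BLACKWELL_MEGAWATTS = 4.0

# How many times fewer GPUs and how many times less total power.
gpu_reduction = HOPPER_GPUS / BLACKWELL_GPUS
power_reduction = HOPPER_MEGAWATTS / BLACKWELL_MEGAWATTS

# Total power divided by GPU count, in kilowatts. This lumps in everything
# the quoted megawatt figure covers, so it is not a per-chip rating.
hopper_kw_per_gpu = HOPPER_MEGAWATTS * 1000 / HOPPER_GPUS
blackwell_kw_per_gpu = BLACKWELL_MEGAWATTS * 1000 / BLACKWELL_GPUS

print(f"GPU count reduction:   {gpu_reduction:.2f}x")
print(f"Total power reduction: {power_reduction:.2f}x")
print(f"Hopper:    {hopper_kw_per_gpu:.3f} kW per GPU")
print(f"Blackwell: {blackwell_kw_per_gpu:.3f} kW per GPU")
```

On these numbers, the saving comes from needing a quarter as many GPUs: total power falls by about 3.75 times, while the power attributed to each GPU is roughly flat.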

He talked a lot about healthcare LLMs, but being able to take a microservices architecture with a container, be able to put the libraries, the software, the hardware infrastructure, on-prem, off-prem, so cloud and hybrid architectures connected to APIs, and enable a company to take, like, a ServiceNow architecture, combine it with NVIDIA hardware, and do so in basically a drag-and-drop, more or less, container, I think that’s really interesting. And Pat, why do I think it’s so interesting? Because they don’t yet own everybody’s prem data. But now you take all that prem data, put it in the container and make it available for compute, for accelerated compute, and that’s really sticky.

So, final thought, what I said is, did they achieve the two things? One, technological superiority? I think they did for the moment, and it’s not over. I don’t know that they’ve compelled anybody. Stock didn’t move a ton because of this. And two, this whole NIMs architecture, super powerful in terms of connecting the prem data, private LLMs that are going to become more pervasive as these big LLMs are limited to such a small number of customers. Pat?

Patrick Moorhead: Buddy, you covered a lot and there’s just so much to cover. We could have done this entire-

Daniel Newman: It could have been.

Patrick Moorhead: So, here’s what I’m going to focus on. First of all, springboarding off of NIMs. NIMs is the next new lock-in. So, 13 years ago, 15 years ago when CUDA came out, it was literally the driver set, and then that moved to developer tools, that moved to ML frameworks, that moved to models from NVIDIA that you can use to a full up enterprise stack that gets preloaded on Dell, HPE and Lenovo infrastructure. What NIMs does is it really takes that to the next level and makes it easier for application development providers like Adobe and SAP and ServiceNow to connect with data platforms like Cloudera, Snowbricks, NetApp, folks like that, and then connecting the big model builders with the AI infrastructure.

And this will make it easy for people if you’re all in on NVIDIA. You cannot, however, leverage this on an AMD, an Intel, a Groq, an Untether AI. So, enterprises and partners do need to weigh the potential lock-in. But, I got to tell you, at the beginning of a boom cycle, you probably have to do this because none of the competitors are close to offering something like this. So, two edges on that. On Blackwell, absolutely amazing. It’s an absolute total beast. I want to get Signal65’s Ryan Shrout on some of the claims. The claims for energy efficiency and inference have to do with not just the chip, but an entire cluster, and that’s the comparison I want to see.

And as a reminder, in comparison to both AMD and Intel and all the AI SoCs out there, this is something that doesn’t ship till the end of the year. And what’s being compared, let’s say even on the MI from AMD, is what is shipping right now. But, that’s not to take any credit away from NVIDIA at all. On the hardware platform side, I do consider NIM part of the software platform. The networking in the rack and the switch, and then connecting rack to rack, is amazing to see. Once we get to Broadcom, we might have the debate on Ethernet versus what NVIDIA is cooking up here, but overall, NVIDIA did nothing but gain ground. They certainly didn’t lose ground.

Daniel Newman: That’s great analysis. Like you said, I think we could do a whole show on this, but we had to move quickly. I love the fact that you pointed out what’s in the market today, Pat, versus what will be in the market in the future.

Author Information

Daniel is the CEO of The Futurum Group. Living his life at the intersection of people and technology, Daniel works with the world’s largest technology brands exploring Digital Transformation and how it is influencing the enterprise.

From the leading edge of AI to global technology policy, Daniel makes the connections between business, people and tech that are required for companies to benefit most from their technology investments. Daniel is a top 5 globally ranked industry analyst and his ideas are regularly cited or shared in television appearances by CNBC, Bloomberg, Wall Street Journal and hundreds of other sites around the world.

A 7x best-selling author, including his most recent book, “Human/Machine,” Daniel is also a Forbes and MarketWatch (Dow Jones) contributor.

An MBA and former graduate adjunct faculty member, Daniel is an Austin, Texas transplant after 40 years in Chicago. His speaking takes him around the world each year as he shares his vision of the role technology will play in our future.

