On this episode of the Six Five Webcast – Infrastructure Matters, hosts Camberley Bates, Dion Hinchcliffe, and Keith Townsend discuss the critical intersection of AI and the semiconductor industry, while sharing highlights from Smartsheet Engage and Amazon’s Generative AI Summit.
Their discussion covers:
- Recent AI and cloud computing industry developments
- Insights from the Smartsheet ENGAGE conference and Smartsheet’s evolving business landscape
- Highlights from Amazon’s Generative AI Summit, focusing on the latest AI innovations
- Key announcements from AMD, spotlighting the company’s advancements in the AI chip market and rivalry with NVIDIA
- The overarching access and utilization challenges of AI hardware, such as data management, infrastructure needs, and the prospect of CPU-based solutions for enterprise AI demands
Watch the video below, and be sure to subscribe to our YouTube channel, so you never miss an episode.
Or listen to the audio here:
Disclaimer: Six Five Webcast Infrastructure Matters is for information and entertainment purposes only. Over the course of this webcast, we may talk about companies that are publicly traded and we may even reference that fact and their equity share price, but please do not take anything that we say as a recommendation about what you should do with your investment dollars. We are not investment advisors, and we ask that you do not treat us as such.
Transcript:
Camberley Bates: Good morning, everyone, and welcome to Infrastructure Matters, episode 58. We have some new faces that you're going to see on a regular basis, because there are four of us here now. Dion Hinchcliffe is up in Seattle at the airport. Hello, early morning for you.
Dion Hinchcliffe: Yes, it is, but it’s always fun to be here. Thanks.
Camberley Bates: Of course, we have the infamous Keith Townsend. Where are you?
Keith Townsend: I’m in Seattle as well. I’m just still at the hotel.
Camberley Bates: We're going to be talking about that. I'm stuck here in beautiful Colorado, and Steve is someplace on a plane, so he can't join us today. But here we go. A big theme today is going to be AI, because the guys were at the Amazon Generative AI Summit, and we also had AMD announcing a bunch of things. So, we're going to be talking about AI, but first we want to cover one other conference that was going on this week, which was Smartsheet ENGAGE. Dion, you were there. So, what is Smartsheet and why should people care?
Dion Hinchcliffe: Well, Smartsheet is the leading solution in what's called the collaborative work management space. So, unlike project management tools, they actually help work get done, and they use the sheet model. That's why they're called Smartsheet. It's like a spreadsheet, but it's collaborative and many people can use it at once. But they're really entering a new chapter in their journey. They were public for six years on NASDAQ. They're now being taken private by Blackstone and Vista Equity Partners, now that their category is really being commoditized and everyone's entering the space. Atlassian and ServiceNow on the IT side just added the same type of capability to their platforms, because that's how things like Jira tickets actually get turned into work: you structure them in Smartsheet.
Everyone has their name assigned to a task, and you go out and do it. They're not moving away from the sheet model, but they're really focusing on a new workspace approach that allows you to structure all the assets of a project, including all the sheets you use, in one place. That was one of the bigger announcements. Of course, they're also rolling out generative AI, and that was very well received. If you haven't tracked Smartsheet, they have a cult following. They're extremely popular with their customers. The announcements were received, I would say, very well. The customers don't seem to be concerned about private equity taking them off the market. I talked to their CEO Mark Mader, and he says that their investment partners are aligned with their very collaborative, open culture. So, we'll see what happens. It's going to be interesting. I don't know. Keith, what did you see when you were there?
Keith Townsend: As you mentioned, there's a very cult-like following. When they announced Collections, there was a round of roaring applause from about 4,000 people. It was really striking to see. But a couple of notes. One, these SaaS conferences are usually attended by people who are actually using the product, and that's refreshing. The nature of this product is that its users are people working in PMOs, operations teams, disaster recovery teams, et cetera. They were really, one, into the announcements, but two, very sober about AI and how they use AI. I talked to maybe five or six customers. Each one of them had either a plan for AI or was already using AI, and this was outside of IT, which I thought was an extremely interesting piece. As a reformed project manager, I really appreciate the Collections view, because of the amount of time I spent at PwC taking data out of sheets and putting it into PowerPoint just to provide an executive dashboard. I can see why people are excited about it.
Camberley Bates: I haven't dived into these at all, guys, because there's a whole series of products doing this, but it's almost like how we had Dropbox and shared files. We created a system for sharing files, and those were just Word docs, and now we're sharing them in a much broader way and using more intelligence. I'm not sure if it's AI or whatever, but more capability for collaborating. I think this is just another movement forward in this very, very hybrid, geographically dispersed world that we're dealing with, as well as making things simpler, as you said, if I can just drop it into my PowerPoint and not have to prepare for that. Cool. Okay.
So, let's go into the big topic of today, which is the AI situation. We could get bored talking about AI, but I think we've got some big stuff to talk about today. I know, the title: like cybersecurity, AI, what? But here we are. Amazon held a Generative AI Summit, and I understand they brought analysts in from all over the world to attend, including you two. So, let's talk about the top things that came out of there, and then we're going to drop into what's really pressing the CIOs and the IT directors and the people trying to implement these things. So, Dion, you want to take that first piece?
Dion Hinchcliffe: Yeah, sure. Well, I'd love to have Keith weigh in as well, but Amazon really brought us in to calibrate us for what's going to be coming out at re:Invent at the end of the year. All of us will be there again, of course. They didn't have a lot of new announcements, but they really wanted to give us their point of view on how businesses can be successful with AI. What does it actually take? So they had a former CIO, now a cloud evangelist, come to the stage and really walk us through, step by step, how they look at data management and data strategy being foundational for executing on AI. That was the main session. The second session focused on a lot of NDA topics that I can't get into in detail, but I can give you the gist without violating that NDA.
Amazon really is about model choice. So, their Bedrock capability allows you to use one API to access all the models they offer and all the other public models. They have Anthropic and Meta and Cohere and all the other popular AI models. You can also put your own model in Bedrock and still use it via a single API. They also have capabilities coming that allow you to dispatch AI requests to the most cost-efficient model that will still give you the results you're looking for. That's a hot topic. CIOs are very concerned about the cost of AI, not just the actual bottom-line cost, but also from a sustainability perspective. AI models consume a lot of energy to train and run, and there are lots of concerns about that.
So, from an ESG perspective as well as a budget perspective, people want cost-efficient models, and Amazon's trying to deliver that automatically. If you use Bedrock, they'll be able to route requests to the most efficient model. Of course, they're now venturing into the hot topic everyone is really focused on, which is AI agents. So, not just AI generating content, but AI taking action for you. Action models are a hot topic. Amazon didn't have many specific announcements, but they gave indications that we're going to see things at re:Invent around that, and they have to in order to stay competitive. Companies like Salesforce are using action models and agents as a competitive differentiator, and Amazon's always been playing defense on AI. It's very interesting. Really, Microsoft and Google have taken the lead in the conversation, and I think this summit was an attempt to get out in front of that conversation, at least in the analyst community.
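[Editor's note: the cost-based routing Dion describes can be sketched in a few lines. This is an illustrative sketch of the general pattern, not Amazon's actual implementation; the model names, prices, and quality scores below are made-up placeholders.]

```python
# Illustrative sketch of cost-aware model routing: pick the cheapest
# model that still meets the quality bar for a request. Model names,
# prices, and quality scores are hypothetical, not real Bedrock pricing.

MODELS = [
    # (name, cost per 1K tokens in dollars, rough quality score 0-1)
    ("small-model", 0.0002, 0.70),
    ("medium-model", 0.0030, 0.85),
    ("large-model", 0.0150, 0.95),
]

def route(min_quality: float) -> str:
    """Return the cheapest model whose quality meets the threshold."""
    eligible = [m for m in MODELS if m[2] >= min_quality]
    if not eligible:
        raise ValueError("no model meets the quality bar")
    return min(eligible, key=lambda m: m[1])[0]
```

In this toy setup, a routine summarization request would land on the small model, while a request flagged as needing high accuracy falls through to the large, expensive one.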
Camberley Bates: Yeah, let me comment on your agent thing.
Dion Hinchcliffe: Yeah, sure.
Camberley Bates: Then I'll pass over to Keith. The agent idea is interesting, because I was on a conference call with CTERA, which is a global file system. It's where people can share data all over the world and rapidly bring it back, heavily focused on government institutions. Things are locked down; that's a risky area to be in. But one of the things they rolled out is an AI initiative, the data intelligence piece of it, and they also rolled out agents, or assistants.
Dion Hinchcliffe: Interesting.
Camberley Bates: They have assistant personalities. They've got a lawyer personality, a document personality, a marketing personality, a customer service personality. So, I guess what they're doing is creating ways of training these personalities to pull the data through in certain ways, or something like that. Is that what's going on with these things? Is that what these systems are?
Dion Hinchcliffe: Those are action models, which are the large language models you use for agents. They actually work the APIs for you. So, they'll call all the APIs to get the work done. If you want to create a marketing campaign, it'll write the campaign for you, go to all your tools, create the campaign, and then send it out. I mean, it's amazing. So, these agents actually take action directly on the IT systems, based on the requests you make.
Camberley Bates: Writing requests.
Dion Hinchcliffe: Yeah, the security and trust implications are enormous. I think it’s going to be very difficult for organizations to really sort through all the issues, but the promise is enormous. The opportunity is great. So, yeah, I’m super excited about them, but I have a lot of concerns, so we’ll see how it goes.
Camberley Bates: Well, I think it's like anything else that's automated. Initially, it presents you with the information, and as an expert you go through it and say, "That works, that works, that works. That doesn't work, throw it away," and train it, and eventually we get to the point where we can let it be automated. Or maybe it just streamlines the process.
Dion Hinchcliffe: That's why Amazon talked about humans in the loop being key to making agents successful early on. Of course, eventually you can take humans out of the loop once you trust it. Yeah, exactly right.
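[Editor's note: the human-in-the-loop pattern the hosts describe can be sketched as a simple approval gate: the agent proposes actions, low-risk ones run automatically, and anything above a risk threshold waits for a person to sign off. This is a hypothetical sketch of the pattern, not any vendor's actual agent framework; the action names and risk scores are invented.]

```python
# Minimal human-in-the-loop gate for agent actions (illustrative pattern
# only). Low-risk actions execute automatically; high-risk ones are run
# only if the human `approve` callback says yes, otherwise held.

def gate_actions(actions, approve, risk_threshold=0.5):
    """actions: list of (name, risk) pairs; approve: callable(name) -> bool."""
    executed, held = [], []
    for name, risk in actions:
        if risk <= risk_threshold or approve(name):
            executed.append(name)
        else:
            held.append(name)
    return executed, held
```

In the marketing-campaign example above, drafting the copy might be a low-risk action that runs on its own, while actually sending the campaign sits in the held queue until a human approves it.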
Camberley Bates: Okay, so Keith, we've dominated the conversation. Over to you: what happened over there at Amazon?
Keith Townsend: Yeah, so I spent some time with their accelerated compute team, their compute team, the storage team, as well as some of the Amazon Q team. One thing that was clear is that Amazon has given a lot of thought to optionality when it comes to accelerated compute and AI. I saw Inferentia and Graviton4 in person for the first time; they had both chips out on display for us to take pictures of. It wasn't even in the NDA area. I was surprised. People don't know how unusual that is. You cannot buy those chips, and I'd never seen them before. Even the Amazon people there said they'd never seen them before, only in the data center.
Dion Hinchcliffe: That was pretty cool. But one of the things I really pushed back on them about, beyond optionality, is this idea that Amazon is being threatened by the private data center. Over in the EU, to the EU Competition Committee, Amazon announced, or at least reported, that private data centers have put pressure on their business and that they're seeing competition. Is that to keep from getting sued by the EU for a monopoly, or is it real? I think our data is showing that it is real, that customers are considering private options over the public cloud. So, I was asking a lot about lock-in. When we think about these chips, and about what AMD, Intel, and NVIDIA are trying to do at the software level to lock customers into software platforms tied to the hardware, Amazon is probably the poster child for the walled garden, and these solutions are extremely, extremely powerful and addictive.
So, when I'm all in on Bedrock, what are my options for leaving? Amazon was pretty forward in saying they want to make it so that you don't want to leave. From a cost-performance perspective, they expect to compete with any solution, and they say they'll keep pace with change and allow you to modernize in a way that's hopefully much easier than what we've seen traditionally, and move to platforms, processors, and accelerators that fit your needs, whether you're talking about inferencing on Inferentia or training on Trainium.
Camberley Bates: So let's get to this other piece of it. AMD had their announcements this week; they had their big conference. Patrick and Daniel were both down there, so there'll be a lot of publication coming out on the details, but Keith, you were tracking what they were rolling out. Maybe you can touch on that, because then we want to drill into this issue that you raised, Dion: NVIDIA has dominated the market, and now we're starting to see other companies bringing out other pieces, et cetera. What are the options for getting hold of Blackwell? Is there really going to be competition here, and how does that competition look in terms of the chips and such? So, Keith, would you talk a little bit about what you saw from the AMD release?
Keith Townsend: Yeah, so I was following mainly Patrick and Daniel on X, trying to keep pace with some of the releases. Some of the highlights: I think we can dive deep into it maybe next week, once we actually read the releases, get some more detail, and talk to Daniel for the debrief. But there were a couple of really interesting pieces that will help Dion in his conversation. Meta is deploying 1.5 million EPYC CPUs in support of their MI300X AI accelerators, which exclusively run Llama 405B, their 405-billion-parameter model. So, that's an interesting point for Dion to chew on a little bit later. The other big announcement was that they have announced a new DPU, and they're all in. This isn't a surprise for those of us who have watched the company; they bought a DPU company a few years ago. I'm surprised it's taken this long to come out with an actual DPU. But this is in competition with NVIDIA: the Ultra Ethernet Consortium has come together to basically battle NVIDIA, with the ability to use Ethernet instead of the somewhat proprietary solution NVIDIA uses for high-speed networking between GPUs. I think there's going to be a lot to watch, especially for ultra data center geeks, around Ultra Ethernet.
Camberley Bates: We're seeing the battle coming with a whole bunch of those elements. We'll probably eventually see Broadcom bring some of those pieces out as the battle continues for Ethernet connections in the AI environment. I know there's still talk about whether InfiniBand is going to survive there, but probably not. It's probably going to go, because these architectures are designed east-west rather than north-south, and they lend themselves more to Ethernet technology, from what I understand. So, one of the things we talked about, and you mentioned, Dion, is that the market estimate coming out of AMD for AI chips was $1 trillion by 2030. Hopefully, that's not all going to one supplier, but maybe it will. We have a lot of other people rushing to bring these things to market. But we've been in a supply crunch because we've had pretty much one supplier, and now we're seeing other people come out. Talk about what you're looking at. How are you advising clients? What if you can't get your hands on Blackwell?
Dion Hinchcliffe: So for those who aren't tracking, Blackwell is the next-generation AI processor from NVIDIA. It's highly anticipated. There have been production problems, and they've had to redo the mask recently. Those are not good signs, but it seems like they've managed to get through that and are now shipping, because Microsoft just announced they had the first Blackwell chips in the public cloud this week. So, that was major news. But the reality is NVIDIA has about 90% of the GPU market for AI chips. The reason is that they own the entire stack. They have an API called CUDA, which is a developer darling. Everyone likes it, and everyone's built their models around it. It's the lock-in for AI, and it's very hard to move your code over to an entirely different stack. So, if you want to go with Intel chips, or AMD, or Amazon's Inferentia and Trainium, which are two chips, one for inference and one for training AI models, they all use entirely different APIs. So, there's a lot of reticence, if you're building foundation models, about moving your AI code away from the proven, very popular CUDA API, because we don't know how well Intel and AMD…
Even at the Summit, I think Amazon has probably had the most notable success in the market other than NVIDIA in terms of actual uptake. But you can't buy the chips. You can only use them through Amazon services. So, it's very interesting. So, what do you do? You have to be contrarian, and you have to be willing to bet against the market, if you want to use different AI chips. Blackwell is going to be constrained, and so will whatever comes after it, for the foreseeable future if these forecasts are correct, and we believe they are. So, $1 trillion in annual spend on AI chips is where the market is headed. That means NVIDIA is going to sell everything they have and more, and they're going to be very constrained for the rest of the decade if this pace continues, and everyone seems satisfied that it will.
So, there are alternatives. Right now, I do my best to tell organizations: don't get yourself locked into CUDA. Other chips are out there. Cerebras has the world's largest AI chip; it's an entire wafer. It's amazing. It's incredible. It's actually putting Moore's law back on track, but it uses its own API too. So, the key is, I think, if you want to be in AI long term, don't get locked into a particular vendor. Start hedging your bets: build an independence layer between your model and the API that you're using. That will put you in good stead, because NVIDIA is not likely to remain the only option, given how constrained they're going to be relative to their popularity. So, that's my take on it. I don't know, Keith, it'd be very interesting to hear what you'd say. Is it the same with you?
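[Editor's note: Dion's "independence layer" advice boils down to putting a thin interface between your application and any one vendor's API, so swapping backends becomes a configuration change rather than a rewrite. A minimal sketch of that idea follows; the backend classes and names are hypothetical stand-ins, not real SDKs.]

```python
# Minimal "independence layer" between app code and vendor AI APIs
# (illustrative sketch; the backend classes are stand-ins for real SDK
# calls to, say, a CUDA-based service or a managed cloud model).

class Backend:
    """The only interface application code is allowed to depend on."""
    def complete(self, prompt: str) -> str:
        raise NotImplementedError

class VendorABackend(Backend):        # stand-in for one vendor's SDK
    def complete(self, prompt: str) -> str:
        return f"[vendor-a] {prompt}"

class VendorBBackend(Backend):        # stand-in for another vendor's SDK
    def complete(self, prompt: str) -> str:
        return f"[vendor-b] {prompt}"

_BACKENDS = {"vendor-a": VendorABackend, "vendor-b": VendorBBackend}

def get_backend(name: str) -> Backend:
    """The vendor is a config value; app code never imports a vendor SDK."""
    return _BACKENDS[name]()
```

Because only the factory knows about concrete vendors, moving off one stack means adding a new `Backend` subclass, not rewriting the application.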
Keith Townsend: Yeah, so this is where middleware has a huge opportunity: the ability to sit between your code and the low-level AI chip programming languages like CUDA, with Intel and AMD being a little more open and progressive in their licensing around those languages and APIs. If you've messed around with AI at all, whether within one of the cloud providers or with one of the AI frameworks like PyTorch, you don't see CUDA; PyTorch does that translation for you. So, you can go from one model to the next, from one GPU generation to another. If you're developing at that AI level, at that PyTorch level, those are the tools that are really impacted. Where are you going to see the difference?
You're going to see the difference in performance. You can run the same models across different GPUs and see a difference in performance. A big question is what you're going to use your AI for. Are you training models? If you're training models, you're probably not taking advice from the three of us. If you're doing AI infrastructure, which is where the vast majority of the industry is going to go, I wouldn't be fixated on which GPU or which accelerator to use. You're going to have plenty of options. Don't wait on NVIDIA, or on NVIDIA availability, if you see a market opportunity that's going to accelerate you.
Camberley Bates: We were talking about how you get your hands on a fill-in-the-blank GPU. Since I work in the data world, one of the problems is that your data is here and the GPU is there, which is one of the reasons we were talking earlier about whether to go on-prem or into the cloud. Data has gravity: we don't particularly like to move it, and it can get very costly if you then take it out of the cloud on the back side. So, some of the implementations we're seeing from the data storage people enable, through snapshots or mirroring or whatever it is, a system that's constantly kept updated and sits right next to the Amazon environment or the Microsoft environment, while the data actually sits in an Equinix colo, because you hold onto your data that way. I don't lose my data, but I'm able to have that speed of connection back to the processors that are over there.
So, that's one of the strategic things people are looking at: privacy, how you access your primary data center from someplace else, and where the transaction is going. So, I look at other strategies and technologies to get it over there. We'll see if that strategy works for any of these players and where they're going. Keith, what I really liked about what you said is this question of whether a single GPU should really be driving all the decisions about what you're implementing and how. The issues around training, the issues around RAG, having data ready to feed into the inference engine, et cetera, are huge, and probably an even stickier problem than getting hold of the GPU systems. So, we have a rack of problems here. I'll stop there for any comments.
Dion Hinchcliffe: Well, I think it's just interesting. Organizations will want to get their hands on their own GPUs if they want to run their models locally. There's, of course, a lot of reason to do that: controlling and protecting your data, ensuring privacy, and preserving PII, personally identifiable information. Most organizations, especially large companies, don't want to release this information into the public AI models, not knowing for sure what's going to happen to it. There can be accidents and data spills. They'd rather not have the information get out into the AI ecosystem where they can't control it. So, that's driving a lot of the enterprise demand: saying, we want AI, but for many of our most important AI tasks we're not going to use public models when we can use private models on our own data, running our own inference on our own models. That's going to drive, I think, a lot of the decisions about how you configure this and make it cost-effective.
Keith Townsend: We're seeing AI models being shrunk down to run on a phone. It really does matter what you're trying to accomplish. As an industry, we're fixated on GPUs and accelerators that are separate from the CPU, but Intel, AMD, and the cloud providers have done a really good job of building accelerators into the instruction set of the CPU itself, especially for models closer to what we're actually going to implement in the enterprise: most use cases for most agents are around seven billion parameters. Depending on the scale of your projects, CPUs might just be fine.
Camberley Bates: This is where I'm going with it. As we have inference engines doing real-time ingestion of data, real-time RAG information that goes in there and has to generate a response based on a user hitting the website or asking a query, et cetera, as opposed to more of a recommendation engine or something along that line, yeah, the speed matters. But is there a point somewhere along the line where the speed of the GPU is enough for the enterprise? I mean, is that a possibility?
Keith Townsend: Absolutely. This is where tokens per second matter, and this is easy math. The average human reads only a few words per second, which is much slower than a CPU can serve up a seven-billion-parameter model. You can probably get about 15 concurrent users on an average CPU for the average model. So, unless you're talking about massive scale of AI, there is a point where you have enough GPU. We're fixated on Blackwell and a bunch of newer GPUs, but the L4, the L40S, these lower-end GPUs from NVIDIA are much less capable than an H200 or Blackwell chip, yet they're perfectly well positioned for the enterprise. So is Gaudi from Intel and the 300 series from AMD, and Groq has come up with some inference-focused AI chips. We're going to see plenty of options for the enterprise, especially when it comes to inference.
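[Editor's note: Keith's back-of-the-envelope math can be made concrete. If a CPU serves a small model at some aggregate tokens-per-second rate, and each user consumes tokens only about as fast as they can read, the number of concurrent users is just the ratio. The numbers below are illustrative assumptions, not benchmarks of any particular CPU or model.]

```python
# Back-of-the-envelope capacity math for CPU inference, in the spirit of
# Keith's point. All numbers are illustrative assumptions, not benchmarks.

def concurrent_users(cpu_tokens_per_sec: float,
                     user_read_tokens_per_sec: float) -> int:
    """How many readers a single CPU can keep saturated with text."""
    return int(cpu_tokens_per_sec // user_read_tokens_per_sec)

# Suppose a CPU serves a ~7B-parameter model at 75 tokens/sec aggregate,
# and a person reads roughly 5 tokens/sec: one CPU keeps 15 readers busy.
```

The point of the exercise: until your concurrent-user count outgrows this ratio, a bigger GPU buys you tokens nobody can read any faster.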
Camberley Bates: Great. Well, thank you very much, gentlemen, for joining me this morning for Infrastructure Matters. Thank you all for listening in. We appreciate it. Make sure to click, follow, share, do all those things, and we will see you next week.
Dion Hinchcliffe: Take care, everyone.
Author Information
Camberley brings over 25 years of executive experience leading sales and marketing teams at Fortune 500 firms. Before joining The Futurum Group, she led the Evaluator Group, an information technology analyst firm as Managing Director.
Her career has spanned all elements of sales and marketing. She achieved a 360-degree view of addressing challenges and delivering solutions by crossing the boundary of sales and channel engagement, working with large enterprise vendors as well as her own 100-person IT services firm.
Camberley has provided Global 250 startups with go-to-market strategies, created a new market category, "MAID," as Vice President of Marketing at COPAN, and led a worldwide marketing team, including channels, as a VP at VERITAS. At GE Access, a $2B distribution company, she served as VP of a new division and succeeded in growing it from $14 million to $500 million, and she built a successful 100-person IT services firm. Camberley began her career at IBM in sales and management.
She holds a Bachelor of Science in International Business from California State University – Long Beach and executive certificates from Wellesley and Wharton School of Business.
Keith Townsend is a technology management consultant with more than 20 years of related experience in designing, implementing, and managing data center technologies. His areas of expertise include virtualization, networking, and storage solutions for Fortune 500 organizations. He holds a BA in computing and an MS in information technology from DePaul University. He is the President of the CTO Advisor, part of The Futurum Group.
Dion Hinchcliffe is a distinguished thought leader, IT expert, and enterprise architect, celebrated for his strategic advisory with Fortune 500 and Global 2000 companies. With over 25 years of experience, Dion works with the leadership teams of top enterprises, as well as leading tech companies, in bridging the gap between business and technology, focusing on enterprise AI, IT management, cloud computing, and digital business. He is a sought-after keynote speaker, industry analyst, and author, known for his insightful and in-depth contributions to digital strategy, IT topics, and digital transformation. Dion’s influence is particularly notable in the CIO community, where he engages actively with CIO roundtables and has been ranked numerous times as one of the top global influencers of Chief Information Officers. He also serves as an executive fellow at the SDA Bocconi Center for Digital Strategies.