Breaking Barriers: The Intersection of AI and Storage Technologies

On this episode of the Futurum Tech Webcast – Interview Series, host Daniel Newman is joined by Hammerspace’s David Flynn, Founder and CEO, for a conversation on the cutting edge of AI and storage technologies.

Our discussion covers:

  • The background of David Flynn’s role and the revolutionary technology developed at Fusion-io
  • The major challenges enterprise organizations face with their infrastructure as they embark on AI projects
  • Insights enterprises can glean from hyperscalers for their AI architectures
  • The significance of open source in the realm of AI
  • The critical roles of Hyperscale NAS and data orchestration within AI architectures

To learn more, click here to read the Hyperscale NAS Technology Overview on Hammerspace’s website.

Watch the video below, and be sure to subscribe to our YouTube channel, so you never miss an episode.

Disclaimer: The Futurum Tech Webcast is for information and entertainment purposes only. Over the course of this webcast, we may talk about companies that are publicly traded, and we may even reference that fact and their equity share price, but please do not take anything that we say as a recommendation about what you should do with your investment dollars. We are not investment advisors, and we ask that you do not treat us as such.

Transcript:

Daniel Newman: Hey, everyone, welcome back to another episode of the Futurum Tech Podcast. I’m Daniel Newman, your host, the CEO of The Futurum Group. Excited for this Futurum Tech Podcast interview series. I’m going to be talking to David Flynn. David is the CEO and founder of Hammerspace. A very exciting company and you’re going to learn a little bit more about it today. First timer on the show, definitely an up-and-coming company, one that we at The Futurum Group have paid a lot of attention to, and I’m really excited to have him here with me today.

Without further ado, David, welcome to The Futurum Tech Podcast. Thanks so much for joining us.

David Flynn: Thank you, Daniel. It’s a real pleasure to be on the podcast for the very first time.

Daniel Newman: Yeah. It is great to have you here. I’ve got to tell you, as someone that really does track technology, I’m hearing more and more about Hammerspace. Not only because our teams have been working together on a number of different engagements from advisory to content that we’ve co-developed, but also just because I’m tuned in to what’s going on in the space. I was at GTC and I heard the Hammerspace name come up a number-

David Flynn: Well, I have to ask, had you heard of Hammerspace before the context of the company?

Daniel Newman: I hadn’t, no.

David Flynn: No. Yeah, hammerspace is actually a real thing. Well, a real imaginary thing. Just as hyperspace is a sci-fi concept of an extra-dimensional place you travel through, hammerspace is the place where things come from when you pull them out of thin air. It’s the pocket universe, the magician’s bag. It started as a tongue-in-cheek thing in Japanese cartoons, where characters would pull big hammers out of thin air to whack another character with. The joke was, “Where did that come from? Where did he have that?”

Daniel Newman: In Hammerspace.

David Flynn: It came from hammerspace. So hammerspace is magic. In the most recent Spider-Man movie, Into the Spider-Verse, that was actually one of the superpowers. It was Spider-Ham’s superpower that he could access things out of hammerspace, and they talked about it extensively. It was great advertising. The name is starting to get out there.

Daniel Newman: It is getting out there. I got to give you some credit, I’ve heard more about it from what you’ve built than any of these other … I don’t watch a lot of cartoons. I do admittedly like Family Guy. It’s complicated, my relationship, but I find the humor to be quite dynamic, diverse, and cultural in its references. A lot of people that never-

David Flynn: I don’t want to say anything about South Park then.

Daniel Newman: Oh, South Park’s good.

David Flynn: I love South Park. I love South Park.

Daniel Newman: I have not had as much time with it, but I think it’s from the same vein, in my experience: cultural references, a little bit of pulling things out of thin air and rejoining them to the world, and of course making light, sarcastic humor of everything, and the entire human condition. Look, cartoons aside, and hammerspace aside in the way you referenced it, let’s talk about the Hammerspace that you’re building. But first and foremost, you founded the company, you’re leading the company.

David Flynn: Yeah.

Daniel Newman: Give us a little bit of the background on the company, your role, and also talk a little bit about the technology developed at Fusion-io, which is another thing that you did.

David Flynn: These things dovetail together. I had the privilege of being at the forefront of the solid state storage revolution, the move from mechanical hard drives in the data center to SSDs, solid state disks. At Fusion-io, the company I founded and led as CEO, we did phenomenally well. We went from first revenue to nearly a quarter-billion a year in less than three years.

Daniel Newman: Wow.

David Flynn: From first investment at a $30 million valuation to a three-plus billion dollar public company on the New York Stock Exchange in those same three years. Which, by the way, coincided with the 2008 to 2010 downturn, so right through that. It was a hockey stick like you’d never seen before, and it’s because there was this pent-up need in the data center for high performance. Flash had become a commodity in consumer electronics but had not yet made its way into the data center to disrupt the more traditional storage infrastructure.

That success is what queued things up for Hammerspace because, number one, what was really needed was a way to have this new high performance media actually be able to deliver data, and to do so at a global scale. After Fusion-io, I set out to build the technology to make it possible to deliver data at unprecedented levels of performance and from a physically distributed infrastructure, like flash inside of servers, or across existing storage systems, or across entire data centers. Those are all three different scale factors for the same challenge. That’s where the idea came from, just being right at the forefront of the challenge of high performance and high agility with data.

Daniel Newman: Yeah. First of all, congratulations on what you accomplished. I was still pretty young during 2008, but I do remember that was a tumultuous period of time. Building anything for growth during that era, when it looked like everything was going to capitulate on a world scale, is pretty remarkable. Of course, the one thing I have always been steadfast about, David, is the deflationary nature of technology. Sometimes when markets become most problematic is actually the time that the best companies turn to technology and find ways to create efficiency.

David Flynn: You are absolutely right. When the status quo can no longer suffice because of other stresses, then people have to look for alternatives that can change the game. Technology adoption, I believe, is accelerated when there are stresses in the system. That was for sure the case with Fusion-io and the introduction of solid state, and I believe we’re seeing the same thing now with the introduction of data orchestration and hyperscale NAS. This is the ability to deliver unstructured data at unprecedented levels of performance, across entire data centers and across all different forms of storage infrastructure. The two main things are the data agility that it offers and the data delivery performance, both of which are in very, very high demand right now with the AI revolution that’s going on.

Daniel Newman: Yeah, we’re going to dive into that in a minute. I will tell you, David, it’s funny, because it’s always risky to make those kinds of proclamations when the market isn’t doing well, but it always becomes evident when it is. I remember I wrote an op-ed at MarketWatch talking about all the companies, this was in ’22 when it felt really bad, everything was going the wrong way, and I kept trying to say look, any company that does AI, that creates automation, these are going to be companies that get a really heavy dose of visibility right now, because companies aren’t going to go away, they’re going to try to find a way to make every dollar go further. Which means the first thing you’re going to do is prune the tree, you’re going to cut back and you’re going to figure out how to get more efficient with what you currently have. Then when the market turns, you accelerate based on that.

You’ll have a laugh, but I went on Squawk Box, the CNBC show, it was maybe July, August of ’22 and NVIDIA was trading at $140. It was 140, 150, it was low. I’m like, “You have no idea what’s coming with AI.” I remember it was Becky or Andrew, and they said to me, “Are you sure? We’re hearing it could go below 100.” I’m like, “Look, I’m just saying that you have no idea how important AI is going to be, and this company is at the foundation of it.” Let’s move on and talk a little bit about the enterprise challenges, though. I’d like to get your take. Enterprises are going through a lot. We’re doing a lot of research right now, we have a lot of data intelligence, a lot of spend, but there’s a lot of uncertainty, too. Boards are telling people like you, “Hey, do some AI.” What are the big challenges that you’re seeing enterprise organizations go through, especially as it pertains to infrastructure to really drive AI in their organizations?

David Flynn: Well, suddenly the large pool of what was considered junk, unstructured data, has become valuable. The need to extract more value from data has never been more important than it is now, and the potential to do so never more available than with AI. Computer systems, in automated fashion, can now make sense of and get utility from unstructured data, data that is not records in a database. Machines can perceive what’s in images, what’s in videos, what’s in audio, and make sense out of these things.

Suddenly, all of this data that was non-usable without a human in the loop is something that is usable. We now need to deliver these massive quantities of archived, historic, cold data to something where a human is not in the loop, basically it’s machine generated and machine consumed. That is really unlocking a drive for data delivery performance and capacity. I’ll also say one of the other key things is the fact that this data can come from many different sources and now needs to be delivered into potentially different AI models.

What used to be a very simple relationship, this application puts its data on this storage, that application puts its data on that storage, now has become a cross-product problem. We need the data from all of these applications and that data has to continue to serve its primary purpose to those applications, but now we need all of that data over here, too. Over here, being not just a different application, but one that is so massive that it requires a data center with a ton of GPUs, and you are forced then to have to move the data.

Daniel Newman: Yeah.

David Flynn: I call it the cross-product problem.

Daniel Newman: Yeah, yeah. It’s actually very interesting. The proliferation of data is super interesting. I want to take you down the road here and talk a little bit about the enterprise data center versus the cloud. We hear so much about the cloud, and gosh, the cloud, it’s a teenager now, it’s been around a while. One thing we do know for sure is the proclamations that cloud would be everything have been a massive overstatement.

David Flynn: That’s because the cloud isn’t what it was meant to be. The cloud is just back to the mainframe era.

Daniel Newman: It’s a massive mainframe, is that what you’re saying?

David Flynn: Yeah. Cloud, in concept, was the promise of run anything, anywhere, without having to give one hoot about whose infrastructure it was. But that’s not what it is. If you choose to put your data in AWS, it’s not even that it’s in AWS, you have to choose which one of AWS’s mainframes. Do you put it in US East, do you put it in US West? Because once you’ve put your data there, all of the computing needs to be done in the same mainframe.

We haven’t really gone to a utility computing model, where you can plug into the wall outlet and get electricity. You can’t just plug in and get computing power. It’s because data is a highly localized resource, and when you move it into the cloud, you’re stuck where all of the processing and everything has to happen in orbit of it. They talk about that as data gravity. Data gravity is what has limited the utility of the cloud and why repatriation is such a big deal. You don’t really have the agility of being able to shop your venue, like leasing burst space versus building on-prem space. That whole thing, the hybrid cloud, the multi-cloud, even multi-region within the cloud, none of those things work. The reason we even have those terms is the fact that data is siloed into these different places. I like how a Gartner analyst puts it: “The data center is no longer the center of data.” For cloud to work, you have to be able to use any data center. It’s not the center of data at that point.

Daniel Newman: Right. Then that becomes a software challenge, that becomes an application challenge. Unless you have really that highest level of abstraction where you’re just consuming software, you always-

David Flynn: That’s what Hammerspace is. Hammerspace is that level of abstraction that abstracts data from the very storage systems storing it, and allows you to access that data across geographically dispersed data centers, from whatever storage infrastructure you have in those data centers. We solve what seems like a paradox: how can you have data local to multiple data centers around the world, with local high performance access, without ever having copied that data? Because in Hammerspace, those aren’t copies, they’re instantiations of the self-same file system. That’s the connection to the name Hammerspace: data is not an emergent property of storage anymore. Data is something that exists independent of the storage, as an orchestrated asset.

Daniel Newman: Yeah, we’re certainly going to see this industry change quite a bit as we see this evolution of LLMs and applications that are going to operate very functionally like LLMs. We’ve got our Summit coming up, and one of the CEOs in your peer group I talk to regularly is Bill McDermott. Bill is talking about how enterprise software as a whole is going to be completely rethought in the next few years. The way it looks, the way it feels, the way it operates, the way it consumes. He’s actually going to be giving that talk at our upcoming Summit.

You’ve got me thinking about the way we tag data, metadata, the way it’s going to be accessible. We’ve got all these data management tools, but if storage is done a certain way, with the right tagging and architecture, it’s like, do you need these tools or does it all become an all-in-one? I think that’s part of what you’re trying to address, yeah?

David Flynn: The biggest thing, and I alluded to it before, the biggest difference I see, because it is going to change everything, is that it removes the performance and capability limiting factor that is having a human in the loop. It used to be that to perceive anything from a video, or from an image, or from any kind of unstructured data, you had to have a person looking at it. That’s no longer the case.

Now you pull the human out of the loop, it’s machine to machine. That’ll go as fast as we can feed it, so we are no longer bandwidth limited by gray ware, brain power. We have just the hardware and software that can iterate in a loop on it. That I view as the single biggest thing: getting things done no longer requires the perceptive capabilities of a human brain, and that means these systems can accelerate much, much faster, working through large masses of data generated at the edge. IoT is another thing that has never really lived up to its potential, just like cloud. AI will help unlock that because now you can make sense of all of this data.

Daniel Newman: Yeah, I think that’s … We’ve had analytics tools, and like I said, AI is four decades old now. It’s certainly been democratized and brought into the consciousness, and architectures are starting to come along. Of course, there’s accelerated compute that can actually process it, like Grace Blackwell and the power it can perform at from a training and inference standpoint. By the way, I always say silicon will eat the world. The chips, whether it’s an ASIC, an XPU/SoC, or a fully programmable GPU, are getting more powerful and more affordable. We’re seeing costs come down. This is all really, really exciting because we’ve been doing AI for a while, it’s just-

David Flynn: Let me share something with you. “Silicon will eat the world,” I love that phrase. I hadn’t heard it in this context. There was a paper written well over a decade ago that was basically talking about when we will hit the point where machines are actually more intelligent than humans from an AI perspective. The thesis was you could simply look at how much energy we as the human race feed to silicon, versus what it takes in caloric intake to feed the human race.

When you talk about silicon eating the world, we are now crossing over the point where we are dedicating more total energy to powering silicon than the caloric intake, the energy that the human race in total consumes on the planet. At that point, you could say you’re getting more output per dollar, and because we are a capitalist society, it’s all about what you’re getting more out of. We’re powering more on the silicon side at that point.

Daniel Newman: Well, I’ll give you a little bit of the background there, since we’re enjoying it. I actually wrote a MarketWatch op-ed at the end of 2019 going into 2020, before the P-word. I don’t say it because sometimes it makes the videos, and it messes up the algorithm.

David Flynn: Yes, fair enough.

Daniel Newman: I said that this would be the year, and I didn’t have any idea of what that event, the acceleration, would be, but I was just prognosticating on the fact that I thought we’d over-rotated to software in saying software would eat the world, because you can’t run software without semiconductors. What I’m saying is this. As software exponentially becomes more critical, you need a lot more compute, which means we’re going to need a lot more silicon, which means it’s actually not software eating the world, it’s silicon eating the world. Then of course-

David Flynn: That’s very good.

Daniel Newman: Software’s getting hooked up.

David Flynn: Where this ties into Hammerspace is that the conventional wisdom in the world of data operations is move the compute to your data. You’ve probably heard that phrase, move the compute to your data. This whole notion of data gravity, if I have my data in US East in AWS, that’s where I do my computing. If I try to stretch it to another site, I’m not going to be able to operate on it very fast, it’s pathetic. The move the compute to the data is conventional wisdom. But in a world where these applications require so much silicon, that silicon is actually tethered to our physical world. You have to power it. You have to rack it and stack it. You can’t move computing when it’s at that scale. If it’s just a program that doesn’t have need for dedicated massive numbers of GPUs, then fine, I can run that wherever.

But the thing is, applications are so big now that they can only run in a few places, and finding available GPUs is a pain in the ass. We have to get to a world where the data is what moves. The irony is that data is the very definition of digital, it’s what ought to be able to be transmitted. What Hammerspace does is solve the software challenge of orchestrating data, so that you get a singular, logical view of data that is physically distributed, and can even be moving, so that you can have continuous, uninterrupted access from multiple points to that data at not just the petabyte scale, but potentially into the exabyte scale.

We can now kind of talk about it because Meta has blogged about the fact that we are collaborating, with Hammerspace supplying the high performance file system and data orchestration for the training of their LLMs, the Llama family of LLMs. Great partnership there, but it’s proving out what I’m talking about here at scale, with clusters that have 24,000 GPUs. Zuckerberg has talked about putting in a million GPUs in the not-too-distant future. They’ve already got apparently 350,000 going in. We’re talking about huge numbers. You have to be able to move the data to where you have built up that silicon.

Daniel Newman: All right. David, I’m going to end here with a bit of a dive into a few of the roles that different things are going to play in the future of AI.

David Flynn: Okay, okay.

Daniel Newman: I’m going to give you part A, part B, part C. Take one, two, three or all of them.

David Flynn: Okay.

Daniel Newman: But open source, hyperscale NAS and data orchestration. Talk about the role each of these plays in the future of AI and AI architecture.

David Flynn: Open source and open standards, open platforms are incredibly important, because that’s how the industry can stand on the shoulders of giants to build the next generation. We can’t presume to solve problems unless we can build upon the things of the past. And especially now, with Linux being so dominant in the data center at scale, even Microsoft has capitulated. So open source, open standards, open platforms: incredibly important.

This is one of the things that really sets Hammerspace apart. My team created the newest standard for networked file systems, for how file data is shared on the network. Our CTO is the Linux kernel maintainer of this portion of the Linux operating system. He’s the guy who, over the past 20 years, has been the maintainer and has written most of it at this point. How the OS, at its lowest level, consumes network shared data, very, very fundamental things. There were massive innovations that needed to happen there that hadn’t for the past 20 years. Other people had built more proprietary or custom file systems for the supercomputing world, but not being open platform, they didn’t fit in.

We have basically introduced, for the first time, this high performance parallel file system in an open standard. That, by the way, is what is termed hyperscale NAS. It is NAS, which is standards-based, off-the-shelf, plug and play, it just works, but one that incorporates the architecture of these exotic supercomputer-class file systems that are very single-purposed. Hyperscale NAS is the embodiment of a standards-based architecture that finally includes the architecture needed to get to the performance levels we need in the AI space. That’s really hyperscale NAS and the standards.

Now, the other beautiful thing is that, when it’s an open architecture and standards-based, you can start tackling data management, because data orchestration allows data to move across any existing storage systems. Because it’s standards-based, anything that speaks traditional file, anything that speaks object, anything that is block storage, those are all … Think of them as bookshelves on which you can pack the library of books, and that becomes interchangeable. Doing this in a standards-based fashion allows you to get hyperscale NAS, that crazy level of performance, and data orchestration, the ability to move and manage data across the world of physical infrastructure.

Daniel Newman: Whew! Take a breath.

David Flynn: Yes. All of these things are key to the AI workflows, because pumping large amounts of unstructured data is the key.

Daniel Newman: Yeah. I think that’s really an important point and a really good way to wrap here, David. We’ve moved so far beyond structured data being the preeminent data we use in our business. It is absolutely going to be an amalgamation of different data and different structures. It needs to be very quickly and easily identifiable, object, block, file, all of the above, across structured and unstructured architectures, across many different domains, different destinations, even different sovereignties, and so much more. Then of course, different types of workloads. There’s so much going on, so much complexity. There’s this abstraction that needs to be created to make it so companies can use it all.

As we know, these LLMs that we’ve actually experienced early on are mostly table stakes of what’s going to be in the future, where it’s your data, your proprietary, inside customer data, not this broadly available internet data.

David Flynn: That’s another one of the main challenges: it needs to be your data, on your infrastructure, that you keep private to your organization. A lot of this data has to be kept private by legislation. You have to keep that data well protected because it has customer information in it.

Daniel Newman: And it’s going to get more sophisticated. It’s still pretty early days.

David Flynn: That’s right. I like to say that what’s going on right now with AI is causing a reckoning in the industry: we have to overcome the old model of managing data indirectly by managing the storage and copying data into it. That is managing data by copy, indirectly. We have to get past that world, and the only way to do that is with data orchestration, to have data simply move of its own volition and be something you can access and use even while it’s in motion across the different shelving, as I put it. This abstraction layer in the data plane has never existed before, and that’s what has really been missing in the industry: the data orchestration, the ability to access data while it’s in motion.

Daniel Newman: David, I want to thank you so much for joining me here. It’s been a lot of fun talking. Clearly, I think we could keep rambling on. I always say, “Oh no, we’re going to talk for 20 minutes,” and here we are, it’s been almost a half hour.

David Flynn: Well, there we go.

Daniel Newman: There’s so much more, and it sounds like you and I are thinking about quite a few of the same exact things.

David Flynn: Same things.

Daniel Newman: Let’s do this again sometime.

David Flynn: Well, it’s been a pleasure.

Daniel Newman: Go ahead.

David Flynn: Absolutely. Absolutely. It’s been a pleasure, Daniel, thank you for having me on the show. We’ll do it again.

Daniel Newman: Yeah, let’s do it again. Everybody out there, go ahead and check out, in the show notes here, we’ve got the hyperscale NAS technology overview from the team at Hammerspace. I’d love for you to hit that subscribe button and join me here, on the Futurum Tech Podcast, for all of our shows. Check out all our shows across the whole Six Five Media network and across The Futurum Group. For this episode, from myself, I’ve got to say goodbye. We’ll see you all later.

Author Information

Daniel is the CEO of The Futurum Group. Living his life at the intersection of people and technology, Daniel works with the world’s largest technology brands exploring Digital Transformation and how it is influencing the enterprise.

From the leading edge of AI to global technology policy, Daniel makes the connections between business, people and tech that are required for companies to benefit most from their technology investments. Daniel is a top 5 globally ranked industry analyst and his ideas are regularly cited or shared in television appearances by CNBC, Bloomberg, Wall Street Journal and hundreds of other sites around the world.

A 7x best-selling author, his most recent book is “Human/Machine.” Daniel is also a Forbes and MarketWatch (Dow Jones) contributor.

An MBA and former graduate adjunct faculty member, Daniel is an Austin, Texas transplant after 40 years in Chicago. His speaking takes him around the world each year as he shares his vision of the role technology will play in our future.
