On this episode of The Six Five – In the Booth, host Paul Nashawaty is joined by Fermyon Technologies’ CEO Matt Butcher at KubeCon Paris 2024 for a conversation on how WebAssembly and SpinKube are shaping the future of cloud computing.
The discussion covers:
- The journey of WebAssembly crossing into real-world production use cases
- The exceptional performance WebAssembly offers and its impact
- Introduction of SpinKube and its successful implementation at ZEISS Group
- Examining the multi-dimensional performance enhancements including cold starts, throughput, and density
- The advent of “true serverless” computing and its significance for developers across skill levels
Learn more at Fermyon Technologies.
Disclaimer: The Six Five Webcast is for information and entertainment purposes only. Over the course of this webcast, we may talk about companies that are publicly traded and we may even reference that fact and their equity share price, but please do not take anything that we say as a recommendation about what you should do with your investment dollars. We are not investment advisors and we ask that you do not treat us as such.
Transcript:
Paul Nashawaty: Hello and welcome to today’s session. My name is Paul Nashawaty and I’m in the Fermyon Technologies booth. I’m joined by Matt. Matt, welcome.
Matt Butcher: Thanks for having me.
Paul Nashawaty: Yeah, it’s great. Great. It’s a really exciting time in the space. Look at all the activity around us in the cloud native space. But let’s talk about Wasm. So what does Fermyon do?
Matt Butcher: So Fermyon is building the next wave of cloud computing, and we’ve focused on WebAssembly as the technology that enables a class of cloud computing we just haven’t been able to crack up until this point. We like to talk about cloud computing as having waves. The first wave of cloud computing was the virtual machine-based wave, and it was when a little bookstore named Amazon suddenly became a big technology company. They brought this idea that you could lease their hardware and run your virtual machines on there, and that was really the first wave. Then the second wave came along with the container ecosystem, where instead of packaging up these big gigantic images of an operating system and shipping it across the wire and running it and dealing with a minutes-long cold start time, you just package up like a little pie-shaped slice of your operating system that’s got just those core files that your server needs, just that part of the file system you need. And then you can package that in a container, ship it somewhere and run it in an environment like Kubernetes. It turns out to be an excellent environment for running long-running processes.
But what we started to see was this real interest in short-lived processes and that event-based programming model, where a request would come in and your software would launch, handle that request, return a response, and shut back down. And so when you’re scaling and 10 requests come in, you’re starting up 10 instances of this processing as fast as possible and shutting them back down again. And in order to be able to run workloads at any scale using that particular model, you’re going to need some runtime that can cold start very fast, that has very small binaries that can very quickly load into memory and execute. We couldn’t achieve that level of compute with virtual machines. We couldn’t do it with containers. And so we started looking around for a technology that had all the attributes you would need to run that kind of environment. WebAssembly very quickly bubbled up on our radar, and that’s why we’re so excited about it and have built all of our tooling on top of this really cool nascent technology.
Paul Nashawaty: It really is an exciting time. And when we talk about the business challenges, we see that application modernization is growing. But this is more than just an academic exercise. Let’s talk about how WebAssembly is really crossing the chasm into real-life use cases.
Matt Butcher: So this morning on the keynote stage, we saw Michelle Dhanani from Fermyon join Kai Walter from ZEISS and Ralph Squillace from Microsoft and talk about this really cool case that ZEISS has, where they get a huge influx of orders, batches of 10,000, and they need to send them through this pipeline of data transformations before sending them off to the production facility. And being able to process those loads can be very expensive in Kubernetes because you have to provision for this max capacity. And what you really want to do instead is have a system that can scale up very rapidly and reach very high density, with a lot of things running on only a modest amount of hardware, and then scale back down as near to instantly as you can. And it was exciting to see that presentation on the stage today because that’s one of those cases that we’re hearing as people come and say, “In manufacturing, we’ve got these near edge, far edge configurations where we have limited hardware capacity. We need to squeeze every last bit out of them. How do we do that?” Well, density and WebAssembly is a really good way to go.
I talked to a large publisher today. They’re saying, “You never know when you publish an article when it’s going to go viral.” Being able to scale up and then scale back down again as close to instantly as possible is critical for them to be able to do their job, is critical to be able to meet the demands of the market. But they don’t want to swallow the cost of having all this Kubernetes stuff running at idle. And the typical container implementations take so long to start up that you have to scale up before the load comes in. And when you can’t predict when the load is going to come in, you end up staying scaled up much more often than you want. And so it was really exciting to see them come and say, “This is the perfect use case for these kinds of cases, where if something goes viral, scale up instantly, scale back down as soon as it’s done, and you don’t have a large sunk cost of keeping those compute instances running all the time.”
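As an editorial illustration of the provisioning trade-off described above, the sketch below compares the instance-seconds consumed when a cluster is sized for peak load all the time with an idealized platform that scales up and down instantly. The load trace, the per-instance capacity, and the function names are hypothetical and are not taken from the ZEISS or publisher deployments; only the burst size of 10,000 echoes the batch size mentioned in the conversation.

```python
from typing import List


def ceil_div(a: int, b: int) -> int:
    """Integer ceiling division for non-negative a and positive b."""
    return -(-a // b)


def instance_seconds_peak_provisioned(load: List[int], per_instance: int) -> int:
    """Enough instances for the peak stay running for the whole trace."""
    instances = ceil_div(max(load), per_instance)
    return instances * len(load)


def instance_seconds_instant_scaling(load: List[int], per_instance: int) -> int:
    """Idealized case: each second runs only the instances that second needs."""
    return sum(ceil_div(requests, per_instance) for requests in load)


if __name__ == "__main__":
    # Hypothetical trace: requests arriving in each one-second interval.
    trace = [0, 0, 10_000, 0, 0, 0]
    capacity = 100  # hypothetical requests one instance can serve per second
    print("peak provisioned:", instance_seconds_peak_provisioned(trace, capacity))
    print("instant scaling: ", instance_seconds_instant_scaling(trace, capacity))
```

Real platforms sit somewhere between these two extremes; the faster instances cold start, the closer a system can get to the second figure.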
Paul Nashawaty: Yeah, so far from academic.
Matt Butcher: Oh yeah.
Paul Nashawaty: It’s far from academic. There’s clearly a great use case here. Performance, that’s really one of the things that we’re seeing. When we look at this application modernization approach and we look at that burst capacity, that boost moving up, you can over-provision and it’s very costly. And so you don’t want to do that with the traditional models. So when I talk about the story of past, present, future, I often talk about Wasm as maybe being a future set of technology that’s really helping with that growth of those applications. What are your thoughts on performance?
Matt Butcher: Yeah, I think the big word we tend to use is efficiency because it very much captures performance, scalability, and cost control. And performance has been a critical piece of WebAssembly for us. When we built the Spin runtime, the core of all of our offerings, Spin has to be able to cold start applications nearly instantly. So you look at a virtual machine, it takes minutes to start up. A container, a dozen seconds, two dozen seconds. Even an Amazon Lambda function takes about 200 to 500 milliseconds. Spin takes one half of a millisecond. So we’re talking orders of magnitude beneath … Faster than what we’ve had before. And that’s what enables you to be able to scale up really rapidly and scale down. It’s what enables that density story because when you can move things around faster, you can free up resources faster and consequently get more value out of it. And of course, the cost story falls right out of that. Because if you need less hardware and you can run things more densely and more efficiently, your cost of cloud is going to go down.
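To put the quoted cold-start figures side by side, here is a small arithmetic sketch. The single values are assumptions picked from the ranges mentioned in the conversation (one minute for "minutes", twelve seconds for "a dozen seconds", the low end of the 200 to 500 millisecond range); they are not measurements.

```python
# Approximate cold-start times in milliseconds, taken from the figures quoted
# in the conversation. Each single value is an assumed point within the range.
COLD_START_MS = {
    "virtual machine": 60_000.0,  # "minutes": assume one minute as a floor
    "container": 12_000.0,        # "a dozen seconds"
    "lambda function": 200.0,     # low end of "200 to 500 milliseconds"
    "spin": 0.5,                  # "one half of a millisecond"
}


def slowdown_versus(baseline: str, target: str = "spin") -> float:
    """How many times longer the baseline takes to cold start than the target."""
    return COLD_START_MS[baseline] / COLD_START_MS[target]


if __name__ == "__main__":
    for name in COLD_START_MS:
        if name != "spin":
            ratio = slowdown_versus(name)
            print(f"{name}: roughly {ratio:,.0f}x the cold-start time of Spin")
```

Even with the most favorable assumed value for each alternative, the ratios span hundreds to hundreds of thousands, which is the "orders of magnitude" gap referred to above.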
Paul Nashawaty: So it’s not just cost, but it’s also performance. And you mentioned ZEISS. So let’s talk about … On a recent brief, we talked about SpinKube and how SpinKube can help with ZEISS. How did that play into it?
Matt Butcher: And that was … So SpinKube is our WebAssembly in Kubernetes offering. And Kubernetes has become one of the most popular, probably the most popular way, to orchestrate container-based workloads. And it’s proven to be a really effective platform for what is effectively distributed computing, where I can push my workload out and have it scheduled across my entire cluster. But again, being based on containers, it’s had that slow startup time issue there. And consequently, Kubernetes is known for having this over-provisioning problem. That’s why, when ZEISS saw this, they said, “Oh, this is really interesting to us,” because we can pack a lot more of these WebAssembly serverless functions into very modest hardware than we would ever be able to achieve with containers.
And consequently, we’re handling, like they showed today, those 10,000 queued requests much faster, or at least as fast as anything they’d had before, but with just a minuscule amount of the compute capacity they had needed before. SpinKube is really that technology that enables that. It provides you the way to install this into your Kubernetes cluster and manage these kinds of applications. And the best part about it is that it all just feels like regular Kubernetes. You can run WebAssembly containers side by side or WebAssembly serverless functions side by side with Docker containers. It all just works. And I think that’s a powerful way to enter into a marketplace that already has established a lot of patterns, a lot of security best practices, a lot of development tooling. And we just fit right into that particular environment with something like SpinKube.
Paul Nashawaty: Well, I wrote down during our briefing, I was talking about how SpinKube and the deployment from a performance perspective is really multidimensional. It really shows in the cold starts. I wrote down 200 nanoseconds, and throughput being 60% better, and density being 50 times better than … And we see it in the booth here. We see the messaging. It’s amazing. It’s amazing how fast. So let’s talk a little about those claims and why those claims matter.
Matt Butcher: And again, the reason they matter is because those are ways to really measure what you’re talking about when you say efficiency. If you can run 50 times the number of apps per node in your Kubernetes cluster, 50 times, you can see right there that gone are the days where we have to run these thousand-node clusters with huge, huge allocations of memory and huge numbers of cores, because we can really start fitting the applications in a nice tightly packed configuration. And again, that sub-millisecond startup time means that we can just churn through the requests as fast as they come in and not have to worry about this latency where the request comes in, there’s a stall, and then something starts up and it returns it. And then this thing hangs around for a little while. Every little bit of time you can shave off of those is CPU that gets reallocated for something else.
I think as we see compute really come into the limelight again now, with both GPU compute for AI workloads and traditional CPU compute, because we’re all trying to control the cost of cloud, it makes a lot of sense when we can say, “Okay, you’re hitting a density of 50 times.” That means you can take modest hardware and accomplish huge, huge workloads coming through there. Or as you get into the very GPU-intensive world of AI development, you can do a lot more with those really costly GPUs, as opposed to the kind of traditional model of locking a GPU for a long period of time, like days, months. You’re locking them for a second here, two seconds there, a couple milliseconds there, and then you’re releasing them. And consequently, once more, that density story bubbles right up to the top.
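The "50 times the number of apps per node" figure can be turned into a rough node count with simple arithmetic. In the sketch below, the baseline of 20 apps per node and the total of 10,000 apps are hypothetical inputs chosen for illustration; only the 50x multiplier comes from the conversation.

```python
import math

DENSITY_MULTIPLIER = 50  # "50 times the number of apps per node", as quoted


def nodes_needed(total_apps: int, apps_per_node: int) -> int:
    """Smallest whole number of nodes that can hold total_apps."""
    return math.ceil(total_apps / apps_per_node)


if __name__ == "__main__":
    total_apps = 10_000          # hypothetical fleet size
    baseline_apps_per_node = 20  # hypothetical container density per node
    before = nodes_needed(total_apps, baseline_apps_per_node)
    after = nodes_needed(total_apps, baseline_apps_per_node * DENSITY_MULTIPLIER)
    print(f"container density: {before} nodes")
    print(f"50x density:       {after} nodes")
```

Under these assumed inputs the cluster shrinks from hundreds of nodes to about ten; actual savings depend on per-app memory and CPU needs, which this sketch does not model.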
Paul Nashawaty: Yeah, absolutely. The approach that we see a lot of times is a truly serverless approach. And so when we look at Wasm, how do we educate the market on what that means?
Matt Butcher: And that’s a fantastic point because the term serverless has gotten a little oversaturated as it got applied to some things that maybe weren’t actually serverless. It became an unclear term. So we wanted to really carefully talk about what do we mean when we say serverless? What is the server that we’re doing without? And to us, the best way to articulate it was just in the plain old terms of how people do software development. So typically a microservice, for example, when you write the code, you write a long-running server process that has a sophisticated set of handler functions in it. And it stays running for hours, days, months. Maybe even years in some cases. And it’s just running and accepting requests, sending out responses over and over again.
As a developer, that means I have to code all of that and maintain all of that code around starting and running and maintaining this code. Every time there’s a new OpenSSL version, I’m patching my code, I’m shipping everything out. Serverless is really that practice of saying, “Okay, we don’t need that server part in the code. Something else can handle delegating requests.” All I am writing, my only code artifact, is an event handler that takes a request, does its processing, returns a response, and shuts down. So we’re going from hundreds or thousands of lines of code, to potentially just a couple of dozen lines of code. Which means easier to maintain, smaller teams can maintain them, easier to find vulnerabilities, easier to edit and fix bugs. And that I think ends up translating very much to an efficiency story we haven’t talked about, which is the developer efficiency story.
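A minimal sketch of the handler-only model described above, written in plain Python rather than against the Spin SDK. The `Request` and `Response` shapes and the `host_dispatch` function are hypothetical stand-ins: in a real serverless platform the host owns the sockets, TLS, and process lifecycle, and the developer ships only something like `handle`.

```python
from typing import Callable, Dict

# Hypothetical, simplified request and response shapes for illustration only.
Request = Dict[str, str]
Response = Dict[str, object]


def handle(request: Request) -> Response:
    """The developer's only code artifact: take a request, return a response."""
    name = request.get("name", "world")
    return {"status": 200, "body": f"Hello, {name}!"}


def host_dispatch(handler: Callable[[Request], Response], request: Request) -> Response:
    """Stand-in for the platform: it receives the event, invokes the handler
    once, and returns the result. Nothing stays running between requests."""
    return handler(request)


if __name__ == "__main__":
    print(host_dispatch(handle, {"name": "KubeCon"}))
    print(host_dispatch(handle, {}))
```

Note what is absent: no listening socket, no TLS setup, no main loop. Those pieces, and their patching, belong to the host in this model.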
And serverless I think really is resonating with developers, particularly cloud native developers, because it provides them a way to be very productive, very quickly, and to be able to have these kinds of nimble small teams that can keep a number of applications running without breaking a sweat and without being on call all the time. Most importantly, I think though, in this serverless world, is there’s always been a little bit of tension between the platform engineering team, the operations team and the development team. The developers build a program, they deploy it into production. At that point, it becomes the production team’s responsibility. If a vulnerability happens and it requires rebuilding the software, that team has to kick it back to the development team. And it becomes like one of those tension processes. “Hey, we need you to fix this today.” “Well, we’re busy today.” That kind of thing.
If you can really constrain those binaries to the serverless paradigm where it’s a small amount of code and you’ve removed a lot of the attack surface in the sense of the SSL libraries, the socket libraries, all those areas where we always hear about vulnerabilities or bugs that end up crashing sites, if we can take all of that out of that particular environment, then that tension between the operations team and the development team begins to dissipate. And the operations team is more empowered to be able to do what they need to be able to do, while the developers are more empowered to do the work they need to do without necessarily that interruption driven process between the two. And I think that’s a big part of what’s attractive about serverless as a development and operational paradigm.
Paul Nashawaty: Yeah, I really like where you were going with that because the fact that the developer, it almost levels the playing field too. Because you give that … Tell me any developer here that wants to say, “I want to work in maintenance mode.” They want to focus on innovation. They want to-
Matt Butcher: “Please, put me on the 2:00 AM shift.”
Paul Nashawaty: Yeah. Of course, why not? But that just gives you that kind of a way to advance your development process, be innovative and really focus on the next generation. But with that said, and as we’re wrapping up our last bullet here, is this an all or nothing approach? Do I have to take an all-Wasm approach, or can I have this kind of a hybrid environment?
Matt Butcher: And that’s what’s really cool about SpinKube. So here’s a technology that says, “Hey, you’ve already invested in Kubernetes. Your platform engineering team has already done all the work to batten down the hatches, to build run books. Everything is well into production.” Nobody’s going to want to say, “Okay, well let’s just put that all on the shelf. Let’s start again with Wasm and let’s rebuild all the code.” So what we wanted to do with SpinKube was build a way where you could run those WebAssembly applications inside of your existing Kubernetes cluster, side by side with your container-based applications, often interacting with each other with no problems whatsoever.
And the security team is still auditing the same surface and they still understand how to check that. All the service mesh still works. Platform engineering is largely the same. The developer experience should be simplified by Spin and these smaller serverless functions instead of the bigger applications. But it ends up just working together with an existing technology, in such a way that platform engineering teams can rest confident that they’ve got a technology that they can operate successfully on day one.
Paul Nashawaty: Nice. That’s really impressive because that way, it allows for rapid acceleration and faster time to value, but it also expands innovation and lets you focus on those areas where you can move towards that future growth. So Matt, as we’re wrapping up, what are some parting words you’d like to leave the audience with? Where can they go to get started?
Matt Butcher: The easiest place to go is fermyon.com/platform. You can read about our platform, get a quick link into some of the videos that will show you some of these performance gains, some of the documentation that’ll show you how it works within Kubernetes. And plenty of material where you can say, “Is this for real? How are these numbers really achieved?” And read through some of our white papers.
Paul Nashawaty: Very good. Matt, it’s always a pleasure talking to you.
Matt Butcher: Likewise.
Paul Nashawaty: And I respect your insights and your perspective. And I also want to thank the audience for attending our session today. And with that, feel free to contact us at thefuturumgroup.com.
Author Information
At The Futurum Group, Paul Nashawaty, Practice Leader and Lead Principal Analyst, specializes in application modernization across build, release and operations. With a wealth of expertise in digital transformation initiatives spanning front-end and back-end systems, he also possesses comprehensive knowledge of the underlying infrastructure ecosystem crucial for supporting modernization endeavors. With over 25 years of experience, Paul has a proven track record in implementing effective go-to-market strategies, including the identification of new market channels, the growth and cultivation of partner ecosystems, and the successful execution of strategic plans resulting in positive business outcomes for his clients.