On this episode of DevOps Dialogues: Insights & Innovations, I am joined by serverless product management team lead at Google, Steren Giannini, for a discussion on Google Cloud AI and the impacts on application modernization.
Our conversation covers:
- The impact of Gen AI applications
- Refactoring all applications to leverage AI models
- Kubernetes to migrate to the Cloud, getting started with Cloud Run
These topics reflect ongoing discussions, challenges, and innovations within the DevOps community.
Watch the video below, and be sure to subscribe to our YouTube channel, so you never miss an episode.
Disclosure: The Futurum Group is a research and advisory firm that engages or has engaged in research, analysis, and advisory services with many technology companies, including those mentioned in this webcast. The author does not hold any equity positions with any company mentioned in this webcast.
Analysis and opinions expressed herein are specific to the analyst individually and data and other information that might have been provided for validation, not those of The Futurum Group as a whole.
Transcript:
Paul Nashawaty: Hello and welcome to this episode of DevOps Dialogues. My name is Paul Nashawaty. I’m the practice lead for the AppDev practice at The Futurum Group. Today I’m with Google Cloud and Steren, would you like to introduce yourself?
Steren Giannini: Hi Paul, thanks for having me. I’m Steren Giannini. I lead the serverless product management team at Google Cloud. So we have a portfolio of products that includes Cloud Run, Cloud Functions, and App Engine, and I’m very glad to be here.
Paul Nashawaty: Well, thank you for being here. It’s a great, great time to be in application modernization. There’s a lot of need, a lot of demand in the market today. We’re here to talk about GenAI applications and what the impact of GenAI applications is. But first, why don’t we start with a little bit of background. Why is this important?
Steren Giannini: Yeah, so we’ve all been seeing how application development has changed a little bit, notably to include some reasoning engine, via the rise of large language models. So those types of applications are, in a sense, a little bit similar to the applications we were used to building before, which, for example, were serving requests like a web API or a website. Except that they make a call to a model, an AI model, a large language model most of the time. So in a sense that’s important because it changes a little bit how we develop applications. Those are a type of application that looks a lot like what we were used to, but there are some specificities that I’m happy to dive into if you want to.
Paul Nashawaty: Yeah, absolutely. And when I think about it, that’s actually a good point. I talk to a lot of organizations and they talk about refactoring as a big part of their business and their business KPIs. When we talk about refactoring, a lot of times organizations think that they have to refactor all of their applications. Is that necessary to take advantage of AI?
Steren Giannini: So I think it is a little bit, because you will need to integrate with this AI model. This is the new part that was probably not existing in your application before in this case. So I should say, if you go from on premises to the cloud and then look at embedding a language model, then you would probably take it in two stages. So the first stage will be modernizing to the cloud. So here at Google Cloud we recommend containerizing your application, which will allow you to then deploy it to any of Google Cloud’s container runtimes. And notably Cloud Run is one of them, probably the easiest one to get started with, really, to get to production in minutes.
So that’s the first step: moving your application to the cloud by containerizing it. And once you’ve done that, if you want to integrate AI into your app to potentially make it able to reason or to remember things, in that case you will need to indeed modernize the code base of this application to make at least an API call to a large language model that will be able to take the user input and format the response back to the end user in case, for example, you are building a chatbot.
Paul Nashawaty: Yeah, that makes sense. And when I think about maturity of applications, there is different levels of moving to the cloud. So organizations may choose to encapsulate a heritage application and put it into a cloud-ready state, right? Move it into the cloud so it’s sitting in the cloud. Can those applications take advantage of AI if they were just say, encapsulated as heritage applications?
Steren Giannini: Yes, most of the time. As I said, with this heritage application, your primary goal will be to containerize it. Once you’ve done that, you can port it anywhere containers are supported. That means major clouds, and even within the cloud you will have different runtimes like we’ve talked about, Google Kubernetes Engine (GKE) or Cloud Run being two of Google Cloud’s ones. And once you’ve done that, then indeed it needs some new development on this old application to make it, in a sense, call an AI model. Now you can also decide that this new development, you’re not going to be bringing it into the old code base, but you’re going to be developing a standalone microservice, for example, that will only take care of this GenAI part.
So that’s often what we see: you kind of do greenfield development just for this new feature, but your main business logic stays in a container and on your old code base, right? So this new feature you might be adding, it could be summarization of documents, for example. And you might want to develop it as a standalone service, either with real-time processing or in batch. So if a user asks to summarize a given document, then you would take that input, you would ask a large language model to summarize the document, and you would return that result. If this is more done by batch, you could potentially use batch jobs to do that in the background.
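As a rough illustration of the standalone, real-time summarization service described here, the following is a minimal sketch written as a plain WSGI application. The `fake_summarize` function is a hypothetical stand-in for a call to a large language model, and `make_app` and `app` are names chosen for this example only; none of this comes from the conversation.

```python
import json
from typing import Callable, Iterable


def fake_summarize(text: str) -> str:
    """Hypothetical stand-in for an LLM call: keeps the first five words."""
    words = text.split()
    return " ".join(words[:5]) + ("..." if len(words) > 5 else "")


def make_app(summarize: Callable[[str], str]):
    """Build a WSGI app with one endpoint: POST a document, get a summary."""

    def app(environ, start_response) -> Iterable[bytes]:
        if environ.get("REQUEST_METHOD") != "POST":
            start_response("405 Method Not Allowed", [("Allow", "POST")])
            return [b""]
        # Read exactly the request body that the client announced.
        length = int(environ.get("CONTENT_LENGTH") or 0)
        document = environ["wsgi.input"].read(length).decode("utf-8")
        body = json.dumps({"summary": summarize(document)}).encode("utf-8")
        start_response(
            "200 OK",
            [
                ("Content-Type", "application/json"),
                ("Content-Length", str(len(body))),
            ],
        )
        return [body]

    return app


# The summarizer is injected so the stub can be swapped for a real model client.
app = make_app(fake_summarize)
```

To try it locally, `wsgiref.simple_server.make_server("", 8080, app).serve_forever()` from the standard library is enough. In a container, the listening port would normally be read from the `PORT` environment variable rather than hard-coded.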
And for example, every day you’re going to be summarizing all of the documents that have been uploaded during the day. So at Google Cloud, with Cloud Run, we offer two solutions for that. So if you want to do request-response types of workloads to build a GenAI application, you would be using Cloud Run services. So these are exposing a unique endpoint. This endpoint can be a private microservice, in case you want to embed it as part of a larger architecture, or this endpoint can directly be exposed to the end users.
So that will be the end user kind of input, something like a document or a prompt. And then this service would potentially call the database to retrieve more information; that’s called retrieval-augmented generation (RAG). And then it would invoke a large language model to format a response and it would stream it back to the user. The streaming part is very important because those language models, they can take time to craft the full response. So if you stream, it’ll appear to the end user that things go faster.
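The flow described here (take the user input, retrieve extra context, call the model, stream the answer back) can be sketched with a Python generator. Everything below is an assumption made for illustration: `DOCS` is a hypothetical in-memory stand-in for a database, `retrieve` is a deliberately naive keyword lookup, and `fake_model_stream` is a stub that simply echoes the retrieved context word by word instead of calling a real model.

```python
from typing import Callable, Dict, Iterator, List

# Hypothetical in-memory "database"; a real service would query a
# document store or vector database instead.
DOCS: Dict[str, str] = {
    "cloud run": "Cloud Run runs containers on demand.",
    "gke": "GKE is Google's managed Kubernetes offering.",
}


def retrieve(question: str, docs: Dict[str, str]) -> List[str]:
    """Naive retrieval: keep documents whose key appears in the question."""
    q = question.lower()
    return [text for key, text in docs.items() if key in q]


def build_prompt(question: str, context: List[str]) -> str:
    """Put the retrieved context ahead of the user's question."""
    joined = "\n".join(context)
    return f"Context:\n{joined}\n\nQuestion: {question}"


def fake_model_stream(prompt: str) -> Iterator[str]:
    """Hypothetical model stub: echoes the first context line word by word."""
    context_line = prompt.split("\n")[1]
    for word in context_line.split():
        yield word + " "


def answer_stream(
    question: str,
    docs: Dict[str, str],
    model_stream: Callable[[str], Iterator[str]],
) -> Iterator[bytes]:
    """Retrieve, prompt, then yield each piece as soon as the model emits it."""
    context = retrieve(question, docs)
    prompt = build_prompt(question, context)
    for piece in model_stream(prompt):
        yield piece.encode("utf-8")
```

Because `answer_stream` is a generator, a web framework that supports streaming responses can send each piece to the client as it arrives rather than waiting for the full answer, which is the effect described above.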
So in the case of Cloud Run services, Cloud Run is, I think, the only serverless platform that offers streaming out of the box. So by streaming, I mean returning the response via HTTP chunked transfer encoding, or even HTTP/2, or even WebSockets; all of that is supported as a service. Now if you want to do the batch jobs, then Google Cloud will offer you Cloud Run jobs, in which case, potentially on a schedule, you will trigger the processing of N documents, and each task of that job will process one document, potentially summarizing it and storing the summary, for example, in a database. So that’s just one use case that we’ve developed, summarizing documents, but there are so many other use cases we can think of for GenAI.
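For the batch path, a job's tasks need a way to split the documents among themselves. The sketch below assumes the task reads its position from the `CLOUD_RUN_TASK_INDEX` and `CLOUD_RUN_TASK_COUNT` environment variables that Cloud Run jobs provide to each task; the `summarize` function and the returned dictionary are hypothetical placeholders for a real model call and a real database write.

```python
import os
from typing import Dict, List, Mapping, Sequence


def documents_for_task(
    documents: Sequence[str], task_index: int, task_count: int
) -> List[str]:
    """Return the documents this task is responsible for (round-robin split)."""
    return [doc for i, doc in enumerate(documents) if i % task_count == task_index]


def summarize(doc: str) -> str:
    """Hypothetical placeholder for an LLM summarization call."""
    return doc[:20]


def run_task(
    documents: Sequence[str], env: Mapping[str, str] = os.environ
) -> Dict[str, str]:
    """Summarize this task's share of the documents."""
    # Defaults allow running the same code locally as a single task.
    index = int(env.get("CLOUD_RUN_TASK_INDEX", "0"))
    count = int(env.get("CLOUD_RUN_TASK_COUNT", "1"))
    # A real job would store each summary in a database instead of returning it.
    return {doc: summarize(doc) for doc in documents_for_task(documents, index, count)}
```

With as many tasks as documents, each task ends up with exactly one document, matching the one-document-per-task setup mentioned above. With fewer tasks, each one takes every Nth document.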
Paul Nashawaty: So there’s a lot there to unpack. I mean, you went through a lot of details, and given that this is the DevOps Dialogues series: some of these organizations are just getting started, other organizations are well on their journey. When you think about GKE and you think about the tech stack that goes along with GKE in that modernization effort, how would you recommend somebody just getting started get going? What’s a frictionless way to get going?
Steren Giannini: I think this is where, if you’re not yet already on Kubernetes and you want to migrate to the Cloud, we’ll recommend you get started with Cloud Run, which is actually the simplest to get started. There’s no cluster to create or manage. You just enable Cloud Run and then in one command, if you use one of the supported languages, which are Java, Python, PHP, Go, then it is likely going to be just deploying to Cloud Run. Now, what’s happening when it deploys to Cloud Run is that the first step is it containerizes the application.
So if it’s a Java Spring Boot application, for example, the system, it’s called Google Cloud Buildpacks, it’s actually a standard to build containers. The system will recognize, oh, that’s a Spring Boot application, I know how to build that, and it’ll build it into a container. That’s step number one. Step number two is that it’ll push that container to a container registry. And step number three, it’ll deploy that container to Cloud Run, and Cloud Run will be on demand, not having to provision any cluster and really taking care of everything for you in terms of default values. The only thing you have to provide is your code base. Now the beauty of that is that this container is portable now. You have a portable container that you can go and deploy to GKE later if you want to adopt Kubernetes, and Google’s managed Kubernetes offering is GKE.
Paul Nashawaty: Yeah, that makes a lot of sense. It sounds like you instituted or Google instituted a big green button to push the big green button. One of the things that I hear in our research or I see in our research quite a bit is portability. You keep using that word, right? Actually, we see in our research that 20% of respondents indicate that application portability is critical to their business. 67% say it’s very important to them. So there’s definitely a need to be portable. Now this becomes not just from a burst capacity or a compliance or regulations or workload capacity, but also from Core to Edge to Cloud and back and forth, right? Do you see a lot of customers or clients that are looking for that kind of portability?
Steren Giannini: Absolutely. I mean, today container images have become the industry standard for packaging software. And when we designed Cloud Run in the early days, we took the learnings from our previous products like Cloud Functions or even App Engine. Those products were proprietary in terms of APIs and in terms of deployment artifacts. And we heard from customers that when they adopted those products, migrating to a different runtime was hard. So when we built Cloud Run, we took the early decision to standardize on the container image artifact, which is the exact same image you can run on the local machine, on GKE, on Cloud Run, or even on other clouds. And that’s the beauty of standardization that guarantees the portability of the software artifact.
Now there is another layer of portability, which is the API portability. And this is one where we really went above and beyond with Cloud Run, which is that today the Kubernetes APIs have become a little bit of a standard too. All of the Kubernetes resources have a very similar structure. The APIs work exactly the same. And so when we designed Cloud Run, we actually designed the Cloud Run API to look a lot like Kubernetes in order to enable portability. So you can literally copy paste your YAML of Cloud Run into a Kubernetes cluster, you change service to deployment, and it would deploy, because the container is a standard artifact and the API actually has been designed for portability. So yes, portability matters a lot nowadays, and I should say portability of the compute, so the software, has probably become the easiest now, and the hardest is now probably portability of the data. Your data is probably tied to a database type and that becomes harder to migrate to a different database type, for example.
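To make the API similarity concrete, here is a small sketch that turns a Cloud Run style service manifest, held as a Python dictionary, into a Kubernetes Deployment manifest. It is an illustration under stated assumptions rather than a migration tool: the function name `service_to_deployment`, the `app` label, and the example image are hypothetical, and only the container list is carried over. A Deployment also expects a label selector and named containers, so the sketch adds those.

```python
import copy
from typing import Any, Dict


def service_to_deployment(service: Dict[str, Any], replicas: int = 1) -> Dict[str, Any]:
    """Reuse the containers of a Cloud Run style Service in a Deployment."""
    name = service["metadata"]["name"]
    # Copy so the input manifest is left untouched.
    containers = copy.deepcopy(service["spec"]["template"]["spec"]["containers"])
    for i, container in enumerate(containers):
        # Kubernetes requires every container to have a name.
        container.setdefault("name", f"{name}-{i}")
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {
            "replicas": replicas,
            # The selector must match the labels on the pod template.
            "selector": {"matchLabels": {"app": name}},
            "template": {
                "metadata": {"labels": {"app": name}},
                "spec": {"containers": containers},
            },
        },
    }


# Hypothetical manifest; the image name is made up for this example.
cloud_run_service = {
    "apiVersion": "serving.knative.dev/v1",
    "kind": "Service",
    "metadata": {"name": "summarizer"},
    "spec": {
        "template": {
            "spec": {"containers": [{"image": "example.com/summarizer:latest"}]}
        }
    },
}

deployment = service_to_deployment(cloud_run_service)
```

The container section passes through essentially unchanged, which is the point being made: the same image and a very similar resource shape work in both places.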
Paul Nashawaty: I’m hearing a lot of organizations shift their focus from SLAs to more of an SLO kind of focus, right, for service level objectives. So they want to keep the application up and running and make sure it’s running. They might have the infrastructure running, but if the infrastructure’s running and the application’s down, it doesn’t really matter. When I think about Google Cloud and I think about Cloud Run, it seems like there’s an accelerated path to get organizations to the Cloud in this space. As we come to the end of our session here, what would you recommend or how would you advise the audience on where they should go to learn more about this?
Steren Giannini: Yeah, definitely, if you are onboarding onto that journey, go check out Cloud Run. The URL is quite simple, http://cloud.run, and of course check out the video that we will be recording later today that will introduce Google Cloud, introduce Cloud Run, and introduce how to build GenAI applications on Cloud Run, which is in a sense just a new type of application, like a web application, but with some specificities. So really I think that’s definitely the easiest to get started. Customers have told us they went from zero to production in just a matter of weeks. Most of the time, doing a Cloud Run demo like I will do later today, in one minute you go from nothing to a deployed, auto-scaled web service packaged as a container image. So it’s definitely the simplest to get started with Google Cloud. Check out Cloud Run.
Paul Nashawaty: Thank you. That makes a lot of sense. And I also want to echo the fact that we’re seeing in our research that just nine months ago, only 18% of respondents indicated that they were running AI in their production applications, in their production workloads. Moving to today, 54% are now using GenAI and AI in their production workloads. So there’s definitely a move in this direction. But I want to thank you for your time, your insights, and your perspective on what’s happening here and what’s happening with Google Cloud and Cloud Run. It’s really exciting. I want to thank the audience for watching our session today as well. And for more information, please go to Google.com or Thefuturumgroup.com. Thank you.
Other insights from The Futurum Group:
Google I/O 2024 – The Futurum Group
Why AI Innovations from Google I/O 2024 Matter – The Futurum Group
Google Q1/24 Earnings – The Futurum Group
Author Information
At The Futurum Group, Paul Nashawaty, Practice Leader and Lead Principal Analyst, specializes in application modernization across build, release and operations. With a wealth of expertise in digital transformation initiatives spanning front-end and back-end systems, he also possesses comprehensive knowledge of the underlying infrastructure ecosystem crucial for supporting modernization endeavors. With over 25 years of experience, Paul has a proven track record in implementing effective go-to-market strategies, including the identification of new market channels, the growth and cultivation of partner ecosystems, and the successful execution of strategic plans resulting in positive business outcomes for his clients.