How do you speed up GenAI? Find out on this episode of Six Five On the Road at AWS re:Invent, where host Keith Townsend sits down with Elastic CPO Ken Exner for a conversation on how Elastic is at the forefront of accelerating generative AI (GenAI) innovation.
Fast track this ⤵️
- Insights into the adoption of generative AI applications among Elastic’s customer base and how Elastic facilitates the acceleration of GenAI initiatives.
- Future directions for Elastic’s product portfolio with the integration of AI and machine learning.
- Developer feedback on Elasticsearch’s usage in GenAI projects and its prominence as the top vector database.
- The launch of Elastic Cloud Serverless and Elastic’s commitment to balancing usability with flexibility for both developers and end-users.
- A reflection on Elastic’s product developments in the past year and anticipations for innovations in 2025.
Learn more at Elastic, The Search AI Company.
Watch the video below at Six Five Media and be sure to subscribe to our YouTube channel, so you never miss an episode.
Or listen to the audio here:
Disclaimer: Six Five On the Road at AWS re:Invent is for information and entertainment purposes only. Over the course of this webcast, we may talk about companies that are publicly traded and we may even reference that fact and their equity share price, but please do not take anything that we say as a recommendation about what you should do with your investment dollars. We are not investment advisors, and we ask that you do not treat us as such.
Transcript:
Keith Townsend: So, if you’ve been following coverage from Six Five On The Road at AWS re:Invent, you know we’ve been talking GenAI. But not GenAI in the unusable sense. If 2023 was the introduction to GenAI, 2024 is GenAI made real. And we’re here with the Chief Product Officer of Elastic, Ken Exner. Ken, welcome to the show.
Ken Exner: Hey, Keith. Good to be here.
Keith Townsend: You know what? Search is a critical part of not just AI but the business process. Talk to me about how GenAI has changed the approach to enterprise search.
Ken Exner: I think it’s kind of an evolution. If you think about Elasticsearch, Elastic is the company behind Elasticsearch, the open source search engine. People historically knew us for text-based search, or lexical search. But along the way, around 2019, we became a vector database, because we needed to store dense vectors and allow people to query them. People wanted to do things like image search, where you basically vectorize an image and do nearest-neighbor searches on images. They wanted to do things like semantic search and natural language question answering. And in order to do that, we had to essentially become a vector database.
So when the generative AI boom happened a couple of years ago, we were ready for it. We were already a vector database, and customers wanted to start doing things like conversational search. So you went from lexical search, to semantic search, to conversational search. And they wanted to do things like RAG, Retrieval Augmented Generation: they wanted to pass context to an LLM and ground those LLMs in their data. So, we were looking at this as an evolution of the things we had already been doing. And for customers that were already using us, it was great, because they could very easily go from text-based search, to semantic search, to conversational search. It was a very easy evolution for them.
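To make the nearest-neighbor search Ken describes concrete, here is a minimal sketch of a kNN query using the Elasticsearch Python client. The index name, field name, and embed() helper are illustrative placeholders, not anything from the conversation:

```python
# Minimal sketch of a kNN vector query of the kind Ken describes.
# Index name, field name, and embed() are hypothetical placeholders.
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

query_vector = embed("red running shoes")  # hypothetical image/text embedding

resp = es.search(
    index="products",
    knn={
        "field": "image_embedding",    # a dense_vector field
        "query_vector": query_vector,
        "k": 10,                       # return the 10 nearest neighbors
        "num_candidates": 100,         # candidates examined per shard
    },
)
for hit in resp["hits"]["hits"]:
    print(hit["_score"], hit["_source"].get("name"))
```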
Keith Townsend: So, talk to me a little bit about that data pipeline, and whether that has changed. A lot of the focus around preparing your data for AI is RAG, Retrieval Augmented Generation, this ability to vectorize your data and make it searchable so you can do the math. How are customers advantaged by already using Elasticsearch in that data pipeline?
Ken Exner: People want to use RAG for a couple of reasons. One is they want to reduce hallucinations, and the way you reduce hallucinations is by making sure the LLM is grounded on a particular corpus of data. The other thing they want to do is make sure these LLMs understand their private data without being trained on it. If you are a business with a bunch of private data, public, foundational LLMs aren’t trained on it. So how do you get an LLM to answer something based on private data? You use RAG, Retrieval Augmented Generation. Customers have been benefiting from using us because we offer that out of the box. It’s a very simple thing for our customers to adopt if they’re already using us. But we’ve also been doing this for so long, and we built this into our core search engine, so it benefits from everything else we’ve been doing.
So, all the enterprise capabilities that we’ve always had apply to vector search too. If you want durability, we have replication that happens automatically. If you want RBAC, or ABAC, or audit logging, if your InfoSec team has certain requirements for enterprise-grade security, we already have that, because we’ve been doing this for so long as an enterprise-grade search engine. So customers really love the fact that they can use us for these enterprise scenarios, and they can use us at scale. But because we are a search company, we’re really good at relevance. We understand that it’s not just about being a vector database. A vector database is fairly simple: it’s basically storing dense vectors and then running a kNN or an ANN query on them. That’s simple. What’s hard is understanding the entire process around that.
If you want to ingest data from various data sources, you need connectors to those data sources; we have 250 connectors to different data sources. You’re then going to want to chunk up the data and have a chunking strategy. You’re going to want to run inference on the data. All of these things go well beyond the vector database itself, and we’ve been doing all of it, the entire workflow. So our customers get not only the best vector database, the most downloaded, enterprise-grade vector database, but the entire surrounding workflow around it.
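As an illustration of the workflow Ken lays out, here is a hedged sketch of a basic RAG loop over Elasticsearch: retrieve the most relevant chunks, then ground an LLM on them. The index and field names, embed(), and llm() are hypothetical placeholders:

```python
# Illustrative RAG flow: retrieve relevant chunks from Elasticsearch,
# then ground an LLM on them. Index/field names, embed(), and llm()
# are hypothetical placeholders.
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

def retrieve(question: str, k: int = 5) -> list[str]:
    """Fetch the k most relevant document chunks for the question."""
    resp = es.search(
        index="knowledge-base",
        knn={
            "field": "chunk_embedding",
            "query_vector": embed(question),  # hypothetical embedding call
            "k": k,
            "num_candidates": 50,
        },
    )
    return [hit["_source"]["chunk_text"] for hit in resp["hits"]["hits"]]

def answer(question: str) -> str:
    # Ground the LLM on retrieved private data instead of retraining it.
    context = "\n\n".join(retrieve(question))
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
    return llm(prompt)  # hypothetical LLM call
```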
Keith Townsend: So, let’s talk about how this has impacted product. There’s obviously opportunity. You’re both an engineer, and you own the actual product. How has the product evolved over the past couple of years?
Ken Exner: Well, we started off with these foundational primitives, making sure that we could get these things right: be a really good vector database, be a really good inference service, be a really good re-ranker, the different parts of this workflow. But one of the things we’ve started to do is build abstractions on top of this that make it really easy for our customers to adopt. I like to think of it as: build the right foundational primitives, get those right, and then start building abstractions to make things easier. For example, we introduced something called semantic_text this year. You can flip the field type to semantic_text, and it automatically turns on semantic search for you.
So you can go from lexical search to semantic search, doing natural language question answering, by simply changing a type. It automatically figures out the chunking strategy for you, and automatically figures out how you’re going to run inference on your data and which models to use. All of that is taken care of for you. But you can still drop down to the underlying primitives if you want to. So, I’m excited about this idea of building all the right foundational primitives, and then layering abstractions on top to make things easier.
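For readers who want to see what that abstraction looks like, here is a sketch of the semantic_text field type in action, assuming a recent Elasticsearch version where a default inference endpoint is available; the index and field names are placeholders:

```python
# Sketch of the semantic_text abstraction: one mapping change turns on
# semantic search, with chunking and model selection handled for you.
# Assumes a recent Elasticsearch with a default inference endpoint;
# index and field names are placeholders.
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

es.indices.create(
    index="support-articles",
    mappings={"properties": {"body": {"type": "semantic_text"}}},
)

es.index(
    index="support-articles",
    document={"body": "To reset a password, open Settings and choose Security..."},
)

# Natural-language question answering via a semantic query.
resp = es.search(
    index="support-articles",
    query={"semantic": {"field": "body", "query": "how do I reset my password?"}},
)
print(resp["hits"]["hits"][0]["_score"])
```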
Keith Townsend: So, we’re at AWS re:Invent. AWS likes to talk about the builder’s journey. Builder, another word for developer. And Elasticsearch is easily the most downloaded vector database, and probably the one behind the most projects. What has been the conversation with developers? What has been the feedback from that specific persona?
Ken Exner: Part of this is what I was alluding to: they love knowing that we have all these powerful capabilities, that we can help them with everything they need for doing RAG, but they want it to be easier. They want it to be easier so that they don’t have to think about how to integrate these various things. So we’ve been trying to provide abstractions that let them integrate and get a really simple experience, but still have the power of the underlying platform. That’s super important for them. But that idea of layering abstractions over primitives is not just about us; it’s about our ecosystem. It’s about how we work with partners and integrate partners. So, when we develop an API for doing re-ranking, it’s not just about re-ranking with our models, it’s about working with our partners and integrating their models too.
So, a lot of what we’ve been working on is not just integrating our pieces, but integrating the entire ecosystem, and in both directions. We actually recently launched a generative AI partner ecosystem. And people were asking, “Well, what are you going to do?” And I said, “It’s not about what we’re going to do. We’re celebrating what we’ve already done, which is working with these partners to provide integrations in both directions.” If you look at Google and Vertex AI, for example, they integrate us as a vector database, so within that context you can use us. OpenAI Studio integrates us as a vector database. And we do the same thing in reverse: we integrate with LangChain and LlamaIndex, and we integrate Cohere and Mistral into our offering. So we’ve built this community that is committed to providing integrations that make it easier for developers to use all of this together.
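As one example of those integrations, here is a short sketch of using Elasticsearch as the vector store behind LangChain, via the langchain-elasticsearch package; the URL, index name, and choice of OpenAI embeddings are illustrative assumptions:

```python
# Sketch of Elasticsearch as the vector store behind a LangChain app,
# via the langchain-elasticsearch integration. URL, index name, and
# the embedding model are illustrative assumptions.
from langchain_elasticsearch import ElasticsearchStore
from langchain_openai import OpenAIEmbeddings

store = ElasticsearchStore(
    es_url="http://localhost:9200",
    index_name="langchain-demo",
    embedding=OpenAIEmbeddings(),  # requires OPENAI_API_KEY in the environment
)

store.add_texts(["Elastic is the company behind Elasticsearch."])

docs = store.similarity_search("Who makes Elasticsearch?", k=1)
print(docs[0].page_content)
```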
Keith Townsend: So, let’s talk about another level of ease of use. Computer science continues to grow; the abstraction goes up another level. We could not leave an AWS re:Invent event without talking about serverless. You folks just recently announced a serverless cloud. In my mind, this helps simplify the lower end of that. If I want to get down into these lower primitives and change knobs, great, but I have more work to do. I have to select models, I have to do traceability and understand what my models are doing. I have a whole different set of child problems around observability. I don’t want to worry about that stuff. Talk to me about your serverless announcement.
Ken Exner: So, we announced the General Availability, GA, of our serverless offering. It’s available on AWS today, and we’ll be adding GCP and Azure early next year. It’s a complete re-architecture of Elasticsearch: we’ve taken Elasticsearch, which is essentially a database, and made it a stateless, fully managed database. It was a pretty significant effort. Today our customers can have the self-managed offering, where you download it and run it yourself. There’s the Elastic Cloud hosted offering, where we provision instances, install the software, and keep it patched, but you are responsible for cluster health, sharding, scaling, things like that. So it’s a shared responsibility model.
But serverless is a completely managed offering. It’s fully managed. You can think of it as the comparison between RDS and DynamoDB: it scales automatically, it shards automatically, and cluster health is taken care of for our customers. It’s versionless; it doesn’t have a concept of a version. It’s kind of like a SaaS offering. Very few companies have been able to do this: offer a database that is self-managed, hosted, or SaaS-like. And we’re very proud of it, because it delivers a really great experience to our customers. Because it’s built on a stateless architecture based on S3, you can think of it as a data lake-style architecture, but without the limitations of a data lake. You have blazing fast queries on top of a data lake: you get the cost benefits of being on S3, you get the durability benefits of S3, but you have this really fast query. And for use cases like vector search, we are now essentially an infinitely scalable vector database built on top of S3, which is amazing.
Keith Townsend: So, you beat me to the punch a little bit talking about how your solution is built on S3 itself. Big announcements came out of re:Invent: S3 Tables, game-changing for a lot of folks. You mentioned DynamoDB; I talked to a couple of customers who were like, “You know what? I don’t really need DynamoDB anymore, because the tables have everything that I need from a metadata perspective.” How does an announcement like S3 Tables impact the serverless product you folks are delivering?
Ken Exner: We’ll have to figure out if there’s something we can use there, but we use the raw object storage within our offering, and we’re able to build our indexes on those object stores. I think the Tables offering is more about the things people have been doing in traditional databases, like relational databases: can you do those kinds of workloads on S3? I think it presents more of a question about the future of the relational database, or the future of those types of workloads. Can you do those on pure object storage? It’s interesting. I’m curious to see where things go.
Keith Townsend: I’m curious too. You know what? Off camera, maybe we can have a detailed conversation about whether this is the end of databases. No, there are limits to what you can do with the metadata in tables, and you’re always going to need these secondary products alongside it. It’s why we’re interviewing you here at re:Invent. With that said, give us the highlight reel of 2024 and a preview for 2025.
Ken Exner: So you’re asking me for one of my favorite things from 2024?
Keith Townsend: Your favorite things from a product perspective, because 2024 has been a rough year. I’m a sports fan and I’m from Chicago, so I’m emotional right now. Let’s put some guardrails around the conversation. Which product features from 2024 are you most proud of? And then looking to 2025, what are you excited about?
Ken Exner: Well, serverless is a big deal for us. Being able to offer a fully managed, stateless version of Elasticsearch is amazing. But another thing I’m proud of is some of the performance improvements we’ve driven into our vector database. We launched something called BBQ, Better Binary Quantization.
Keith Townsend: I love that name.
Ken Exner: Love the name. BBQ. It’s so awesome. You can think of quantization as compression for vectors. What it does is deliver a 95% memory reduction, so it’s a huge savings. But it does this without compromising accuracy, and that’s one of the challenges of compression, or quantization: typically you’re trading off accuracy for performance, but BBQ does it without that compromise. So I love that we can deliver the memory reduction, and ultimately the CPU and storage reduction, without compromising accuracy. That to me is amazing.
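The back-of-the-envelope arithmetic behind that memory claim: binary quantization stores one bit per vector component instead of a 32-bit float. A sketch, using an illustrative 1,024-dimension embedding:

```python
# Back-of-the-envelope arithmetic for BBQ-style binary quantization:
# one bit per component instead of a 32-bit float per component.
dims = 1024                    # illustrative embedding dimensionality
float32_bytes = dims * 4       # 4 bytes per float32 component
binary_bytes = dims // 8       # 1 bit per component, packed into bytes

reduction = 1 - binary_bytes / float32_bytes
print(f"{float32_bytes} B -> {binary_bytes} B ({reduction:.1%} smaller)")
# 4096 B -> 128 B (96.9% smaller); per-vector correction metadata brings
# the practical saving into the ~95% range Ken quotes.
```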
Other things I’m proud of: we also provide solutions for security and observability, and I’m really proud of some of the ways we’ve used generative AI in those solutions. We don’t just power generative AI for developers; we use it in our own solutions. We have a security analytics platform, for example, that uses some of these foundational primitives, and we use generative AI to automate some of the workflows for security analysts. One example of this: we launched something called Attack Discovery, probably one of my favorite things from this past year. If you are a security analyst, you are dealing with tons of alerts that you have to look through every day. These alerts are generated by detection rules that say, “This might be something worth looking at,” and you get dozens or hundreds of them. What we’ve done is take all these alerts and all the context about your environment, pass that into the LLM, and automatically filter out the false positives: “This is a false positive,” “This is a false positive,” “These are the ones that you should care about.” Then we map out the attack chain. We show the attack path. And when we show this to analysts, they get emotional. They go, “What? That’s eight hours of my day that you’ve just automatically done for me.”
Keith Townsend: Which is an amazing innovation, right? People keep asking, “Is AI taking jobs?” No, we’re getting more and more work.
Ken Exner: It’s productivity.
Keith Townsend: And it’s productivity, right?
Ken Exner: It is making every practitioner an expert practitioner; it is helping everyone level up and be more productive. If you look at the security space or the DevOps space, it’s hard to find people with a lot of skills who understand those domains. So you can essentially use an LLM to augment that and give you the skills you need to be an expert practitioner.
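To make the Attack Discovery workflow Ken described a little more concrete, here is a purely conceptual sketch, not Elastic’s implementation: alerts plus environment context go to an LLM, which flags false positives and maps an attack chain. The llm() call and the alert fields are hypothetical placeholders:

```python
# Purely conceptual sketch of an Attack Discovery-style triage step,
# NOT Elastic's implementation. llm() and the alert fields are
# hypothetical placeholders.
import json

def triage(alerts: list[dict], environment: dict) -> dict:
    """Ask an LLM to flag false positives and map the attack chain."""
    prompt = (
        "Given these security alerts and environment context, label each "
        "alert as a false positive or real, and map the likely attack "
        "chain across the real ones. Respond as JSON.\n"
        f"Alerts: {json.dumps(alerts)}\n"
        f"Environment: {json.dumps(environment)}"
    )
    return json.loads(llm(prompt))  # hypothetical LLM call

result = triage(
    alerts=[{"rule": "suspicious login", "host": "web-01"}],
    environment={"hosts": ["web-01", "db-01"]},
)
print(result.get("attack_chain"))
```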
Keith Townsend: All right, last 30 seconds. What are you excited about in 2025?
Ken Exner: I am excited about how we’re going to continue providing abstractions that make things easier. I hinted at some of these things: how we made semantic search really easy for our customers, and how we’re starting to make conversational search easy, so you can go from text-based search, to semantic search, to conversational search. You’re going to see more there. We’re going to make it easier and easier for people to start adopting RAG or conversational search through abstractions that really simplify things, but still give you the power of the underlying primitives. I’m also excited about the things we’re doing to automate the experience of DevOps practitioners and security analysts. You’re going to see much more there. We believe that the observability and security spaces are going to be fundamentally transformed by generative AI, and we’re running straight into it. We’re going to try to automate everything we can to make people super productive in those spaces.
Keith Townsend: Ken, I’ve really enjoyed this conversation. Getting to go from the high-level business value of Elasticsearch down to talking about LangChain and RAG is an opportunity I don’t get too often; it shows your engineering and product chops. If you’re unsure what some of these terms mean, we have a wide range of coverage talking about the business value that customers are getting here at AWS re:Invent, along with partners like Elastic, and the technical details around these innovations. Stay tuned for more coverage from AWS re:Invent. I’m your host, Keith Townsend, for Six Five On The Road.
Author Information
Keith Townsend is a technology management consultant with more than 20 years of related experience in designing, implementing, and managing data center technologies. His areas of expertise include virtualization, networking, and storage solutions for Fortune 500 organizations. He holds a BA in computing and an MS in information technology from DePaul University. He is the President of the CTO Advisor, part of The Futurum Group.