Analyst(s): Brad Shimmin, Stephen Foskett
Publication Date: March 18, 2025
Forget the storage wars. VAST Data’s new platform update isn’t just about storing data more efficiently. The company is now taking aim at supercharging agentic AI with real-time insights at scale—any scale.
What is Covered in this Article:
- VAST Data has made several enhancements to its VAST Data Platform (now at version 5.3), positioning it as a unique system capable of securely unifying structured and unstructured data, particularly in support of complex, agentic AI solutions that must operate in real time and at scale.
- Product updates to the VAST Data Platform focus on VAST DataBase, where vector storage improvements now allow real-time search and retrieval at nearly any scale across operational, analytical, and vector workloads. These improvements join the Kafka API support introduced in February 2025.
- Further fundamental improvements include better analytics and auditing, stronger integration with the S3 and NFS protocols, security and multitenancy features such as unified row- and column-level permissions, and new data streaming capabilities via an Apache Kafka-compatible event broker.
- The company’s latest platform updates set the stage for future innovations centered on its end-to-end AI pipeline solution, VAST InsightEngine. For example, new Kubernetes-friendly serverless triggers and functions in VAST DataBase enable real-time data provisioning for GPUs within any cloud-native environment.
- VAST added support for block storage in VAST DataStore in February, achieving the goal of “Universal Storage” promised since the company’s founding. Although limited to NVMe over TCP at this time, the block offering boasts greater performance and scalability than rivals, thanks to deep integration with the overall VAST Data Platform.
The News: VAST Data has updated its platform to better support a wide range of analytics and AI workloads with real-time data processing and data retrieval at scale. The updated VAST Data Platform (now version 5.3) includes improvements across several capabilities, including vector storage, analytics, security, and data streaming. This further positions VAST Data as an innovator in the AI-ready data platform market, particularly for applications requiring low-latency access to large datasets.
VAST Data Takes on Agentic AI with a Major Platform Update
Analyst Take: VAST Data intends to disrupt the rapidly growing market for AI-ready data, namely vector data, as used in grounding large language models (LLMs). The company has unveiled several enhancements to its VAST Data Platform, positioning it as the first and only comprehensive data platform built for end-to-end AI pipelines.
VAST Data’s objective is to support very large-scale AI use cases where semantic search is critical both to grounding models in contextual, factual information and to enabling multiple models to work together, as in agentic AI solution architectures. This means handling real-time data ingestion, processing, and retrieval without incurring latency or imposing technical debt (e.g., operational complexity).
As Futurum has learned in surveying nearly 900 enterprise AI practitioners, such complexities can kill any project. In selecting an AI partner, for example, more than half of all practitioners surveyed named expertise their number one requirement (see Figure 1).
Figure 1: Decision Maker: Top AI Vendor Decision Criteria
Building agentic AI solutions, which typically depend upon timely access to accurate retrieval-augmented generation (RAG) data, may appear straightforward. Many tools and supporting vector stores on the market target this single aspect of a broader GenAI solution. However, even with highly automated tooling, selecting the most appropriate embedding model, tokenizing raw information for vectorization, and identifying the correct search algorithm to find and deliver the best information to an LLM is anything but plug-and-play. Doing so at scale, where milliseconds can mean the difference between success and failure, can be especially challenging.
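To make that retrieval step concrete, the minimal sketch below (illustrative only, not a VAST Data API) shows the two decisions just described: an embedding function, stubbed here as a hypothetical placeholder, and a similarity search, written as a brute-force cosine scan that a production system would swap for an approximate nearest neighbor algorithm tuned to its latency targets.

```python
# Illustrative only: a minimal retrieval step for a RAG pipeline.
# embed() is a hypothetical stand-in for whichever embedding model a
# team selects; it is not a VAST Data API.
import numpy as np

EMBED_DIM = 384  # a typical small-model dimensionality, chosen arbitrarily here

def embed(text: str) -> np.ndarray:
    # Hypothetical placeholder: a real pipeline would call an embedding
    # model here, so with this random stub the ranking below is arbitrary.
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    vec = rng.standard_normal(EMBED_DIM)
    return vec / np.linalg.norm(vec)

documents = [
    "Q4 revenue grew 12% year over year.",
    "The new platform release adds vector search and retrieval.",
    "Employee onboarding checklist for remote hires.",
]

# Toy in-memory index of normalized document vectors.
index = np.stack([embed(d) for d in documents])

def retrieve(query: str, k: int = 2) -> list[str]:
    # Cosine similarity reduces to a dot product on normalized vectors.
    # At scale, this brute-force scan is replaced by an approximate
    # nearest neighbor (ANN) algorithm chosen to meet latency targets.
    scores = index @ embed(query)
    top = np.argsort(scores)[::-1][:k]
    return [documents[i] for i in top]

print(retrieve("Which release added vector search?"))
```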
Nowhere is this more true than in building out highly performant agentic AI systems, where frontier-scale reasoning models might orchestrate objectives across several intertwined layers of smaller agent models and supporting tools. The amount of RAG information and structured output that must be retrieved and passed between these numerous entities can quickly push any infrastructure to its limits. And that presumes the supporting RAG implementation has timely access to the most up-to-date vectorized representation of corporate data. As many practitioners have discovered, it’s relatively easy to vectorize information; the hard part is keeping that representation (e.g., its index) running in lock-step with the underlying data, and doing so in a highly secure, governable manner.
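That synchronization problem is easiest to see in code. The sketch below is a minimal, hypothetical illustration (not a VAST Data interface) of applying change events to a vector index so that inserts, updates, and deletes in the source data are reflected in the embeddings without a full rebuild.

```python
# Illustrative only: applying change events so a vector index tracks its
# source data. The dict stands in for a real vector store; embed() is the
# same hypothetical placeholder used in the earlier sketch.
import numpy as np

EMBED_DIM = 384

def embed(text: str) -> np.ndarray:
    # Hypothetical placeholder for an embedding model call.
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    vec = rng.standard_normal(EMBED_DIM)
    return vec / np.linalg.norm(vec)

vector_index: dict[str, np.ndarray] = {}

def on_change_event(event: dict) -> None:
    """Apply one change event (e.g., from a CDC feed or event stream) so the
    vectorized representation never drifts from the source record."""
    key = event["id"]
    if event["op"] == "delete":
        vector_index.pop(key, None)
    else:
        # Insert or update: re-embed only the changed record rather than
        # rebuilding the whole index.
        vector_index[key] = embed(event["text"])

# A toy stream of changes to a single record.
for ev in [
    {"id": "doc-1", "op": "insert", "text": "Initial policy document."},
    {"id": "doc-1", "op": "update", "text": "Revised policy document."},
    {"id": "doc-1", "op": "delete"},
]:
    on_change_event(ev)

print(len(vector_index))  # 0: the index ends in step with the source data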
Assessing the VAST Data Value Proposition
VAST Data recognizes this challenge and believes the best solution is to remove performance roadblocks and break down implementation complexities through a highly scalable, automated, and open end-to-end AI pipeline. Updates to VAST DataStore, VAST DataBase, and VAST DataEngine deliver operational, security, and integration improvements that support more enterprise workloads. These latest additions enable the vendor to accommodate massive amounts of vectorized data (trillions of vectors, according to VAST Data) in a straightforward, automated manner. This approach aligns perfectly with the requirements of agentic AI, as it allows companies to build out massive, complex hardware clusters orchestrated by Kubernetes and fed by popular data ingestion tools like Apache Kafka. It’s important to note that VAST Data’s native streaming service, VAST Event Broker, is not just Kafka API-compatible. With VAST Event Broker, the company fully intends to improve upon Kafka and remove the need to manage it separately, eliminating a great deal of operational overhead while simultaneously delivering unlimited scale and performance.
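Kafka API compatibility, if it holds up in practice, means existing client libraries should work against the VAST Event Broker simply by repointing their bootstrap address. The sketch below assumes exactly that and uses the open-source kafka-python client; the broker address, topic, and consumer group are placeholders, not real VAST configuration.

```python
# Illustrative only: because the VAST Event Broker presents a Kafka API,
# an existing client such as kafka-python should work by repointing its
# bootstrap address. The address, topic, and group below are placeholders,
# not real VAST configuration.
from kafka import KafkaProducer, KafkaConsumer

BOOTSTRAP = "vast-event-broker.example.internal:9092"  # placeholder endpoint

producer = KafkaProducer(bootstrap_servers=BOOTSTRAP)
producer.send("ingest.documents", key=b"doc-1",
              value=b"Revised policy document.")
producer.flush()

consumer = KafkaConsumer(
    "ingest.documents",
    bootstrap_servers=BOOTSTRAP,
    group_id="rag-indexer",            # standard consumer-group semantics
    auto_offset_reset="earliest",
)
for record in consumer:
    # Hand each record off to the embedding/upsert step sketched earlier.
    print(record.key, record.value)
    break
```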
Of course, such openness is critical not just in attracting new customers but also in positioning VAST DataBase as a better alternative to other vector databases on the market, whether stand-alone options from Chroma, Weaviate, and Pinecone or converged relational/columnar databases with built-in vector functionality. On the one hand, VAST Data has a much better story to tell here in scaling beyond the in-memory limitations of many pure-play offerings. On the other hand, VAST Data’s unique approach to building a genuinely AI-ready data platform demands a commitment to that platform, which may isolate a customer’s AI-focused vector data from its broader operational and analytical data landscape.
This is the biggest complaint against stand-alone vector databases and the biggest opportunity for highly converged and vertically integrated hardware/software vendors such as Oracle, which can deliver highly performant vector data and corporate data together through the same Oracle Database and underlying Exadata hardware. As VAST Data moves forward, it will need to bring its database closer to established corporate data stores while also working to expand the scope and role of VAST DataBase itself.
The former problem can be solved through data integration, replication, and virtualization options. The latter will take some time, with VAST Data working to broaden the applicability and appeal of VAST DataBase together with VAST InsightEngine. As we’ve seen across the data management and analytics marketplace, companies are looking for the ability to work with data wherever it resides and to do so with little or no friction. This is the value proposition VAST Data is working toward, primarily through its deepening partnership with NVIDIA to use that company’s NIM (NVIDIA Inference Microservices) to deploy AI models across data centers, clouds, and workstations.
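As an illustration of what the NIM piece of that pipeline looks like from an application’s perspective, the sketch below calls a NIM endpoint over the OpenAI-compatible HTTP interface NIM containers generally expose; the host, port, and model identifier are placeholders and would come from the actual deployment rather than from VAST Data or NVIDIA documentation.

```python
# Illustrative only: calling an NVIDIA NIM endpoint over its
# OpenAI-compatible HTTP interface. The URL and model identifier are
# placeholders standing in for whatever the deployment exposes.
import requests

NIM_URL = "http://nim.example.internal:8000/v1/chat/completions"  # placeholder

payload = {
    "model": "meta/llama-3.1-8b-instruct",  # placeholder model identifier
    "messages": [
        {"role": "user",
         "content": "Summarize the retrieved policy documents."},
    ],
    "max_tokens": 200,
}

resp = requests.post(NIM_URL, json=payload, timeout=30)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```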
What to Watch:
- How quickly will enterprises building on NVIDIA GPU-accelerated platforms adopt the integrated VAST Data/NVIDIA solution? Success will be tied to this integration and workflow automation.
- Can VAST Data deliver value across both large and small enterprise environments?
- How will similar moves from storage competitors (NetApp, Quobyte, Hitachi Vantara, CoreWeave, et al.) stack up? Note that these players also partner with both NVIDIA and hyperscalers to deliver AI data at scale.
- How will Databricks, Snowflake, and other cloud providers adapt their offerings to target unified, low-latency agentic AI?
- How well can VAST InsightEngine adapt to serve broader AI use cases, particularly those with specialized data sources (IoT platforms, real-time sensor streams, etc.)?
- As open source and community support typically determine the viability and adoption of AI-enabling solutions, how will VAST Data look to build such relationships?
See the press release on VAST Data adding new capabilities to deliver real-time agentic value.
Disclosure: The Futurum Group is a research and advisory firm that engages or has engaged in research, analysis, and advisory services with many technology companies, including those mentioned in this article. The author does not hold any equity positions with any company mentioned in this article.
Analysis and opinions expressed herein are specific to the analyst individually and data and other information that might have been provided for validation, not those of The Futurum Group as a whole.
Other insights from The Futurum Group:
Is VAST Data’s New InsightEngine the Solution for Enterprise RAG?
VAST Data 5.2: Advancing AI-Ready Storage with Performance and Resilience
The Battle of the AI Models ChatGPT 4.5 and Claude 3.7 – Six Five Webcast – Infrastructure Matters
CoreWeave files for IPO: Specialized AI Cloud Provider Eyes Next Phase of Growth
Pure Storage Q4 FY 2025: Double-Digit Revenue Growth Amid Strong Subscription Momentum
Nvidia Q4 FY 2025: AI Momentum Strengthens Despite Margin Pressures
Author Information
Brad Shimmin is Vice President and Practice Lead, Data and Analytics at Futurum. He provides strategic direction and market analysis to help organizations maximize their investments in data and analytics. Currently, Brad is focused on helping companies establish an AI-first data strategy.
With over 30 years of experience in enterprise IT and emerging technologies, Brad is a distinguished thought leader specializing in data, analytics, artificial intelligence, and enterprise software development. Consulting with Fortune 100 vendors, Brad specializes in industry thought leadership, worldwide market analysis, client development, and strategic advisory services.
Brad earned his Bachelor of Arts from Utah State University, where he graduated Magna Cum Laude. Brad lives in Longmeadow, MA, with his beautiful wife and far too many LEGO sets.
Stephen is the President of the Tech Field Day business unit for The Futurum Group. An active participant in the world of enterprise information technology, Stephen currently focuses on AI, edge, and cloud, and is a long-time voice in enterprise storage.
Stephen oversees the popular Tech Field Day event series, bringing panels of independent technical content creators together with leading companies in the industry. He also hosts the weekly Utilizing Tech podcast, and contributes to numerous podcasts, webcasts, and industry news reports.
A graduate of Worcester Polytechnic Institute, Stephen studied the impact of technology on society. He frequently travels to Silicon Valley for Field Day events and appears on-stage, in analyst and press panels, and behind the scenes at events around the world.