Cohesity Gaia Uses RAG to Unlock Valuable Secondary Data

The News: Cohesity introduces Gaia, an AI-powered conversational assistant. Additional detail is available in Cohesity’s press release.

Analyst Take: With the hype around generative AI, it is easy to lose sight of the fact that the content generated by AI engines is only as factual and reliable as the data that it is based upon. For this reason, retrieval-augmented generation (RAG) is emerging as an important tool for improving the accuracy and factual consistency of the large language models (LLMs) that are being used to generate responses. From this standpoint, it augments and enhances generative AI with private, corporate data, while retaining the ability to manage the security of, and access to, that information.

Why RAG for Secondary Data?

Specifically, RAG can pull from a variety of knowledge sources, effectively allowing LLMs to “look things up” before answering an inquiry. For example, it can be used to identify files that are relevant to a business inquiry, inspect the content of these documents and files, and then use this information to generate a response. As a result, the subsequent response is more reliable and grounded in real-world knowledge.
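The "look things up" flow described above can be illustrated with a minimal sketch. Everything in it is hypothetical: the sample documents, the function names, and the use of simple keyword overlap as a stand-in for the embedding-based similarity search that production RAG systems typically use. It stops at building the grounded prompt and does not call an LLM.

```python
import re
from collections import Counter

# Hypothetical in-memory knowledge source; a real system would index
# files, emails, or backups rather than three hard-coded strings.
DOCUMENTS = {
    "policy.txt": "Patient records must be retained for seven years.",
    "memo.txt": "The quarterly sales meeting moved to Friday.",
    "audit.txt": "The audit found patient records stored without encryption.",
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def retrieve(query, documents, k=2):
    """Return up to k document names, ranked by query-term occurrences.

    Keyword overlap is an assumption made for brevity; it stands in for
    vector similarity search.
    """
    query_terms = set(tokenize(query))
    scored = []
    for name, text in documents.items():
        counts = Counter(tokenize(text))
        score = sum(counts[term] for term in query_terms)
        if score > 0:
            scored.append((score, name))
    # Highest score first; ties broken alphabetically by name.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:k]]


def build_prompt(query, documents, k=2):
    """Combine the retrieved passages and the question into one prompt."""
    sources = retrieve(query, documents, k)
    context = "\n".join(f"[{name}] {documents[name]}" for name in sources)
    return (
        "Answer using only the sources below.\n"
        f"{context}\n\nQuestion: {query}"
    )
```

Calling `build_prompt("How are patient records stored?", DOCUMENTS)` yields a prompt that carries the audit and policy passages but not the unrelated memo. That is the sense in which the eventual response is grounded in retrieved content rather than in the model's training data alone.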

While structured and unstructured primary data stores have largely been the focus of generative AI to date, they are only the tip of the iceberg. Secondary data contained in emails, files, and virtual machines represents a significantly larger volume of data, and it remains a massive, untapped opportunity from an AI standpoint.

What to Look for in a Solution for RAG-Enhanced AI

When evaluating an architecture for AI that uses RAG, the ability to bring compute resources to the data is important for several reasons: it helps optimize performance, bandwidth, and cost while minimizing latency. This is where a distributed architecture comes in. Spreading compute resources across nodes improves resource utilization and increases fault tolerance for resiliency.

Additionally, data integrity and security need to be managed. This management is especially critical considering the sensitive nature of the data that may need to be utilized to substantiate answers to business inquiries. A decentralized architecture allows for data to be processed locally, avoiding risks inherent in moving data to a centralized server. What’s more, responsible access to data must be enforced with capabilities such as role-based access control (RBAC) and by embracing a zero-trust approach. Finally, supporting API extensibility facilitates the ability to adapt to a range of diverse functionalities and capabilities that may be required to support experimentation and innovation and to integrate with existing infrastructure and workflows.

Introducing Cohesity Gaia

For its part, Cohesity aims to provide what it describes as an “easy button” for adopting RAG-based AI for secondary data with Gaia. With its Data Cloud offering, Cohesity has already been working to offer an end-to-end platform for data protection, security, mobility, access, and insights that is based on its scalable, distributed file system and that uses its data indexing capabilities. Gaia will provide an AI-powered conversational interface that allows users to gain contextual and valuable LLM-based insights from enterprise data, with Cohesity hosting the LLM vector database, initially in the form of Azure OpenAI with others to follow. While initially supporting Microsoft 365 and OneDrive, Cohesity aims to expand to support other data types, including unstructured NAS backups.

To facilitate responsible and secure use of data, an important and growing concern as customers increasingly adopt all forms of AI, the customer determines which data is indexed by Gaia. This approach helps keep data that should not be indexed out of the index, whether for privacy, security, or compliance reasons. Additionally, content filtering, configurable guardrails, and RBAC control the data that specific users can access.
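As a rough illustration of how role-based access control can limit what a retrieval step sees, the sketch below filters indexed items by the asking user's roles before any content reaches the model. It is not Cohesity's implementation or API. The `IndexedItem` type, the role names, and the sample items are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexedItem:
    """A hypothetical indexed item tagged with the roles allowed to read it."""
    name: str
    text: str
    allowed_roles: frozenset


def visible_items(items, user_roles):
    """Keep only items that share at least one role with the user."""
    roles = frozenset(user_roles)
    return [item for item in items if item.allowed_roles & roles]


# Hypothetical sample index; the customer decides what is indexed at all.
INDEX = [
    IndexedItem("treatment-plan.docx", "Plan for patient A.",
                frozenset({"clinician"})),
    IndexedItem("payroll.xlsx", "Salary bands by grade.",
                frozenset({"hr"})),
    IndexedItem("handbook.pdf", "General workplace policies.",
                frozenset({"clinician", "hr"})),
]
```

With this sample index, `visible_items(INDEX, {"hr"})` returns the payroll file and the handbook but not the treatment plan. Applying the filter before retrieval, rather than after generation, means restricted text never enters the prompt in the first place.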

An example use case is streamlining compliance using generative AI. Data is indexed from the Cohesity backups, and responses are produced by generative AI, with RAG grounding the output in corporate data to give the LLM added business context. A multi-turn chat interface allows users to investigate further. For example, a user might ask whether patient names and treatment plans may have been exposed over a certain period of time, and then dig into the specific examples that are uncovered.

Conclusion and Looking Ahead

Cohesity is the first mover when it comes to bringing RAG to secondary data. Gaia allows users to ask business questions and obtain a context-rich response, using RAG and an LLM to search data, identify relevant information, and generate a response. This is a game-changer in terms of unlocking value from the exabytes of secondary and protection data that exist today and are growing exponentially. Key customer outcomes that The Futurum Group anticipates include improving the speed and accuracy of decision making and streamlining compliance and risk management. As table-stakes criteria, the solution incorporates key features required to facilitate safe and responsible access to and use of indexed data. The solution is SaaS-based and includes a free trial option, which will support customer adoption.

Disclosure: The Futurum Group is a research and advisory firm that engages or has engaged in research, analysis, and advisory services with many technology companies, including those mentioned in this article. The author does not hold any equity positions with any company mentioned in this article.

Analysis and opinions expressed herein are specific to the analyst individually and data and other information that might have been provided for validation, not those of The Futurum Group as a whole.

Author Information

Russ brings over 25 years of diverse experience in the IT industry to his role at The Futurum Group. As a partner at Evaluator Group, he built the highly successful lab practice, including IOmark benchmarking.

Prior to Evaluator Group he worked as a Technology Evangelist and Storage Marketing Manager at Sun Microsystems. He was previously a technologist at Solbourne Computers in their test department and later moved to Fujitsu Computer Products. He started his tenure at Fujitsu as an engineer and later transitioned into IT administration and management.

Russ possesses a unique perspective on the industry through his experience as both a product marketer and an IT consumer.

A Colorado native, Russ holds a Bachelor of Science in Applied Math and Computer Science from the University of Colorado, Boulder, as well as a Master of Business Administration in International Business and Information Technology from the University of Colorado, Denver.

Krista Case brings over 15 years of experience providing research and advisory services and creating thought leadership content. Her vantage point spans technology and vendor portfolio developments; customer buying behavior trends; and vendor ecosystems, go-to-market positioning, and business models. Her work has appeared in major publications including eWeek, TechTarget and The Register.
