
Dell Delivers Low-Cost LLM Updates by Retrieval Augmented Generation


The News: The CTO Advisor team worked with Dell Technologies to showcase low-cost large language model (LLM) updates using retrieval augmented generation (RAG) with a single GPU. Ryan Shrout from Signal65 Labs joined a discussion of the impact and importance of RAG in the business use of LLMs.


Analyst Take: RAG is a way to update an LLM's knowledge without retraining it. RAG takes a trained LLM and supplements it with task-specific content, such as up-to-date financial information. This supplemental information can be updated without changing the LLM, potentially in near real time, so that your application can deliver up-to-date AI insights.

Foundational LLMs are created by training on vast amounts of data, typically a complete crawl of the public Internet such as Common Crawl. Training on this much data requires enormous resources and time, leading to multimillion-dollar costs. These models have intimate knowledge of the data they were trained on but total ignorance of anything outside it. A foundation LLM can provide answers on many topics, much like a massive encyclopedia; and like a printed encyclopedia, it does not gain new knowledge over time or go extremely deep on every topic. A customized LLM can be built from a foundation model by training on newer or private data, producing a model with specialized knowledge that gives specialized insights. Fine-tuning an LLM is less intensive than building the foundation model but still requires extensive and expensive computing resources, making fine-tuning a necessarily infrequent activity.

RAG does not change the LLM; instead, it supplements the model with another data source. New or specialist data is vectorized, that is, translated into numerical embeddings that can be searched when a question is posed, and stored in a vector database. Vectorization is a simple and fast process, requiring far fewer resources and far less cost than fine-tuning on the same input data. The vectorized data then supplements the LLM's knowledge at query time. Vectorization can be a frequent, routine process that keeps the vector database current, making updated information available as low-cost LLM updates.
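The vectorize-store-retrieve loop described above can be sketched in a few lines. This is a minimal illustration, not the demonstration's actual pipeline: a bag-of-words counter stands in for a neural embedding model (real RAG systems typically use one such as sentence-transformers) and a plain list stands in for a vector database; the sample passages are hypothetical.

```python
import math
from collections import Counter

def vectorize(text: str) -> Counter:
    """Toy 'embedding': lowercase token counts (a stand-in for a real
    embedding model)."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

store = []  # the "vector database": (vector, original passage) pairs

def ingest(passages):
    """Vectorize each passage and add it to the store. Re-running this on
    updated documents is the cheap, routine update step RAG enables."""
    for p in passages:
        store.append((vectorize(p), p))

def retrieve(question: str, k: int = 1):
    """Return the k stored passages most similar to the question."""
    qv = vectorize(question)
    ranked = sorted(store, key=lambda pair: cosine(qv, pair[0]), reverse=True)
    return [p for _, p in ranked[:k]]

# Hypothetical passages standing in for vectorized technical documentation.
ingest([
    "Update the server BIOS to the minimum supported version before upgrading vSphere.",
    "Replace the cooling fan by releasing the blue latch on the fan cage.",
])
print(retrieve("What BIOS version do I need before a vSphere upgrade?"))
```

Because only the store changes when documents change, refreshing the model's available knowledge is a matter of re-running `ingest`, not retraining anything.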

The CTO Advisor team worked with Dell to demonstrate the practicality of RAG without a vast upgrade to the application's hardware platform. What is a more modest hardware specification than a farm of servers with huge GPUs? How about a single laptop with a GPU? The demonstration used the Llama3 foundation LLM, augmented with Dell technical documentation. The LLM and the whole augmentation process ran on a single laptop, including vectorizing the Dell documentation and querying the LLM for information from both its built-in knowledge and the augmented data. Asking an unmodified foundation model about recent software such as vSphere 8.0 yielded a reply that no such product version existed when the model was trained. Asking the same question of a foundation model augmented with Dell server documentation yielded a helpful answer: in this case, the specific minimum version of Dell BIOS needed for a successful upgrade to vSphere 8.0U1. A few minutes of vectorizing Dell documents on a single GPU advanced the LLM's knowledge of Dell servers by two years.

RAG becomes even more helpful when you need to control the augmentation data because it is proprietary or regulated. Because this data is never trained into the LLM itself, you retain control of it and can prevent data leakage. RAG also allows the LLM to identify the specific augmentation source that provided an answer, in contrast to the usual black-box nature of LLMs, where answers are not easily attributed to sources.
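The query-time augmentation and source attribution described above can be sketched as prompt assembly: retrieved passages are stitched into the prompt with source labels so the model can ground its answer and cite where it came from. The prompt template and file name below are hypothetical illustrations of a common pattern, not the format used in the Dell demonstration.

```python
def build_augmented_prompt(question, retrieved):
    """Assemble an augmented prompt from (source_name, passage) pairs
    returned by a vector search. Numbering the passages lets the LLM
    cite the specific augmentation source behind its answer."""
    context = "\n".join(
        f"[{i}] ({source}) {passage}"
        for i, (source, passage) in enumerate(retrieved, start=1)
    )
    return (
        "Answer using only the numbered context passages below, and cite "
        "the passage numbers you used.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

# Hypothetical retrieval result; in a real pipeline this comes from the
# vector database, and the prompt is then sent to the LLM for inference.
prompt = build_augmented_prompt(
    "What must I check before upgrading to vSphere 8.0U1?",
    [
        ("dell-server-upgrade-guide.pdf",
         "Verify the server BIOS meets the minimum supported version "
         "before upgrading vSphere."),
    ],
)
print(prompt)
```

Because the source label travels with each passage, an answer can point back to the exact document that supplied it, which is what makes RAG answers attributable where plain LLM answers are not.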

The significant value of RAG is its ability to update a foundation model with new information, providing low-cost LLM updates. The Dell servers you already have in your data center may be sufficient for your AI application needs with RAG, and a small number of GPUs can provide up-to-date inference using a foundation model and RAG. Naturally, RAG is not the solution to every LLM problem; there are plenty of use cases for fine-tuning, either on its own or combined with RAG between fine-tuning updates.

Disclosure: The Futurum Group is a research and advisory firm that engages or has engaged in research, analysis, and advisory services with many technology companies, including those mentioned in this article. The author does not hold any equity positions with any company mentioned in this article.

Analysis and opinions expressed herein are specific to the analyst individually and data and other information that might have been provided for validation, not those of The Futurum Group as a whole.

Other Insights from The Futurum Group:

Dell Technology Earnings

Dell Rolls Out Its 2024 Partner Program

The Evolving Role of Developers in the AI Revolution

Author Information

Alastair has made a twenty-year career out of helping people understand complex IT infrastructure and how to build solutions that fulfil business needs. Much of his career has included teaching official training courses for vendors, including HPE, VMware, and AWS. Alastair has written hundreds of analyst articles and papers exploring products and topics around on-premises infrastructure and virtualization and getting the most out of public cloud and hybrid infrastructure. Alastair has also been involved in community-driven, practitioner-led education through the vBrownBag podcast and the vBrownBag TechTalks.

