Hammerspace Shows Storage Acceleration for AI Training at AI Field Day

Introduction

Training large language models (LLMs) is big business, requiring enormous volumes of data to be fed to many servers. In this respect it resembles conventional high-performance computing (HPC), where considerable effort goes into matching data delivery to compute capability. That tuning ensures no bottleneck leaves expensive compute idle and, in turn, lengthens the time to complete training. Training a new LLM occupies thousands of GPU-equipped servers for weeks or months; Sam Altman has stated that training GPT-4 cost over $100 million. It is worth making sure that money is well spent.

Against this background, Hammerspace presented Hyperscale NAS, which accelerates underperforming scale-out NAS for AI training, at AI Field Day 4. Hammerspace has some impressive NAS virtualization technologies, and its separation of metadata access from data access proved decisive for customers running massive AI training projects. These customers found that their existing scale-out NAS could not deliver the file access rate their training required. The storage itself was not too slow; it simply took too long to locate the correct file on the storage.

Hammerspace Hyperscale NAS gathers the file metadata into a dedicated high-availability server cluster running its Anvil server and leverages standard NFSv4 features. Hyperscale NAS performs the metadata operation of identifying where a file lives, then lets the original NAS serve the file directly to the compute servers. The existing NAS does not need to speak NFSv4; the older NFSv3 is fine for accessing the NAS data. Conventionally, this kind of storage acceleration for AI training requires a proprietary client on the servers, which complicates deployment and may require moving data onto a newer NAS. Hammerspace is a significant contributor to the Linux NFS software and uses these contributed features to achieve faster file access without needing a custom client. All the features Hyperscale NAS requires are already in standard Linux distributions.
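Hammerspace did not walk through client configuration in the presentation, but the "no custom client" claim rests on pNFS support that ships in mainline Linux with NFS v4.2: the client asks the metadata server where a file's data lives, then performs I/O directly against the data server. A minimal sketch of what that looks like from a standard Linux training node follows; the hostname, export path, and mount point are illustrative assumptions, not Hammerspace documentation.

```shell
# Mount the Hyperscale NAS namespace from a stock Linux client.
# With vers=4.2, pNFS layouts let metadata requests go to the
# metadata cluster while reads and writes flow directly to the
# back-end NAS (which can continue to serve NFSv3).
# "anvil.example.com:/hyperscale" and "/mnt/training-data" are
# hypothetical names for illustration.
sudo mount -t nfs -o vers=4.2 anvil.example.com:/hyperscale /mnt/training-data

# Confirm pNFS is in use: non-zero layoutget counts show the client
# is fetching data-location layouts from the metadata server rather
# than proxying all I/O through it.
nfsstat -c | grep -i layout
```

Because this is all standard kernel NFS, no agent or driver installation is needed on the GPU servers; the deployment step is just the mount itself.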

This storage acceleration of NFSv3 file servers for AI training is not the core function of Hammerspace; it is a beneficial side effect for a specific use case. Hammerspace is far more widely applicable as a NAS virtualization solution, allowing unified, multi-protocol access to multiple NAS clusters across multiple physical locations, both on-premises and in the public cloud. These storage virtualization features require the Hyperscale NAS to sit between the NAS clients and the existing NAS servers; the Data Services (DSX) component of Hyperscale NAS does this in a highly available, scale-out fashion. These capabilities were not the focus of the AI Field Day presentation but are undoubtedly valuable for complex enterprise NAS deployments.

Disclosure: The Futurum Group is a research and advisory firm that engages or has engaged in research, analysis, and advisory services with many technology companies, including those mentioned in this article. The author does not hold any equity positions with any company mentioned in this article.

Analysis and opinions expressed herein are specific to the analyst individually, based on data and other information that might have been provided for validation, and are not those of The Futurum Group as a whole.

Other Insights from The Futurum Group:

Hammerspace Adds AWS SVP and LLM Training Architecture

Hammerspace Unveils Hyperscale NAS Addressing the AI/HPC Workloads

Hammerspace Global Data Environment Product Review

Author Information

Alastair has made a twenty-year career out of helping people understand complex IT infrastructure and how to build solutions that fulfil business needs. Much of his career has included teaching official training courses for vendors, including HPE, VMware, and AWS. Alastair has written hundreds of analyst articles and papers exploring products and topics around on-premises infrastructure and virtualization and getting the most out of public cloud and hybrid infrastructure. Alastair has also been involved in community-driven, practitioner-led education through the vBrownBag podcast and the vBrownBag TechTalks.
