Hammerspace Shows Storage Acceleration for AI Training at AI Field Day

Introduction

Training large language models (LLMs) is big business, and it requires feeding a lot of data to many servers. It resembles conventional high-performance computing (HPC), where effort goes into tuning data transfer to match the computing capabilities. The tuning ensures that no bottleneck leaves compute idle and, in turn, lengthens the time to complete training. Training a new LLM can occupy thousands of GPU-equipped servers for weeks or months. Sam Altman stated that training GPT-4 cost over $100 million. It is worth making sure that money is well spent.

Against this background, Hammerspace presented on storage acceleration for AI training at AI Field Day 4, showing Hyperscale NAS accelerating underperforming scale-out NAS. Hammerspace has some impressive NAS virtualization technologies, but it is the separation of metadata and data access that made the difference for its customers' massive AI training projects. These customers found that their existing scale-out NAS solutions could not deliver the file access rate that their training required. It wasn’t that the storage was too slow, but that it took too long to find the correct file on the storage.

Hammerspace Hyperscale NAS gathers the file metadata into a dedicated high-availability server cluster running the company's Anvil server and leverages standard NFSv4 features. Hyperscale NAS handles the metadata operation of identifying where a file resides, allowing the original NAS to serve the files directly to the compute servers. Hyperscale NAS does not require the existing NAS to speak NFSv4; the older NFSv3 is just fine for accessing the NAS data. This kind of storage acceleration is conventionally achieved with a proprietary client on the servers, which complicates deployment and may require moving data onto a newer NAS. Hammerspace is a significant contributor to the Linux NFS software and uses these contributed features to achieve faster file access without needing a custom client. All the features Hyperscale NAS requires are already in standard Linux distributions.
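To make the metadata and data split concrete, here is a minimal toy model in Python. It is an illustration only, not Hammerspace's implementation: the class names, the file paths, and the client-side location cache are all assumptions made for the sketch, and the presentation summary above does not name the specific NFSv4 features involved. The idea it shows is simply that one service answers "where is this file?" while the existing NAS nodes serve the bytes directly to the compute server.

```python
from dataclasses import dataclass, field


@dataclass
class DataServer:
    """Hypothetical stand-in for an existing NAS node that holds file contents."""
    name: str
    blobs: dict = field(default_factory=dict)   # path -> bytes
    reads_served: int = 0

    def read(self, path: str) -> bytes:
        self.reads_served += 1
        return self.blobs[path]


@dataclass
class MetadataServer:
    """Hypothetical stand-in for the dedicated metadata cluster.

    It only knows where each file lives; it never handles file contents.
    """
    locations: dict = field(default_factory=dict)  # path -> data server name
    lookups_served: int = 0

    def locate(self, path: str) -> str:
        self.lookups_served += 1
        return self.locations[path]


class Client:
    """A compute server: asks the metadata service once, then reads directly."""

    def __init__(self, metadata: MetadataServer, data_servers: list):
        self.metadata = metadata
        self.data_servers = {s.name: s for s in data_servers}
        # Assumption for this sketch: the client remembers where a file lives
        # so repeated reads skip the metadata round trip.
        self.location_cache = {}

    def read(self, path: str) -> bytes:
        server_name = self.location_cache.get(path)
        if server_name is None:
            server_name = self.metadata.locate(path)   # metadata path
            self.location_cache[path] = server_name
        return self.data_servers[server_name].read(path)  # data path


if __name__ == "__main__":
    nas_a = DataServer("nas-a", {"/train/shard-0001": b"tokens-a"})
    nas_b = DataServer("nas-b", {"/train/shard-0002": b"tokens-b"})
    meta = MetadataServer({"/train/shard-0001": "nas-a",
                           "/train/shard-0002": "nas-b"})
    client = Client(meta, [nas_a, nas_b])

    for path in ["/train/shard-0001", "/train/shard-0001", "/train/shard-0002"]:
        client.read(path)

    print("metadata lookups:", meta.lookups_served)
    print("reads per NAS:", nas_a.reads_served, nas_b.reads_served)
```

In this toy, the metadata service is consulted only once per file, and every byte of file data flows straight from a NAS node to the client. A real deployment differs in many ways (locking, failover, layout recall, and so on), so treat this purely as a picture of the control path versus the data path.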

Accelerating NFSv3 file servers for AI training is not the core function of Hammerspace; it is a beneficial side effect for a specific use case. Hammerspace is far more widely applicable as a NAS virtualization solution, allowing unified, multi-protocol access to multiple NAS clusters across multiple physical locations, both on-premises and in the public cloud. These storage virtualization features require Hyperscale NAS to sit between the NAS clients and the existing NAS servers; the Data Services (DSX) component of Hyperscale NAS does this in a highly available, scale-out fashion. These capabilities were not the focus of the AI Field Day presentation but are undoubtedly valuable for complex enterprise NAS deployments.

Disclosure: The Futurum Group is a research and advisory firm that engages or has engaged in research, analysis, and advisory services with many technology companies, including those mentioned in this article. The author does not hold any equity positions with any company mentioned in this article.

Analysis and opinions expressed herein are specific to the analyst individually and data and other information that might have been provided for validation, not those of The Futurum Group as a whole.

Other Insights from The Futurum Group:

Hammerspace Adds AWS SVP and LLM Training Architecture

Hammerspace Unveils Hyperscale NAS Addressing the AI/HPC Workloads

Hammerspace Global Data Environment Product Review

Author Information

Alastair has made a twenty-year career out of helping people understand complex IT infrastructure and how to build solutions that fulfil business needs. Much of his career has included teaching official training courses for vendors, including HPE, VMware, and AWS. Alastair has written hundreds of analyst articles and papers exploring products and topics around on-premises infrastructure and virtualization and getting the most out of public cloud and hybrid infrastructure. Alastair has also been involved in community-driven, practitioner-led education through the vBrownBag podcast and the vBrownBag TechTalks.
