Menu

How Lamini Optimized LLM Finetuning System on AMD GPUs

How Lamini Optimized LLM Finetuning System on AMD GPUs

The News: Lamini CTO Greg Diamos popped open the hood and gave a detailed look at the engine behind its AMD graphics processing unit (GPU)-driven LLM Superstation in a blog post. You can read the blog on the Lamini website.

How Lamini Optimized LLM Finetuning System on AMD GPUs

Analyst Take: Lamini raised eyebrows last month when the large language model (LLM) finetuning specialist revealed it built its LLM Superstation on AMD Instinct MI210 and MI250 accelerators rather than wait for NVIDIA H100s. The H100 GPUs have a 52-week lead time. LLM Superstation is available now, both in the cloud and on-premises.

At the time of announcement, Diamos said AMD’s ROCm GPU programming software has achieved parity with NVIDIA CUDA for LLMs. In his follow-up blog, Diamos disclosed how Lamini built its optimized finetuning system on AMD Instinct GPUs, and the technical hurdles it had to overcome.

Lamini started by running a single MI100 system last December before AMD donated two MI210 servers. The pre-seed startup set them up in a janitor’s closet with ventilation and sound insulation.

It eventually grew into a data center at CoreSite, with a typical configuration of 4 x MI250 GPU servers. That provided 512 GB of high-bandwidth memory (HBM), which can fit a ~200 billion parameter LLM in bfloat16. Lamini used shared NFS servers, about 400 TB, and a high-performance network between GPU servers. Upcoming AMD Instinct MI300X GPUs with 192 GB of HBM will allow further scaling.

Now Lamini is in phase two of its finetuning, which involves integrating 128 AMD MI200 GPUs into the platform. This phase requires using layers of specialized software inside ROCm as well as building new layers. These layers include the GPU, an AMDgpu Linux driver, and optimized libraries. The Lamini SDK sits on top of the layers, adding LLM use cases such as chat, classification, and autocomplete. The SDKs use Python and TypeScript application programming interface (API) clients for loading, querying, and finetuning LLMs.

Diamos said Lamini added new optimizations that accelerate LLMs and take advantage of unique capabilities of AMD’s MI platform. These optimizations enable hosting 200 billion parameter models on a single server and 10,000 finetuned language models on one server, handling 12,800 simultaneous requests to a single server and processing over 3.5 million queries per day on one node.

The way Diamos sees it, ROCm has “enormous potential” to go beyond CUDA for LLM finetuning, and Lamini and ROCm can efficiently finetune the largest LLMs, such as Meta AI’s Llama 2. If he is correct, the combination can significantly change the LLM training market.

As The Futurum Group colleague Mark Beccue wrote in September: “Despite some limitations, the AMD-Lamini Superstation is a viable option for enterprises to consider for deploying LLMs. Savvy enterprises that cannot wait a calendar year to run AI workloads will be test-driving the system.” It will not be easy for AMD or any other competitor to dent NVIDIA’s dominance in the GPU market, but AMD can at least give data scientists and IT teams a viable option.

Disclosure: The Futurum Group is a research and advisory firm that engages or has engaged in research, analysis, and advisory services with many technology companies, including those mentioned in this article. The author does not hold any equity positions with any company mentioned in this article.

Analysis and opinions expressed herein are specific to the analyst individually and data and other information that might have been provided for validation, not those of The Futurum Group as a whole.

Other Insights from Futurum Group:

AMD and Hugging Face Team Up to Democratize AI Compute – Shrewd Alliance Could Lead to AI Compute Competition, Lower AI Costs

AMD Revenue Hits $5.4 Billion in Q2, Down 18% YoY, But Beats Estimates

NVIDIA,GlobalFoundries, Zoom, Salesforce, Groq, Qualcomm

Author Information

Dave focuses on the rapidly evolving integrated infrastructure and cloud storage markets.

Related Insights
SpaceX Acquires xAI: Rockets, Starlink, and AI Under One Roof
February 3, 2026

SpaceX Acquires xAI: Rockets, Starlink, and AI Under One Roof

Nick Patience, VP and Practice Lead for AI Platforms at Futurum, analyzes SpaceX's acquisition of xAI, examining the financial rationale, valuation metrics, and competitive implications for AI and satellite infrastructure...
Industrial AI Scales at IFS in FY 2025. Is Adoption Moving Beyond Pilots
February 3, 2026

Industrial AI Scales at IFS in FY 2025. Is Adoption Moving Beyond Pilots?

Futurum Research examines IFS’s FY 2025 results as Industrial AI adoption shifts from initial deployments to scaled operations, supported by 23% ARR growth, rising retention, and margin expansion....
SK Hynix Q4 FY 2025 Structural Shift to AI Memory Lifts Margins
February 2, 2026

SK Hynix Q4 FY 2025: Structural Shift to AI Memory Lifts Margins

Futurum Research analyzes SK Hynix’s Q4 FY 2025 results, highlighting HBM leadership, DDR5 server mix, and NAND roadmap, and why capacity, packaging, and customer alignment position the company for sustained...
SAP Q4 FY 2025 Earnings Cloud ERP Strength, AI Traction
February 2, 2026

SAP Q4 FY 2025 Earnings: Cloud ERP Strength, AI Traction

Futurum Research analyzes SAP’s Q4 FY 2025 earnings, highlighting Cloud ERP growth, AI and data cloud attach, and how deal mix and sovereignty considerations shaped near-term backlog while supporting multi-year...
Meta Q4 FY 2025 Results Underscore AI-Fueled Ads Momentum
January 30, 2026

Meta Q4 FY 2025 Results Underscore AI-Fueled Ads Momentum

Futurum Research analyzes Meta’s Q4 FY 2025 earnings, focusing on AI-driven ads gains, stronger Reels and Threads engagement, and how 2026 infrastructure spend and messaging commerce shape enterprise AI strategy....
IBM Q4 FY 2025 Software and Z Cycle Lift Growth and FCF
January 30, 2026

IBM Q4 FY 2025: Software and Z Cycle Lift Growth and FCF

Futurum Research analyzes IBM’s Q4 FY 2025, highlighting software acceleration, the IBM Z AI cycle, and AI-driven productivity and M&A synergies supporting margin expansion and higher FY 2026 free cash...

Book a Demo

Newsletter Sign-up Form

Get important insights straight to your inbox, receive first looks at eBooks, exclusive event invitations, custom content, and more. We promise not to spam you or sell your name to anyone. You can always unsubscribe at any time.

All fields are required






Thank you, we received your request, a member of our team will be in contact with you.