
PRESS RELEASE

AI Workload Priorities Diversify as Enterprises Push Compute Beyond Training

Austin, Texas, USA, February 25, 2026

Futurum Survey Finds That Inference at Scale Is the Primary Workload for Only 35% of Enterprises

Enterprise AI techniques have become so specialized that no single workload type is the primary workload for a majority of enterprises, according to new research from Futurum.

The balanced distribution across four distinct workload categories—inference, foundation model training, domain-specific training, and fine-tuning—signals that enterprises may reject one-size-fits-all GPU procurement strategies in favor of heterogeneous infrastructure optimized for distinct computational profiles.

The research surveyed AI infrastructure decision-makers across global enterprises with annual revenues exceeding $100 million, capturing granular workload distribution data for production AI systems. The findings reveal that inference at scale represents 34.6% of enterprise AI compute consumption, training large foundation models accounts for 24.9%, training domain-specific models consumes 23.3%, and fine-tuning existing models utilizes 17.2%. The absence of a dominant workload category reflects the intermediate stage, between initial model training and at-scale inference, in which many AI adopters now find themselves.

Figure 1: Primary Enterprise AI Workload Percentage


Brendan Burke, Research Director at Futurum, said, “No single workload standing out as the primary method of enterprise AI shows that there will be a future for workload-optimized silicon. Training foundation models, serving inference, and fine-tuning custom models have completely different performance bottlenecks, and the market is finally waking up to the idea that workload-optimized silicon will deliver better TCO than throwing frontier data centers at every problem.”

The research reveals several key developments shaping the AI software landscape:

  • For more than half (57%) of organizations, the largest AI cluster contains fewer than 2,048 accelerators
  • 38% of organizations primarily access data center compute via hardware capital expenditures
  • Leading performance metrics for AI clusters include training speed (31%), $ per FLOP or tokens/second/$ (22%), and FLOPs per watt (16%)

“Fine-tuning and domain-specific model training represent 40.5% of primary enterprise AI workloads combined, yet most accelerator software stacks are optimized for frontier-scale pre-training and inference,” Burke observed. “Fine-tuning workloads have unique memory access patterns and require native LoRA kernel support. Without software innovation for these workloads, compute vendors will underperform on nearly half of enterprise production workloads.”

The research suggests that open-source AI frameworks and model deployment may become leading enterprise investment priorities, with decision-makers citing open-source integration as critical to avoiding vendor lock-in and maintaining flexibility across heterogeneous workloads. As enterprises leverage the reasoning capabilities of open-source models, agentic AI deployment is emerging as a distinct fifth workload category that does not fit neatly into either the training or inference paradigm. Semiconductor and neocloud product roadmaps can benefit from addressing these evolving enterprise needs.

Read more in the “2H 2025 Data Center Semiconductors Global Enterprise Decision Maker Survey Report” and “Q2 2025 Data Center Semiconductor Spot Check Report” on the Futurum Intelligence Platform.

About Futurum Intelligence for Market Leaders

Futurum Intelligence’s Semiconductor, Supply Chain, and Emerging Tech IQ service provides actionable insight from analysts, reports, and interactive visualization datasets, helping leaders drive their organizations through transformation and business growth. Subscribers can log into the platform at https://app.futurumgroup.com/, and non-subscribers can find additional information at Futurum Intelligence.

Follow news and updates from Futurum on X and LinkedIn using #Futurum. Visit the Futurum Newsroom for more information and insights.

Other Insights from Futurum:

Will NVIDIA’s Meta Deal Ignite a CPU Supercycle?

Does Nebius’ Acquisition of Tavily Create the Leading Agentic Cloud?

Microsoft’s Maia 200 Signals the XPU Shift Toward Reinforcement Learning

Author Information

Brendan Burke, Research Director

Brendan is Research Director, Semiconductors, Supply Chain, and Emerging Tech. He advises clients on strategic initiatives and leads the Futurum Semiconductors Practice. He is an experienced tech industry analyst who has guided tech leaders in identifying market opportunities spanning edge processors, generative AI applications, and hyperscale data centers. 

Before joining Futurum, Brendan consulted with global AI leaders and served as a Senior Analyst in Emerging Technology Research at PitchBook. At PitchBook, he developed market intelligence tools for AI, highlighted by one of the industry’s most comprehensive AI semiconductor market landscapes encompassing both public and private companies. He has advised Fortune 100 tech giants, growth-stage innovators, global investors, and leading market research firms. Before PitchBook, he led research teams in tech investment banking and market research.

Brendan is based in Seattle, Washington. He has a Bachelor of Arts Degree from Amherst College.
