The News: Qualcomm envisions the future of AI as hybrid, with on-device AI playing a key role in enabling generative AI to scale. Read the Qualcomm blog here.

The Future of AI is Hybrid: Look No Further than Your Devices to Scale Generative AI

Analyst Take: A hybrid AI architecture distributes and coordinates AI workloads among cloud and edge devices, rather than processing in the cloud only. The cloud and edge devices, including smartphones, automobiles, personal computers, and Internet of Things (IoT) devices, work together to provide more powerful, efficient, and highly optimized AI.
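The division of labor described above can be sketched in a few lines of Python. This is only an illustration, not Qualcomm's architecture: the `Workload` type, the `route` function, the `needs_cloud_data` flag, and the 1-billion-parameter device ceiling are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass

# Hypothetical ceiling on the model size (in billions of parameters)
# that a given device can run locally. Illustrative only.
DEVICE_MAX_PARAMS_B = 1.0


@dataclass
class Workload:
    name: str
    model_params_b: float           # model size needed, in billions of parameters
    needs_cloud_data: bool = False  # True if the task depends on data held only in the cloud


def route(workload: Workload, device_max_params_b: float = DEVICE_MAX_PARAMS_B) -> str:
    """Return "device" or "cloud" for a single inference workload."""
    # Tasks that depend on cloud-only data go to the cloud regardless of model size.
    if workload.needs_cloud_data:
        return "cloud"
    # Models small enough for the device run locally.
    if workload.model_params_b <= device_max_params_b:
        return "device"
    # Everything else falls back to the cloud.
    return "cloud"


if __name__ == "__main__":
    for w in (
        Workload("voice dictation", 0.5),
        Workload("web-grounded search answer", 0.5, needs_cloud_data=True),
        Workload("long-form generation", 100.0),
    ):
        print(f"{w.name}: {route(w)}")
```

A real coordinator would likely weigh more factors, such as latency, battery, privacy, and connectivity, but the basic idea is the same: the device handles what it can, and the cloud handles the rest.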

Massive generative AI models with billions of parameters place substantial demands on computing infrastructure. As a result, both AI training, which learns the parameters for an AI model, and AI inference, which executes the model, have been limited to cloud implementations for massive and intricate models. However, I see that changing rapidly now.

I anticipate that the scale of AI inference will be dramatically higher than that of AI training. While training an individual model requires significant resources, large generative AI models are expected to be trained only a few times a year. By contrast, the cost of inference with such models rises with the number of daily active users and how often they use the model. Running all inference in the cloud therefore results in exorbitant costs that can prove unsustainable at scale.
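To make the scaling argument concrete, here is a rough cost sketch in Python. It assumes a simple linear model in which cloud cost is users times queries per user times a flat cost per query; the function name `cloud_inference_cost`, its parameters, and every number in the example are placeholders for illustration, not figures from Qualcomm or any provider.

```python
def cloud_inference_cost(
    daily_active_users: int,
    queries_per_user_per_day: float,
    cost_per_query: float,
    on_device_share: float = 0.0,
) -> float:
    """Estimate the daily cloud inference bill under a simple linear model.

    on_device_share is the fraction of queries (0.0 to 1.0) served on the
    device, which therefore never reach the cloud.
    """
    if not 0.0 <= on_device_share <= 1.0:
        raise ValueError("on_device_share must be between 0.0 and 1.0")
    total_queries = daily_active_users * queries_per_user_per_day
    cloud_queries = total_queries * (1.0 - on_device_share)
    return cloud_queries * cost_per_query


if __name__ == "__main__":
    # Placeholder inputs: 1 million users, 10 queries each, $0.01 per query.
    cloud_only = cloud_inference_cost(1_000_000, 10, 0.01)
    hybrid = cloud_inference_cost(1_000_000, 10, 0.01, on_device_share=0.6)
    print(f"cloud only: ${cloud_only:,.0f} per day")
    print(f"hybrid:     ${hybrid:,.0f} per day")
```

Under this model, doubling either the user base or the usage frequency doubles the cloud bill, while every query moved on-device removes its cost from the cloud side entirely. Training, by comparison, is a cost incurred only a few times a year.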

Hybrid AI provides the answer, much as traditional computing evolved from mainframes and thin clients to a mix of cloud infrastructure and smart devices, including PCs and smartphones. Hybrid AI is essential to the affordable scaling of the consumer and enterprise use cases that are emerging from generative AI. Foundation models, such as general-purpose large language models (LLMs) like Generative Pre-trained Transformer 4 (GPT-4) and Language Model for Dialogue Applications (LaMDA), have attained breakthrough levels of language comprehension, generation capabilities, and vast knowledge. Most of these models are massive, with more than 100 billion parameters.

Google, for instance, continues to enhance LaMDA so that Google Bard can draw on information from across the Internet to give users deeper, more conversational, and more contextual query results. At Google I/O 2023, Google introduced PaLM 2, the company's next-generation language model. Compared with its predecessor, PaLM 2 is trained more heavily on multilingual text, spanning more than 100 languages, to boost its understanding, generation, and translation of nuanced text such as idioms, poems, and riddles.

Generative AI Use Cases Rising Across Device Categories

From my perspective, hybrid AI architecture can enable generative AI to deliver enhanced and entirely new user experiences. For instance, with more than 10 billion searches performed daily and mobile accounting for more than 60% of them, the expansion of generative AI will considerably increase the computing capacity required, especially for queries originating from smartphones. The growing popularity of chat as a search interface, along with generative AI-based search, is poised to boost the overall number of queries. As chat improves, the smartphone can also become a more capable digital assistant.

Users can now communicate naturally and get more informative interactions, thanks to accurate on-device personas and LLMs that comprehend text, voice, images, video, and other emerging input types. Smartphone models that perform language processing, image understanding, text-to-text generation, and more will likely remain in high demand for quite some time.

AI is already used across a wide array of IoT market segments, including retail, security, energy, supply chain, and asset management. Generative AI can benefit these segments by improving customer and workforce experiences. In retail, for example, it can help store managers better plan for off-cycle sales opportunities tied to upcoming events such as major sporting events and cultural festivals.

Additionally, I expect that operations teams throughout the energy and utilities segment can use generative AI to optimize for corner-case load scenarios and better predict spikes in energy demand, as well as the potential for grid degradation and breakdowns. Generative AI can also enhance customer service in areas such as billing and outage updates.

Key Takeaways: The Future of AI is Hybrid

Just as organizations are expanding their adoption of hybrid cloud, the digital ecosystem is quickly embracing hybrid AI to meet the surging resource and scaling demands of AI, including generative AI. Remarkably, Qualcomm notes that AI models with more than 1 billion parameters are already running on phones and, equally important, that their performance and accuracy are comparable to those of cloud-based models. In my view, the hybrid AI approach already extends to all the major AI applications and device segments, including smartphones, IoT, laptops, and vehicles, and will only continue to expand. The future is now for hybrid AI.

Disclosure: The Futurum Group is a research and advisory firm that engages or has engaged in research, analysis, and advisory services with many technology companies, including those mentioned in this article. The author does not hold any equity positions with any company mentioned in this article.

Analysis and opinions expressed herein are specific to the analyst individually, drawing on data and other information that might have been provided for validation, and are not those of The Futurum Group as a whole.

Other insights from The Futurum Group:

Qualcomm Revenue in Q2 Hits $9.27B, Beating Analyst Estimates

Qualcomm Uplifts WiFi 7 through Mesh Networking Performance Optimization

Qualcomm Snapdragon 8 Gen 2 Powers ASUS ROG 7 Mobile Gaming Phones

Author Information

Ron Westfall

Ron is an experienced, customer-focused research expert and analyst, with over 20 years of experience in the digital and IT transformation markets, working with businesses to drive consistent revenue and sales growth.

Ron holds a Master of Arts in Public Policy from the University of Nevada, Las Vegas and a Bachelor of Arts in political science/government from William and Mary.
