Qualcomm-Meta Llama 2 Could Unleash LLM Apps at the Edge

The News: On July 18, Qualcomm announced that it is working with Meta to implement Llama 2-based AI capabilities on smartphones and PCs starting in 2024. The two companies are working to optimize the execution of Meta’s newest LLM directly on device, without relying solely on cloud services. The goal is to enable the creation of powerful generative AI use cases and applications. Developers can start creating applications for these devices today, leveraging the Qualcomm AI Stack, a set of tools designed to process AI more efficiently on Snapdragon.

Read the full announcement on the Qualcomm website.

The announcement is further proof of Qualcomm’s investment in, and vision for, AI: that a significant portion of AI applications will run on edge devices that leverage both local and cloud compute.

Read the Qualcomm whitepaper, “The future of AI is hybrid” here.

Analyst Take: The idea that LLM applications could be run on compute-cost efficient edge devices is enough to make most AI application developers’ imaginations run wild. This capability would bring lots of potential new use cases and business opportunities. Qualcomm and Meta are building the pathway to LLM apps at the edge. Here is a look at the why, the how, and the impact LLMs at the edge could have via Qualcomm-Llama 2.

The Market Drivers for Edge AI

Moving AI compute to the edge has two big potential advantages over cloud AI compute: lower latency and lower cost. If an edge device can handle an AI workload locally, there is no cloud compute cost for that workload. Latency drops because the request no longer makes a network round trip to a remote data center. When you consider the compute cost for AI, especially for generative AI and LLMs, moving it to local compute has massive appeal and opens up a lot more AI opportunities.

The Market Barriers for Edge AI

The market barriers for edge AI are:

  • Compute and memory constraints – Edge devices have limited processing power and memory, which makes it very hard to run large AI apps.
  • Asymmetry – Edge devices vary in size, shape, capabilities, and limitations, which makes it difficult for application developers to build AI applications that will run on a broad range of devices.
  • Security and privacy – Most edge devices are connected devices and are therefore exposed to cyber-attacks.

Concept: The Lightweight LLM

Some LLM players have considered the promise of edge AI and the compute challenge it presents. Their solution has been to find creative ways to build LLMs that use less compute but deliver similar results. Google created the Gecko edition of the PaLM 2 model with that idea in mind. Meta’s Llama models are another example.

Under the Hood of Making LLMs Lightweight

Lightweight LLMs leverage model compression to optimize for edge devices. There are three main techniques: pruning, knowledge distillation, and quantization.

  • Pruning – A technique that removes redundant and inconsequential parameters, such as connections, neurons, channels, or layers.
  • Knowledge distillation – A technique where a smaller model is trained to mimic the behavior of a larger model, often using the larger model’s outputs as training targets.
  • Quantization – A technique where the numerical precision of the model’s weights and activations is reduced (for example, from 16-bit floating point to 8-bit integers) without significantly impacting the model’s overall accuracy.
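To make two of these techniques concrete, here is a minimal Python sketch of magnitude pruning and symmetric 8-bit quantization applied to a toy list of weights, plus a back-of-the-envelope estimate of weight memory. It is illustrative only and is not how Qualcomm or Meta compress Llama 2: the function names, the toy weights, the 50% sparsity level, and the choice of a 7-billion-parameter model are all assumptions made for the example. Knowledge distillation is left out because it requires a training loop.

```python
# Illustrative sketch only: toy versions of pruning and quantization.
# All names and values here are hypothetical, chosen for the example.

def prune_by_magnitude(weights, sparsity):
    """Unstructured pruning: zero out the smallest-magnitude fraction of weights."""
    k = int(len(weights) * sparsity)  # how many weights to remove
    if k == 0:
        return list(weights)
    # Indices of the k weights closest to zero.
    by_magnitude = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(by_magnitude[:k])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]


def quantize_int8(weights):
    """Symmetric per-tensor quantization of floats to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Map the stored integers back to approximate float weights."""
    return [v * scale for v in q]


def weight_memory_gb(num_params, bits_per_weight):
    """Memory for the weights alone (ignores activations and runtime overhead)."""
    return num_params * bits_per_weight / 8 / 1e9


if __name__ == "__main__":
    weights = [0.82, -0.03, 0.40, 0.01, -0.65, 0.07]

    print(prune_by_magnitude(weights, sparsity=0.5))
    # prints [0.82, 0.0, 0.4, 0.0, -0.65, 0.0]

    q, scale = quantize_int8(weights)
    restored = dequantize(q, scale)
    worst_error = max(abs(a - b) for a, b in zip(weights, restored))
    print(q, worst_error <= scale / 2 + 1e-9)

    # Assumed model size: 7 billion parameters.
    print(weight_memory_gb(7e9, 16))  # prints 14.0
    print(weight_memory_gb(7e9, 4))   # prints 3.5
```

The memory estimate shows why precision matters so much at the edge: under these assumptions, a 7-billion-parameter model needs roughly 14 GB for its weights at 16 bits each, but about 3.5 GB at 4 bits, which is much closer to what a phone or PC can spare.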

Bringing It Back to Hybrid

While running lightweight LLMs locally can make an impact at the edge, Qualcomm’s concept of hybrid is the approach that makes the most sense for generative AI/LLM apps at the edge. AI compute loads that make sense to process locally are processed locally, while other, likely larger AI compute loads are processed in the cloud. Edge AI then benefits from the lower cost and latency of local compute while still leveraging the more robust compute power of the cloud to deliver potent LLM apps.
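One way to picture the hybrid split is as a routing decision made per request. The Python sketch below is a hypothetical illustration, not Qualcomm’s design: the `DeviceProfile` fields, the token threshold, and the offline fallback rule are all assumptions introduced for the example.

```python
# Hypothetical sketch of hybrid edge/cloud routing. The names, fields, and
# thresholds are assumptions for illustration, not a real API.
from dataclasses import dataclass


@dataclass
class DeviceProfile:
    max_prompt_tokens: int  # largest prompt the on-device model is assumed to handle
    is_online: bool         # whether a cloud endpoint is currently reachable


def route_request(prompt_tokens, needs_large_model, device):
    """Return "local" or "cloud" for a single inference request."""
    fits_locally = (
        prompt_tokens <= device.max_prompt_tokens and not needs_large_model
    )
    if fits_locally:
        return "local"   # no cloud cost, no network round trip
    if device.is_online:
        return "cloud"   # larger workload goes to more robust compute
    # Assumed fallback: when offline, make a best-effort attempt on device.
    return "local"


if __name__ == "__main__":
    phone = DeviceProfile(max_prompt_tokens=1024, is_online=True)
    print(route_request(200, False, phone))   # prints local
    print(route_request(5000, False, phone))  # prints cloud
    print(route_request(200, True, phone))    # prints cloud
```

A real router would likely weigh more signals, such as battery level, privacy requirements, or current network quality, but the basic shape is the same: keep what fits on the device, and send the rest to the cloud.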

Conclusions

At first blush, the idea of embedding Llama 2 in edge devices seems far-fetched, but if you consider the model compression techniques available to make LLMs more lightweight, combined with the hybrid edge/cloud approach, the path to unleashing a new wave of generative AI apps at the edge has real potential. The end of 2024 will be a time to gauge how the idea will work. By then, there should be enough market adoption of the Llama 2-powered Qualcomm devices to get a sense of Edge AI direction.

Disclosure: The Futurum Group is a research and advisory firm that engages or has engaged in research, analysis, and advisory services with many technology companies, including those mentioned in this article. The author does not hold any equity positions with any company mentioned in this article.

Analysis and opinions expressed herein are specific to the analyst individually and data and other information that might have been provided for validation, not those of The Futurum Group as a whole.

Other insights from The Futurum Group:

Generative AI Investment Accelerating: $1.3 Billion for LLM Inflection

Not Nothing: Nothing 2 Powered by the Qualcomm Snapdragon 8 Gen 1 SoC

Qualcomm Snapdragon Wear 4100+ Platform: Helping Keep Children Safe

Author Information

Based in Tampa, Florida, Mark is a veteran market research analyst with 25 years of experience interpreting technology business and holds a Bachelor of Science from the University of Florida.
