Analyst(s): Olivier Blanchard
Publication Date: September 18, 2025
Arm has unveiled the Lumex CSS platform, combining C1 CPUs, Mali G1 GPUs, and system IP to deliver up to 5x AI performance and 2x ray tracing performance on flagship devices. The platform enables OEMs to accelerate development cycles and scale AI-focused experiences.
What is Covered in this Article:
- Arm launches the Lumex CSS compute subsystem with C1 CPUs and Mali G1 GPUs
- SME2-enabled CPUs deliver up to 5x faster AI performance and 3x greater efficiency
- Mali G1-Ultra GPU doubles ray tracing performance and improves graphics throughput
- New SI L1 Interconnect and MMU L1 improve latency, cache, and virtualization
- Developer-ready Android 16 stack and KleidiAI integration across major frameworks
The News: Arm has introduced the Lumex Compute Subsystem (CSS) platform, a tightly integrated combination of CPUs, GPUs, and system IP designed to accelerate development cycles and enable high-performance, on-device AI.
The platform debuts the Armv9.3-based C1 CPU cluster with SME2 units and the Mali G1-Ultra GPU, both optimized for 3nm nodes. Lumex provides up to 5x AI performance uplift, 2x ray tracing performance, and production-ready implementations for partners building flagship mobile and consumer devices.
Arm’s Lumex CSS Aims To Accelerate On-Device AI Innovation
Analyst Take: Lumex CSS marks Arm’s boldest move yet to set the foundation for AI-focused mobile and consumer products. By uniting SME2-equipped CPUs, advanced GPUs, and scalable IP in a pre-integrated package, Arm is giving OEMs a shortcut to reduce design complexity, speed up development, and gain measurable boosts in performance and efficiency. The platform is both a technical leap and a strategic shift, positioning Arm not just as an IP supplier but as a full subsystem designer.
SME2 CPU Performance Leadership
The C1 CPU cluster with SME2 brings significant generational improvements: a 5x AI performance increase and 3x efficiency gains over its predecessor. Benchmarks show a 30% boost across six performance tests, 15% faster results in top gaming and video-streaming apps, and 12% lower power use for daily tasks such as video playback and browsing. The flagship C1-Ultra also shows double-digit instructions-per-cycle (IPC) gains over the Cortex-X925, setting a new standard for large-model inference and generative AI. These results establish the CPU cluster as a key driver of smooth, sustained AI experiences on mobile devices.
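To make the SME2 discussion concrete, the sketch below shows one way an application could gate an SME2-optimized code path at runtime on a Linux-based Arm device, by scanning the kernel's reported CPU feature flags. This is a minimal illustration, not an Arm-provided API: the helper name is hypothetical, and it assumes the kernel exposes an `sme2` entry in `/proc/cpuinfo` on SME2-capable aarch64 cores.

```python
def has_cpu_flag(flag: str, cpuinfo_path: str = "/proc/cpuinfo") -> bool:
    """Return True if the Linux kernel reports `flag` for any CPU.

    On aarch64 the feature list appears on "Features" lines; on x86 the
    equivalent line is "flags". Non-Linux systems simply report False.
    """
    try:
        with open(cpuinfo_path) as f:
            for line in f:
                if line.startswith(("Features", "flags")):
                    if flag in line.split(":", 1)[1].split():
                        return True
    except OSError:
        pass  # /proc not available; assume the feature is unsupported
    return False

# Gate an SME2-accelerated path behind the runtime check.
use_sme2 = has_cpu_flag("sme2")
```

In practice, most developers will not write this check themselves: as the ecosystem section below notes, frameworks that integrate KleidiAI perform equivalent feature detection internally.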
GPU Enhancements for AI and Gaming
The Mali G1-Ultra GPU strengthens Lumex’s value with a 20% gain in AI inference and a 20% jump across graphics benchmarks compared to the Immortalis-G925. Its new Ray Tracing Unit v2 doubles ray tracing performance, bringing desktop-level visuals to mobile titles such as Arena Breakout, Fortnite, Genshin Impact, and Honkai: Star Rail. With scalability up to 24 cores, Mali G1-Ultra covers both high-end and performance-focused devices, making it a central piece for immersive, AI-powered mobile gaming.
System Interconnect and Memory Innovation
Arm’s SI L1 System Interconnect and MMU L1 tackle performance limits beyond compute. SI L1 features the industry’s most area-efficient cache with a 71% drop in leakage, cutting idle power use. It supports Arm’s Memory Tagging Extension (MTE) for stronger system security. MMU L1, meanwhile, enables secure, low-cost virtualization across device tiers. Paired with the Network-on-Chip (NoC) S3 interconnect for cost-conscious systems, these components make Lumex flexible enough for both flagship and mainstream products, showing Arm’s aim to future-proof AI-focused platforms.
Developer-Ready Ecosystem and Partner Backing
Arm supports Lumex with an Android 16-ready software stack and KleidiAI libraries integrated into PyTorch, ONNX Runtime, Google LiteRT, and Alibaba MNN. Developers get SME2 benefits right away without changing code, with apps such as Gmail, YouTube, and Google Photos already optimized. Vulkan counters, Perfetto, and Streamline give developers the visibility needed to fine-tune performance and efficiency. Industry backing from Samsung, Honor, and Google highlights strong ecosystem support: Samsung is pushing flagship AI, Honor is driving premium mid-tier devices, and Google is scaling models like Gemma 3. Together, these partnerships confirm Lumex as a ready-made base for billions of future AI-enabled devices.
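The claim that developers "get SME2 benefits without changing code" rests on a backend-dispatch pattern: the library probes CPU features once and binds the fastest available kernel behind a stable API, so application code is identical on every device. The sketch below illustrates that pattern in plain Python; it is a hypothetical simplification, not KleidiAI's actual interface, and the feature probe is deliberately elided.

```python
# Backend-dispatch sketch: the library picks a kernel at startup,
# callers always use the same `matmul` name.

def _matmul_reference(a, b):
    """Portable fallback matrix multiply (row x column, plain Python)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def _select_matmul():
    # A real library (e.g. one integrating KleidiAI micro-kernels) would
    # probe CPU features here and return an SME2-accelerated kernel when
    # one is available; this sketch always returns the portable fallback.
    return _matmul_reference

matmul = _select_matmul()

# Application code is unchanged whether or not SME2 kernels were chosen.
result = matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])
```

Because the selection happens inside the library, shipping an SME2-capable device automatically upgrades existing apps, which is why framework-level integration (PyTorch, ONNX Runtime, LiteRT, MNN) matters more here than any single app update.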
Adjacent Market Significance
The way the Lumex CSS platform fits into Arm’s business ecosystem can be understood across three vectors:
- Licensing and Royalties: Arm’s primary revenue streams come from licensing fees and royalties. (Companies such as Apple, Qualcomm, and MediaTek pay Arm for the right to use its architecture and core designs in their own chips.) When a partner ships a chip containing Arm’s IP, Arm earns a royalty on each unit sold.
- The “Compute Subsystem” as an IP Package: As a pre-integrated and optimized package of Arm’s IP (which includes its latest CPU and GPU designs), the Lumex CSS “compute subsystem” is basically a blueprint. It is not a physical chip that Arm manufactures and sells directly to customers. Instead, it is a ready-to-use foundation that chipmakers can license to accelerate their own product development.
- Accelerating Partner Innovation: Platforms like the Lumex CSS aim to speed up partners’ time to market and reduce their development costs. By providing a full-stack, pre-verified solution, Arm enables its partners to focus on customization and differentiation rather than building a chip from scratch. This also strengthens Arm’s ecosystem and increases its royalty revenue as more devices with its IP are shipped.
Despite rumors that Arm may be exploring designing its own chips, Arm remains a design and licensing company, not a hardware manufacturer, and this product follows that business model.
What to Watch:
- Whether OEMs adopt Lumex CSS in full or selectively integrate C1 CPUs and Mali GPUs
- Timing of first Lumex-enabled flagship devices hitting the market
- Competitive responses from Qualcomm, MediaTek, and proprietary chip vendors
- Software optimization challenges for developers deploying SME2-based workloads
- Adoption rates in cost-sensitive segments versus flagship devices
See the complete announcement on the Arm website.
Disclosure: Futurum is a research and advisory firm that engages or has engaged in research, analysis, and advisory services with many technology companies, including those mentioned in this article. The author does not hold any equity positions with any company mentioned in this article.
Analysis and opinions expressed herein are specific to the analyst individually and data and other information that might have been provided for validation, not those of Futurum as a whole.
Other insights from Futurum:
Arm Q1 FY26 Earnings: Revenue Surpasses $1B on Surging AI and Cloud Demand
Can Arm’s Zena CSS Reshape Automotive AI Development Timelines?
Qualcomm’s Arm-Based Data Center CPUs To Smoothly Integrate With NVIDIA
Author Information
Olivier Blanchard is Research Director, Intelligent Devices. He covers edge semiconductors and intelligent AI-capable devices for Futurum. In addition to having co-authored several books about digital transformation and AI with Futurum Group CEO Daniel Newman, Blanchard brings considerable experience demystifying new and emerging technologies, advising clients on how best to future-proof their organizations, and helping maximize the positive impacts of technology disruption while mitigating their potentially negative effects. Follow his extended analysis on X and LinkedIn.
