The News: In his Oracle CloudWorld keynote, Larry Ellison highlighted the Oracle Cloud Infrastructure (OCI) Supercluster as key to Oracle’s AI strategy for customers. Designed for machine learning (ML) training, the architecture scales from 512 to 16,000 NVIDIA H100 graphics processing units (GPUs), all connected by a low-latency, 200 Gbps remote direct memory access (RDMA) network. See the complete press release on the Oracle website.
Unlocking Enterprise AI Ownership: Oracle’s OCI Supercluster
Analyst Take: Available as a bare-metal service later this year out of the London and Chicago regions of OCI, OCI Supercluster represents a real and practical path forward for enterprise customers looking to “own their own AI.” Do-it-yourself options for AI infrastructure are cost-prohibitive for all but the largest institutions, particularly as materials and supply problems persist in the chip market. But ownership of AI lifecycles is a critical element in enterprise adoption, given AI’s particular need for relevant data to ensure accuracy and trust. In the enterprise, the most relevant data is proprietary data, and proprietary data means a requirement for privacy, provenance, and audit protections.
How do you protect yourself and your partners while using proprietary data to train and use highly effective AI? There are two approaches. Most implementations use retrieval augmented generation (RAG), which augments the user prompt with relevant search results retrieved at query time and feeds that augmented prompt to an AI model already trained on publicly available data sets. If the search is an enterprise search, its results can remain proprietary and protected while still sharpening AI inference.
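The RAG flow described above can be sketched in a few lines. This is a minimal illustration, not Oracle’s or Cohere’s API: the keyword-overlap retriever stands in for a real enterprise search or vector database, and the document list is hypothetical sample data.

```python
# Minimal sketch of retrieval augmented generation (RAG).
# The retriever and documents here are illustrative stand-ins for a real
# enterprise search index; the augmented prompt would then be sent to a
# pre-trained LLM at inference time.

def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Rank proprietary documents by naive keyword overlap with the query."""
    q_terms = set(query.lower().split())
    scored = sorted(documents,
                    key=lambda d: len(q_terms & set(d.lower().split())),
                    reverse=True)
    return scored[:k]

def build_prompt(query: str, context: list[str]) -> str:
    """Augment the user prompt with retrieved context before inference."""
    ctx = "\n".join(f"- {c}" for c in context)
    return f"Answer using only this context:\n{ctx}\n\nQuestion: {query}"

# Hypothetical proprietary documents; they never leave the enterprise
# boundary -- only the augmented prompt reaches the model.
docs = ["Q3 revenue grew 12% in EMEA",
        "The H100 cluster handles model training",
        "Support tickets dropped 8% after the update"]
query = "How did EMEA revenue change?"
prompt = build_prompt(query, retrieve(query, docs))
```

The key point for privacy is in the last comment: the model itself is never retrained on the proprietary data; the data only passes through the prompt at inference time, where it can be governed and audited.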
RAG, and RAG-enabling capabilities such as vector database search, have been incorporated throughout the Oracle portfolio. OCI Supercluster is one of the key enablers of this approach, but as a component of other Oracle products and services rather than directly accessed by customers who use this first approach. One such Oracle offering is OCI’s Generative AI service, supporting large language models (LLMs) trained by Cohere.
The second approach to using proprietary data for AI, however, is the only approach if the AI application cannot be based on a readily available ML model. The organization must then train its own models, and to do so, it will need well-architected, private, secure infrastructure. Well-architected infrastructure in this case is a system built for ML training that is highly scalable, has very high bandwidth to memory and from node to node, and of course, offers low latency to massive quantities of training data available at high throughput. These features are precisely what OCI Supercluster offers.
A Bellwether of the AI Path Forward
AI can only advance by, sooner or later, enabling leading-edge adopters to push beyond the boundaries of generative AI and the LLMs that have so captured the market and popular imagination. LLMs are only one kind of AI model, after all. Classification, time-series forecasting, decision trees, recommendation/diagnosis, and predictive regression are examples of modeling opportunities surely less comprehensible to the general public but with great potential, especially in certain industries and lines of scientific research. These kinds of AI will progress and find their moment sometime, somewhere, and it is infrastructure such as OCI Supercluster, made available as private cloud services, that will allow enterprising organizations to pursue them.
This thinking is in the frame of the market as a whole and the maturing technologies that will enable its growth. Switching frames to the practical matter of an individual organization’s research and development strategy for AI, it is obvious that a supercharged, souped-up, super-big Supercluster is not needed for every ML model at every stage of its lifecycle. Oracle’s press release introduced two new infrastructure services for AI, only one of which is the bare-metal service mentioned earlier, running on the OCI Supercluster with its NVIDIA H100 Tensor Core GPUs numbering up to 16,000.
The other new AI service, also a bare-metal service but in this case suitable for more modest ML training as well as for AI inferencing (in other words, for running a trained AI model), does not run on the OCI Supercluster but rather on more conventional—if still advanced—infrastructure accelerated by NVIDIA L40S GPUs. Both the H100 and the L40S are next-generation NVIDIA GPUs, using the Hopper and Ada Lovelace microarchitectures, respectively. The L40S-based service is planned for launch within the next year, but Oracle did not announce in which regions it will first be available.
It is worth noting that both of these services, as bare-metal services, require significant technical, architectural, and operational expertise from customers to implement and use. What is not required is space, power, cooling, procurement, racking and cabling, system configuration and validation, hardware maintenance, or any other resource-sapping activities for IT. OCI Supercluster and these two services are a significant step forward, even if they do not warrant a new acronym like MLtaaS (ML training as a service) or goodness knows what.
Looking Ahead
Oracle’s introduction of the OCI Supercluster significantly elevates its competitive stance, particularly in the arena of AI/ML enterprise solutions. The Supercluster aims to address two pivotal challenges in AI adoption—cost and data privacy—by providing a robust, scalable infrastructure for enterprises to own their own AI. This approach positions Oracle uniquely in the market as it enables a higher degree of customization and control for businesses over their AI initiatives. It also offers a practical solution to the current supply chain issues plaguing the semiconductor industry.
Oracle’s multi-pronged approach—consisting of RAG and bespoke, in-house ML model training—offers flexibility for diverse enterprise needs. The integration of RAG across Oracle’s product line signifies a thoughtful alignment of its AI strategy, further amplified by the Generative AI service supported by Cohere. These capabilities make Oracle more than just an infrastructure provider; it becomes a full-stack AI solution provider.
Moreover, the company’s new offerings are not monolithic but span different scales and requirements. Although the OCI Supercluster focuses on high-end, compute-intensive training tasks with its next-gen NVIDIA H100 GPUs, Oracle also plans another, more modest service based on NVIDIA L40S GPUs. Both are designed as bare-metal services, indicating a shift away from resource-intensive on-premises solutions, which often impede AI implementation due to operational overhead.
This move by Oracle resonates on multiple levels. It not only represents an advance in technological capabilities but also marks a change in the market’s narrative. The hyperscale cloud provider space has long been dominated by the likes of AWS, Azure, and Google Cloud in terms of AI/ML capabilities. Oracle’s Supercluster, and its alignment with advanced GPU technology, positions it as a versatile, powerful alternative for enterprises seeking specialized, yet comprehensive, AI/ML solutions.
In summary, Oracle’s strategy and new offerings reflect a keen understanding of the complex challenges that enterprises face in AI adoption. The OCI Supercluster serves as both a technological and strategic milestone, potentially disrupting the current equilibrium among leading cloud service providers and setting a new precedent for customer-centric, versatile AI solutions.
Disclosure: The Futurum Group is a research and advisory firm that engages or has engaged in research, analysis, and advisory services with many technology companies, including those mentioned in this article. The author does not hold any equity positions with any company mentioned in this article.
Analysis and opinions expressed herein are specific to the analyst individually and data and other information that might have been provided for validation, not those of The Futurum Group as a whole.
Other Insights from The Futurum Group:
Oracle Database Analyst Summit: Powering the Multi-Cloud Era and Liberating Developers
Oracle Frees Database 23c to Power Universal Modern Apps and Analytics Innovation
Author Information
Guy is the CTO at Visible Impact, responsible for positioning, GTM, and sales guidance across technologies and markets. He has decades of field experience describing technologies, their business and community value, and how they are evaluated and acquired. Guy’s specialty areas include cloud, DevOps/cloud-native/12-factor, enterprise applications, Big Data, governance-risk-compliance, containerization, virtualization, HPC, CPUs-GPUs, and systems lifecycle management.
Guy started his technology career as a research director for technology media company Ziff Davis, with stints at PC Magazine, eWeek, and CIO Insight. Prior to joining Visible Impact, he worked at Dell, including postings in marketing, product, and technical marketing groups for a wide range of products, including engineered systems, cloud infrastructure, enterprise software, and mission-critical cloud services. He lives and works in Austin, TX.
Regarded as a luminary at the intersection of technology and business transformation, Steven Dickens is the Vice President and Practice Leader for Hybrid Cloud, Infrastructure, and Operations at The Futurum Group. With a distinguished track record as a Forbes contributor and a ranking among the Top 10 Analysts by ARInsights, Steven's unique vantage point enables him to chart the nexus between emergent technologies and disruptive innovation, offering unparalleled insights for global enterprises.
Steven's expertise spans a broad spectrum of technologies that drive modern enterprises. Notable among these are open source, hybrid cloud, mission-critical infrastructure, cryptocurrencies, blockchain, and FinTech innovation. His work is foundational in aligning the strategic imperatives of C-suite executives with the practical needs of end users and technology practitioners, serving as a catalyst for optimizing the return on technology investments.
Over the years, Steven has been an integral part of industry behemoths including Broadcom, Hewlett Packard Enterprise (HPE), and IBM. His exceptional ability to pioneer multi-hundred-million-dollar products and to lead global sales teams with revenues in the same echelon has consistently demonstrated his capability for high-impact leadership.
Steven serves as a thought leader in various technology consortiums. He was a founding board member and former Chairperson of the Open Mainframe Project, under the aegis of the Linux Foundation. His role as a Board Advisor continues to shape the advocacy for open source implementations of mainframe technologies.