Currently, most AI offerings are highly customized and designed to operate with specific hardware, either a particular vendor’s CPUs or a specialized hardware accelerator such as a GPU. Although the operational stacks in use vary from one environment to another, they share a common core and are adapted to each specific hardware requirement.
Today, the conversation around Generative-AI LLMs often revolves around their training and the methods for enhancing their capabilities. However, the true value of AI comes to light when we deploy it in production. This Proof of Concept focuses on the application of generative AI models to generate useful results. Here, the term ‘inferencing’ is used to describe the process of extracting results from an AI application.
Our latest Lab Insight Report, Dell POC for Scalable and Heterogeneous Gen-AI Platform, outlines a Proof of Concept that investigates the ability to perform scale-out inferencing for production and to use a similar inferencing software stack across heterogeneous CPU and GPU systems to accommodate different production requirements.
This Lab Insight Report highlights the following:
- A single CPU-based system can support multiple simultaneous, real-time sessions
- GPU-augmented clusters can support hundreds of simultaneous, real-time sessions
- A common AI inferencing software architecture is used across heterogeneous hardware
Designed to be industry-agnostic, this PoC provides an example of how to create a general-purpose generative AI solution that can use a variety of hardware options to meet specific Gen-AI application requirements. If you are interested in learning more, download your copy of Dell POC for Scalable and Heterogeneous Gen-AI Platform today. In addition, you can download our Executive Summary and our Infographic at the links below.
Download our Executive Summary & Infographic.
Author Information
Russ brings over 25 years of diverse experience in the IT industry to his role at The Futurum Group. As a partner at Evaluator Group, he built the highly successful lab practice, including IOmark benchmarking.
Prior to Evaluator Group, he worked as a Technology Evangelist and Storage Marketing Manager at Sun Microsystems. He was previously a technologist at Solbourne Computers in their test department and later moved to Fujitsu Computer Products. He started his tenure at Fujitsu as an engineer and later transitioned into IT administration and management.
Russ possesses a unique perspective on the industry through his experience as both a product marketer and an IT consumer.
A Colorado native, Russ holds a Bachelor of Science in Applied Math and Computer Science from the University of Colorado, Boulder, as well as a Master of Business Administration in International Business and Information Technology from the University of Colorado, Denver.