AWS re:Invent: AWS Unveils Next-Generation Graviton, Trainium Chips

The News: Amazon Web Services (AWS) launched the next generation of its Graviton and Trainium chips at re:Invent, improving performance that will help power in-memory databases, machine learning (ML) training, and generative AI applications. Read the full announcement on the AWS website.

Analyst Take: AWS used its annual re:Invent conference to double down on its custom silicon strategy with updates to its Graviton and Trainium offerings. AWS, Microsoft with Maia, and Google with its Tensor Processing Units (TPUs) are the pivotal players in the evolving landscape of custom silicon for AI workloads. AWS has iterated on custom silicon since the first Graviton processors arrived in 2018, followed by its Inferentia inference chips, giving enterprises cost-effective, high-performance options for optimizing AI workloads. Microsoft's Maia architecture integrates tightly with Azure, offering businesses a comprehensive ecosystem for AI-driven projects. Google's TPUs continue to deliver strong processing power and efficiency for complex AI models. These three tech giants are shaping the future of custom silicon, offering businesses the tools needed to excel at AI workloads.

The AWS-designed Graviton4 and Trainium2 chips will power workloads and applications running in Amazon Elastic Compute Cloud (Amazon EC2). Graviton4 is a general-purpose microprocessor for large workloads, and Trainium2 is an accelerator built for high-performance training of foundation models (FMs) and large language models (LLMs) with billions of parameters.

Performance Claims

The latest iterations of these chips deliver notable gains in both performance and power efficiency, per AWS's own figures. Graviton4 stands out, offering up to 30% better compute performance than its predecessor, Graviton3, along with 50% more cores and 75% more memory bandwidth for faster data access and manipulation.
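As a back-of-the-envelope illustration (ours, not an AWS benchmark), the stated figures can be combined under the naive assumption that the 30% compute gain applies per core and scales linearly across the 50% larger core count:

```python
# Naive model of Graviton4 vs. Graviton3, using only AWS's stated figures.
# Assumption (ours, not AWS's): the 30% compute gain is per core and
# scales linearly with the 50% core-count increase.
per_core_gain = 1.30     # up to 30% better compute performance
core_count_gain = 1.50   # 50% more cores
memory_bw_gain = 1.75    # 75% more memory bandwidth

naive_chip_uplift = per_core_gain * core_count_gain
print(f"Naive chip-level compute uplift: {naive_chip_uplift:.2f}x")  # ~1.95x
print(f"Memory bandwidth uplift: {memory_bw_gain:.2f}x")
```

Real-world gains will vary by workload; for memory-bound in-memory databases in particular, the 75% bandwidth increase may matter more than raw compute.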

Trainium's generation-to-generation advancement is also impressive. Trainium2 represents a significant leap forward in AI model training, designed to deliver up to a fourfold increase in training speed over first-generation Trainium chips, allowing for quicker iterations and more agile development. Furthermore, AWS's deployment of Trainium2 in EC2 UltraClusters that can scale up to 100,000 chips opens the door to training FMs and LLMs at unprecedented speeds while improving energy efficiency, a pivotal consideration in today's environmentally conscious computing landscape. These advancements in Graviton4 and Trainium2 underscore AWS's commitment to pushing the boundaries of performance and efficiency in custom silicon. While generational comparisons are interesting, we will be looking for more comparative tests from our Futurum Labs team before we pass judgment on the competitive landscape for these new offerings.

Use Cases

AWS claims to have built more than 2 million Graviton processors and has more than 50,000 customers—including the top 100 EC2 customers—using Graviton-based instances. These customers include Datadog, DirecTV, Discovery, Formula 1 (F1), NextRoll, Nielsen, Pinterest, SAP, Snowflake, Sprinklr, Stripe, and Zendesk. Graviton is supported by AWS managed services such as Amazon Aurora, Amazon ElastiCache, Amazon EMR, Amazon MemoryDB, Amazon OpenSearch, Amazon Relational Database Service (Amazon RDS), AWS Fargate, and AWS Lambda.

Graviton4 will be available in memory-optimized Amazon EC2 R8g instances, used for high-performance databases, in-memory caches, and big data analytics workloads. R8g instances provide up to 3x more vCPUs and 3x more memory than current generation R7g instances. Graviton4-powered R8g instances are available today in preview.
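For teams evaluating the R8g preview, a quick runtime check can confirm whether code is actually executing on an Arm64 (Graviton-family) CPU. This is a minimal sketch: it identifies the Arm64 architecture generally, not Graviton4 specifically, which would require querying instance metadata.

```python
import platform

def on_arm64() -> bool:
    """Return True when running on an Arm64 CPU (e.g., a Graviton instance).

    Note: this confirms only the CPU architecture; distinguishing
    Graviton4 from earlier Graviton generations requires checking the
    EC2 instance type via instance metadata.
    """
    return platform.machine().lower() in ("aarch64", "arm64")

print(on_arm64())
```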

Trainium2 will be available in Amazon EC2 Trn2 instances, containing 16 Trainium chips in a single instance. Trn2 instances can help customers scale up to 100,000 Trainium2 chips in next-generation EC2 UltraClusters, interconnected with AWS Elastic Fabric Adapter (EFA) petabit-scale networking. Their compute and scale can cut LLM training time considerably, making them a good fit for generative AI.
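The scale figures above imply some simple arithmetic. This sketch uses only numbers from the announcement (16 chips per Trn2 instance, up to 100,000 chips per UltraCluster, a claimed 4x training-speed gain); the 30-day training baseline is a hypothetical of ours, not an AWS figure.

```python
# Back-of-the-envelope on Trn2 scale, from the announcement's figures.
chips_per_trn2_instance = 16
max_ultracluster_chips = 100_000

instances_at_full_scale = max_ultracluster_chips // chips_per_trn2_instance
print(instances_at_full_scale)  # 6250 Trn2 instances at full cluster scale

# The claimed 4x speedup, assuming ideal scaling at equal chip count.
first_gen_training_days = 30  # hypothetical baseline, not an AWS figure
print(first_gen_training_days / 4)  # 7.5 days on Trainium2
```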

Looking Ahead

In the fiercely competitive custom silicon arena, AWS maintains a substantial time-in-market lead over Microsoft and Google, with more than 50,000 customers using Graviton-based instances. AWS's Graviton processors, now in their fourth generation, showcase a commitment to ongoing innovation that translates into better performance, lower latency, and greater efficiency for AWS customers. AWS also stands out for its flexibility, offering chips from AMD, Intel, and NVIDIA for EC2 workloads; we expect that mix of merchant and homegrown silicon to tilt toward homegrown over time, given how competitive AWS's own AI chips have become. In contrast, Microsoft and Google are still catching up, with AWS's established presence casting a long shadow. While both show promise, AWS's multi-generation lead, consistent improvement, and diverse chip ecosystem solidify its position as the leader in custom silicon for AI workloads.

Controlling more of the ecosystem, creating greater economies of scale, and becoming stickier through its own chips is good business for AWS. While this is not catastrophic for merchant silicon vendors, it is hard to see a scenario in which homegrown chips do not become more of a focus, a margin creator, and a lead go-to-market motion for AWS.

Disclosure: The Futurum Group is a research and advisory firm that engages or has engaged in research, analysis, and advisory services with many technology companies, including those mentioned in this article. The author does not hold any equity positions with any company mentioned in this article.

Analysis and opinions expressed herein are specific to the analyst individually and data and other information that might have been provided for validation, not those of The Futurum Group as a whole.

Other insights from The Futurum Group:

AWS Chip Lab Tour

AWS Serves Up NVIDIA GPUs for Short-Duration AI/ML Workloads

Microsoft’s Custom Silicon: A Game-Changer for AI and Cloud Computing?

Author Information

Daniel is the CEO of The Futurum Group. Living his life at the intersection of people and technology, Daniel works with the world’s largest technology brands exploring Digital Transformation and how it is influencing the enterprise.

From the leading edge of AI to global technology policy, Daniel makes the connections between business, people, and tech that are required for companies to benefit most from their technology investments. Daniel is a top 5 globally ranked industry analyst, and his ideas are regularly cited or shared through television appearances on CNBC and Bloomberg, in the Wall Street Journal, and across hundreds of other outlets around the world.

A seven-time best-selling author, most recently of “Human/Machine,” Daniel is also a Forbes and MarketWatch (Dow Jones) contributor.

Daniel holds an MBA and is a former graduate adjunct faculty member. An Austin, Texas transplant after 40 years in Chicago, he speaks around the world each year, sharing his vision of the role technology will play in our future.

Regarded as a luminary at the intersection of technology and business transformation, Steven Dickens is the Vice President and Practice Leader for Hybrid Cloud, Infrastructure, and Operations at The Futurum Group. With a distinguished track record as a Forbes contributor and a ranking among the Top 10 Analysts by ARInsights, Steven's unique vantage point enables him to chart the nexus between emergent technologies and disruptive innovation, offering unparalleled insights for global enterprises.

Steven's expertise spans a broad spectrum of technologies that drive modern enterprises. Notable among these are open source, hybrid cloud, mission-critical infrastructure, cryptocurrencies, blockchain, and FinTech innovation. His work is foundational in aligning the strategic imperatives of C-suite executives with the practical needs of end users and technology practitioners, serving as a catalyst for optimizing the return on technology investments.

Over the years, Steven has been an integral part of industry behemoths including Broadcom, Hewlett Packard Enterprise (HPE), and IBM. His exceptional ability to pioneer multi-hundred-million-dollar products and to lead global sales teams with revenues in the same echelon has consistently demonstrated his capability for high-impact leadership.

Steven serves as a thought leader in various technology consortiums. He was a founding board member and former Chairperson of the Open Mainframe Project, under the aegis of the Linux Foundation. His role as a Board Advisor continues to shape the advocacy for open source implementations of mainframe technologies.

Dave’s focus within The Futurum Group is concentrated in the rapidly evolving integrated infrastructure and cloud storage markets. Before joining the Evaluator Group, Dave spent 25 years as a technology journalist and covered enterprise storage for more than 15 years. He most recently worked for 13 years at TechTarget as Editorial Director and Executive News Editor for storage, data protection and converged infrastructure. In 2020, Dave won an American Society of Business Professional Editors (ASBPE) national award for column writing.

His previous jobs covering technology include news editor at Byte and Switch, managing editor of EdTech Magazine, and features and new products editor at Windows Magazine. Before turning to technology, he was an editor and sports reporter for United Press International in New York for 12 years. A New Jersey native, Dave currently lives in northern Virginia.

Dave holds a Bachelor of Arts in Communication and Journalism from William Paterson University.

