Amazon EC2 G7e Goes GA With Blackwell GPUs. What Changes for AI Inference?

Analyst(s): Nick Patience
Publication Date: January 27, 2026

Amazon has announced the general availability of EC2 G7e instances, powered by NVIDIA RTX PRO 6000 Blackwell Server Edition GPUs. The new instances target generative AI inference and graphics workloads, offering higher GPU memory, bandwidth, and networking capabilities compared to the prior G6e generation.

What is Covered in this Article:

  • Amazon’s launch of EC2 G7e instances powered by NVIDIA RTX PRO 6000 Blackwell Server Edition GPUs
  • Performance and architectural improvements over the previous G6e instance family
  • Supported workloads, instance configurations, and deployment options
  • Regional availability and purchasing models for EC2 G7e instances

The News: Amazon announced the general availability of Amazon Elastic Compute Cloud (EC2) G7e instances, accelerated by NVIDIA RTX PRO 6000 Blackwell Server Edition GPUs. The new G7e instances are optimized for generative AI and graphics-intensive workloads, delivering up to 2.3x higher inference performance than the prior G6e generation.

G7e instances support up to eight Blackwell GPUs with 96 GB of memory per GPU, up to 192 vCPUs, up to 1,600 Gbps of networking bandwidth, and up to 2,048 GiB of system memory. The instances are available today in the US East (N. Virginia) and US East (Ohio) regions and can be purchased as On-Demand Instances, Spot Instances, or with Savings Plans.

Analyst Take: Amazon’s introduction of EC2 G7e instances marks the latest expansion of its GPU-accelerated compute portfolio, centered on higher inference performance and expanded memory capacity. G7e instances are positioned to support generative AI inference, spatial computing, scientific computing, and mixed graphics-and-AI workloads using NVIDIA RTX PRO 6000 Blackwell Server Edition GPUs. Compared to G6e, the new instances emphasize increased GPU memory, higher memory bandwidth, and improved interconnect and networking capabilities. Amazon states that these changes enable customers to run medium-sized models of up to 70B parameters with FP8 precision on a single GPU.

Increased GPU Memory and Bandwidth

G7e instances double GPU memory and deliver 1.85x higher GPU memory bandwidth compared to G6e instances, according to Amazon. Each Blackwell GPU provides 96 GB of memory, enabling larger models to run on a single GPU without sharding. Amazon specifically notes that this configuration supports medium-sized models of up to 70B parameters using FP8 precision. This increase in on-device memory reduces reliance on multi-GPU partitioning for certain inference workloads. As a result, G7e targets workloads that benefit from higher memory density per GPU rather than solely raw compute throughput.
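Amazon's 70B-parameter claim can be sanity-checked with back-of-the-envelope arithmetic: model weights alone occupy roughly parameters times bytes-per-parameter of GPU memory, ignoring KV cache and activation overhead. A minimal sketch of that check:

```python
GPU_MEMORY_GB = 96  # per RTX PRO 6000 Blackwell GPU on G7e


def weight_memory_gb(params_billion: float, bytes_per_param: float) -> float:
    """Approximate footprint of model weights alone, in GB.

    Ignores KV cache, activations, and framework overhead, so real
    deployments need headroom beyond this figure.
    """
    # 1 billion params at 1 byte each is ~1 GB of weights
    return params_billion * bytes_per_param


# 70B model at FP8 (1 byte/param): 70 GB of weights, inside the 96 GB budget
print(weight_memory_gb(70, 1) <= GPU_MEMORY_GB)  # True
# The same model at FP16 (2 bytes/param) would exceed a single GPU
print(weight_memory_gb(70, 2) <= GPU_MEMORY_GB)  # False
```

The remaining ~26 GB at FP8 is what would be left for KV cache and runtime overhead, which is why Amazon frames 70B as the practical ceiling rather than a hard limit.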

Multi-GPU Scaling and Inter-GPU Communication

For workloads that exceed the capacity of a single GPU, G7e instances support NVIDIA GPUDirect Peer-to-Peer (P2P) over PCIe. Amazon highlights lower peer-to-peer latency for GPUs on the same PCIe switch and up to four times higher inter-GPU bandwidth compared to the L40S GPUs used in G6e instances. These improvements allow inference workloads to scale across multiple GPUs within a single node, supporting up to 768 GB of total GPU memory. Amazon positions this capability for larger models that require multi-GPU execution rather than single-GPU inference. The emphasis remains on reducing communication overhead within a node rather than across clusters.
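The implied scaling arithmetic is straightforward: divide the weight footprint by per-GPU memory and round up, capped at the eight GPUs (8 × 96 GB = 768 GB) in a single node. A rough sketch, where the 405B figure is an illustrative model size rather than one Amazon names:

```python
import math

GPU_MEMORY_GB = 96     # per GPU on G7e
MAX_GPUS_PER_NODE = 8  # 8 x 96 GB = 768 GB aggregate per node


def gpus_needed(weight_gb: float) -> int:
    """Minimum GPUs to hold the weights, ignoring KV cache and activations."""
    n = math.ceil(weight_gb / GPU_MEMORY_GB)
    if n > MAX_GPUS_PER_NODE:
        raise ValueError("exceeds a single G7e node; multi-node execution required")
    return n


print(gpus_needed(70))   # 1 -- 70B at FP8 stays on one GPU
print(gpus_needed(405))  # 5 -- e.g. a 405B model at FP8 shards across five GPUs
```

In practice an inference server would also reserve memory per GPU for KV cache, so real tensor-parallel degrees tend to be higher than this lower bound.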

Networking and Multi-Node Capabilities

G7e instances offer four times the networking bandwidth of G6e, enabling support for small-scale multi-node workloads. Multi-GPU configurations support NVIDIA GPUDirect RDMA with Elastic Fabric Adapter (EFA), reducing latency for GPU-to-GPU communication across nodes. Amazon also states that G7e supports NVIDIA GPUDirect Storage with Amazon FSx for Lustre, delivering up to 1.2 Tbps of throughput for faster model loading. These capabilities extend G7e beyond single-node inference into limited multi-node scenarios. However, Amazon frames these improvements as incremental enhancements rather than a shift toward large-scale distributed training.
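The quoted 1.2 Tbps of storage throughput works out to roughly 150 GB/s, which puts a floor under model-load times. A quick sanity check, assuming the storage link is the bottleneck and is sustained at peak (which real workloads rarely achieve):

```python
def load_time_seconds(model_gb: float, link_tbps: float = 1.2) -> float:
    """Lower-bound load time: model bytes over a fully saturated link."""
    bytes_per_second = link_tbps * 1e12 / 8  # Tbps -> bytes/s (150 GB/s at 1.2 Tbps)
    return model_gb * 1e9 / bytes_per_second


# 70 GB of FP8 weights over 1.2 Tbps
print(round(load_time_seconds(70), 2))   # 0.47 seconds
# Filling the full 768 GB of node GPU memory
print(round(load_time_seconds(768), 2))  # 5.12 seconds
```

Sub-second weight loading at this scale is what makes fast cold starts and model swapping plausible, which matters more for inference fleets than for long-running training jobs.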

Instance Configurations and Deployment Options

Amazon offers six G7e instance sizes, ranging from the single-GPU g7e.2xlarge to the eight-GPU g7e.48xlarge configuration. At the high end, instances support 192 vCPUs, 2 TB of system memory, and up to 15.2 TB of local NVMe SSD storage. G7e instances can be deployed using the AWS Management Console, the AWS CLI, or AWS SDKs, and are supported on Amazon ECS, Amazon EKS, and AWS Parallel Computing Service, with Amazon SageMaker support coming soon. The breadth of configurations suggests Amazon is targeting a wide range of inference and graphics use cases rather than a narrow workload profile. Overall, G7e extends Amazon's EC2 GPU lineup with higher memory density and networking capacity rather than redefining its compute strategy.

What to Watch:

  • Adoption of G7e instances for single-GPU versus multi-GPU inference workloads
  • Customer uptake of GPUDirect P2P and RDMA features for multi-GPU configurations
  • Expansion of G7e regional availability beyond the US East regions
  • Timeline for Amazon SageMaker AI support for G7e instances

See the complete blog on the general availability of Amazon EC2 G7e instances accelerated by NVIDIA RTX PRO 6000 Blackwell Server Edition GPUs on the Amazon website.

Declaration of generative AI and AI-assisted technologies in the writing process: This content has been generated with the support of artificial intelligence technologies. Due to the fast pace of content creation and the continuous evolution of data and information, The Futurum Group and its analysts strive to ensure the accuracy and factual integrity of the information presented. However, the opinions and interpretations expressed in this content reflect those of the individual author/analyst. The Futurum Group makes no guarantees regarding the completeness, accuracy, or reliability of any information contained herein. Readers are encouraged to verify facts independently and consult relevant sources for further clarification.

Disclosure: Futurum is a research and advisory firm that engages or has engaged in research, analysis, and advisory services with many technology companies, including those mentioned in this article. The author does not hold any equity positions with any company mentioned in this article.

Analysis and opinions expressed herein are specific to the analyst individually and data and other information that might have been provided for validation, not those of Futurum as a whole.

Other insights from Futurum:

AWS European Sovereign Cloud Debuts with Independent EU Infrastructure

Amazon Q3 FY 2025 Earnings: AWS Reaccelerates, Retail and Ads Grow

AWS re:Invent 2025: Wrestling Back AI Leadership

Author Information

Nick Patience is VP and Practice Lead for AI Platforms at The Futurum Group. Nick is a thought leader on AI development, deployment, and adoption - an area he has researched for 25 years. Before Futurum, Nick was a Managing Analyst with S&P Global Market Intelligence, responsible for 451 Research’s coverage of Data, AI, Analytics, Information Security, and Risk. Nick became part of S&P Global through its 2019 acquisition of 451 Research, a pioneering analyst firm that Nick co-founded in 1999. He is a sought-after speaker and advisor, known for his expertise in the drivers of AI adoption, industry use cases, and the infrastructure behind its development and deployment. Nick also spent three years as a product marketing lead at Recommind (now part of OpenText), a machine learning-driven eDiscovery software company. Nick is based in London.
