Amazon EC2 P6e-GB200 UltraServers and P6-B200 instances

The highest GPU performance for AI training and inference

Why Amazon EC2 P6e-GB200 UltraServers and P6-B200 instances?

Amazon Elastic Compute Cloud (Amazon EC2) P6e-GB200 UltraServers, accelerated by NVIDIA GB200 NVL72, offer the highest GPU performance in Amazon EC2. They feature over 20x compute and over 11x memory under NVIDIA NVLink™ compared to P5en instances. P6e-GB200 UltraServers are ideal for the most compute- and memory-intensive AI workloads, such as training and deploying frontier models at the trillion-parameter scale.

Amazon EC2 P6-B200 instances, accelerated by NVIDIA Blackwell GPUs, are an ideal option for medium-to-large scale training and inference applications. They offer up to 2x performance compared to P5en instances for AI training and inference.

P6e-GB200 UltraServers and P6-B200 instances enable faster training for next-generation AI models and improve performance for real-time inference in production. You can use P6e-GB200 UltraServers and P6-B200 instances to train frontier foundation models (FMs), such as mixture-of-experts and reasoning models, and deploy them in generative and agentic AI applications such as content generation, enterprise copilots, and deep research agents.

Benefits

With P6e-GB200 UltraServers, customers can access up to 72 Blackwell GPUs within one NVLink domain to use 360 petaflops of FP8 compute (without sparsity) and 13.4 TB of total high-bandwidth memory (HBM3e). P6e-GB200 UltraServers provide up to 130 terabytes per second of low-latency NVLink connectivity between GPUs and up to 28.8 terabits per second of total Elastic Fabric Adapter (EFAv4) networking for AI training and inference. The P6e-GB200 UltraServer architecture gives customers a step-change improvement in compute and memory, with up to 20x GPU TFLOPS, 11x GPU memory, and 15x aggregate GPU memory bandwidth under NVLink compared to P5en instances.

P6-B200 instances provide 8 NVIDIA Blackwell GPUs with 1,440 GB of high-bandwidth GPU memory, 5th generation Intel Xeon Scalable processors (Emerald Rapids), 2 TiB of system memory, up to 14.4 TB/s of total bidirectional NVLink bandwidth, and 30 TB of local NVMe storage. These instances offer up to 2.25x GPU TFLOPS, 1.27x GPU memory size, and 1.6x GPU memory bandwidth compared to P5en instances.
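
As a quick back-of-the-envelope check, the per-GPU figures fall out of these totals. The short Python sketch below only restates the numbers quoted on this page; it does not query AWS.

```python
# Sanity-check the headline P6e-GB200 and P6-B200 figures quoted above.
ULTRASERVER_GPUS = 72      # Blackwell GPUs in one NVLink domain
FP8_PFLOPS_TOTAL = 360     # petaflops of FP8 compute, without sparsity
HBM3E_TB_TOTAL = 13.4      # total high-bandwidth memory (TB)
EFA_GBPS_PER_GPU = 400     # EFAv4 bandwidth per GPU (Gbps)

print(FP8_PFLOPS_TOTAL / ULTRASERVER_GPUS)         # 5.0 petaflops of FP8 per GPU
print(HBM3E_TB_TOTAL / ULTRASERVER_GPUS * 1000)    # ~186 GB of HBM3e per GPU
print(ULTRASERVER_GPUS * EFA_GBPS_PER_GPU / 1000)  # 28.8 Tbps per UltraServer
print(8 * EFA_GBPS_PER_GPU / 1000)                 # 3.2 Tbps per P6-B200 instance
```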

P6e-GB200 UltraServers and P6-B200 instances are powered by the AWS Nitro System with specialized hardware and firmware designed to enforce restrictions so that no one, including anyone at AWS, can access your sensitive AI workloads and data. The Nitro System, which handles networking, storage, and other I/O functions, can deploy firmware updates, bug fixes, and optimizations while it remains operational. This increases stability and reduces downtime, which is critical to meeting training timelines and running AI applications in production.

To enable efficient distributed training, P6e-GB200 UltraServers and P6-B200 instances use fourth-generation Elastic Fabric Adapter (EFAv4) networking. EFAv4 uses the Scalable Reliable Datagram (SRD) protocol to intelligently route traffic across multiple network paths, maintaining smooth operation even during congestion or failures.
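
At launch time, EFA is requested as a network interface of type "efa". The boto3 sketch below is a minimal illustration; the AMI, subnet, security group, and placement group identifiers are placeholder assumptions.

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")  # Region is an assumption

# Launch a P6-B200 instance with an EFA interface on device index 0.
# The AMI, subnet, security group, and placement group are placeholders.
response = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",           # e.g. an AWS Deep Learning AMI
    InstanceType="p6-b200.48xlarge",
    MinCount=1,
    MaxCount=1,
    NetworkInterfaces=[{
        "DeviceIndex": 0,
        "InterfaceType": "efa",                # request an Elastic Fabric Adapter
        "SubnetId": "subnet-0123456789abcdef0",
        "Groups": ["sg-0123456789abcdef0"],
    }],
    Placement={"GroupName": "my-training-pg"}, # cluster placement group (assumed)
)
print(response["Instances"][0]["InstanceId"])
```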

P6e-GB200 UltraServers and P6-B200 instances are deployed in Amazon EC2 UltraClusters that enable scaling up to tens of thousands of GPUs within a petabit-scale nonblocking network.

Features

Each NVIDIA Blackwell GPU features a second-generation Transformer Engine and supports new precision formats such as FP4. It supports fifth-generation NVLink, a faster, wider interconnect delivering up to 1.8 TB/s of bandwidth per GPU.

The Grace Blackwell Superchip, a key component of P6e-GB200, connects two high-performance NVIDIA Blackwell GPUs and an NVIDIA Grace CPU using the NVIDIA NVLink-C2C interconnect. Each Superchip delivers 10 petaflops of FP8 compute (without sparsity) and up to 372 GB of HBM3e. With the superchip architecture, two GPUs and one CPU are co-located within one compute module, increasing bandwidth between GPU and CPU by an order of magnitude compared to current-generation P5en instances.

P6e-GB200 UltraServers and P6-B200 instances provide 400 Gbps of EFAv4 networking per GPU, for a total of 28.8 Tbps per P6e-GB200 UltraServer and 3.2 Tbps per P6-B200 instance.

P6e-GB200 UltraServers and P6-B200 instances support Amazon FSx for Lustre file systems so you can access data at the hundreds of GB/s of throughput and millions of IOPS required for large-scale AI training and inference. P6e-GB200 UltraServers support up to 405 TB of local NVMe SSD storage, while P6-B200 instances support up to 30 TB of local NVMe SSD storage for fast access to large datasets. You can also use virtually unlimited, cost-effective storage with Amazon Simple Storage Service (Amazon S3).
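
As an illustration, a persistent FSx for Lustre file system can be created with a few lines of boto3; the subnet ID and sizing below are placeholder assumptions. A data repository association can then link the file system to an S3 bucket so training data lands on the fast file system.

```python
import boto3

fsx = boto3.client("fsx", region_name="us-east-1")  # Region is an assumption

# Create a persistent FSx for Lustre file system for training data.
# The subnet ID, capacity, and throughput tier are illustrative placeholders.
response = fsx.create_file_system(
    FileSystemType="LUSTRE",
    StorageCapacity=4800,                 # GiB, in Lustre-supported increments
    SubnetIds=["subnet-0123456789abcdef0"],
    LustreConfiguration={
        "DeploymentType": "PERSISTENT_2",
        "PerUnitStorageThroughput": 250,  # MB/s per TiB of storage
    },
)
print(response["FileSystem"]["FileSystemId"])
```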

Product Details

Instance types

| Instance Size | Blackwell GPUs | GPU memory (GB) | vCPUs | System memory (GiB) | Instance storage (TB) | Network bandwidth (Gbps) | EBS bandwidth (Gbps) | Available in EC2 UltraServers |
|---|---|---|---|---|---|---|---|---|
| p6-b200.48xlarge | 8 | 1,440 HBM3e | 192 | 2,048 | 8 x 3.84 | 8 x 400 | 100 | No |
| p6e-gb200.36xlarge | 4 | 740 HBM3e | 144 | 960 | 3 x 7.5 | 4 x 400 | 60 | Yes |

P6e-GB200 instances are only available in UltraServers.

UltraServer types

| UltraServer type | Blackwell GPUs | GPU memory (GB) | vCPUs | System memory (GiB) | UltraServer storage (TB) | Aggregate EFA bandwidth (Gbps) | EBS bandwidth (Gbps) | Available in EC2 UltraServers |
|---|---|---|---|---|---|---|---|---|
| u-p6e-gb200x72 | 72 | 13,320 | 2,592 | 17,280 | 405 | 28,800 | 1,080 | Yes |
| u-p6e-gb200x36 | 36 | 6,660 | 1,296 | 8,640 | 202.5 | 14,400 | 540 | Yes |

Customer testimonials

Here are some examples of how customers and partners have achieved their business goals with Amazon EC2 P6e-GB200 UltraServers and P6-B200 instances.

JetBrains

“We are extensively using Amazon EC2 P5en instances and are excited about the launch of the P6 and P6e instances featuring NVIDIA Blackwell GPUs, which promise substantial performance improvements. Preliminary out-of-the-box evaluations indicated >85% faster training times on P6-B200 over H200-based P5en instances across our ML pipelines, with further optimizations expected to deliver even greater gains. This advancement will help us build outstanding products for our customers.”

Vladislav Tankov, Director of AI, JetBrains


Getting started with ML use cases

Amazon SageMaker is a fully managed service for building, training, and deploying ML models. With Amazon SageMaker HyperPod (P6-B200 support coming soon), you can scale to tens, hundreds, or thousands of GPUs to train a model quickly, without worrying about setting up and managing resilient training clusters.
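
The sketch below shows the general shape of a training job with the SageMaker Python SDK. The container image, role ARN, and the ml.p6-b200.48xlarge instance type name are assumptions; check the SageMaker documentation for the instance types supported in your Region.

```python
from sagemaker.estimator import Estimator

# A minimal SageMaker training-job sketch. The image URI, role ARN, and
# instance type name below are assumptions, not values from this page.
estimator = Estimator(
    image_uri="<your-training-image-uri>",
    role="arn:aws:iam::111122223333:role/SageMakerExecutionRole",
    instance_count=2,                     # scale out for distributed training
    instance_type="ml.p6-b200.48xlarge",  # assumed SageMaker name for P6-B200
)
estimator.fit({"train": "s3://amzn-s3-demo-bucket/training-data/"})
```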

AWS Deep Learning AMIs (DLAMI) provide ML practitioners and researchers with the infrastructure and tools to accelerate deep learning in the cloud at any scale. AWS Deep Learning Containers are Docker images preinstalled with DL frameworks that streamline the deployment of custom ML environments by letting you skip the complicated process of building and optimizing environments from scratch.

If you prefer to manage your own containerized workloads through container orchestration services, you can deploy P6-B200 instances with Amazon Elastic Kubernetes Service (Amazon EKS) or Amazon Elastic Container Service (Amazon ECS).
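
For example, an ECS task definition can reserve a P6-B200 instance's GPUs through a GPU resource requirement. The family name and container image below are placeholders.

```python
import boto3

ecs = boto3.client("ecs", region_name="us-east-1")  # Region is an assumption

# Register a task definition that reserves all 8 GPUs on a P6-B200
# container instance. The family name and image are placeholders.
response = ecs.register_task_definition(
    family="blackwell-training",
    requiresCompatibilities=["EC2"],
    containerDefinitions=[{
        "name": "trainer",
        "image": "<your-training-image-uri>",
        "memory": 8192,                     # MiB reserved for the container
        "resourceRequirements": [
            {"type": "GPU", "value": "8"},  # one per Blackwell GPU
        ],
    }],
)
print(response["taskDefinition"]["taskDefinitionArn"])
```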

P6e-GB200 UltraServers will also be available through NVIDIA DGX Cloud, a fully managed environment with NVIDIA’s complete AI software stack. You get NVIDIA’s latest optimizations, benchmarking recipes, and technical expertise.
