Amazon EC2 P6-B200 instances

Get high performance for AI training, inference, and HPC workloads

Why Amazon EC2 P6-B200 instances?

Amazon EC2 P6-B200 instances, accelerated by NVIDIA Blackwell GPUs, offer up to 2x the performance of P5en instances for AI training and inference. They enable faster training of next-generation AI models and improve performance for real-time inference in production workloads. P6-B200 instances are an ideal option for medium-to-large-scale training and inference applications that use reasoning models and agentic AI.

Benefits

P6-B200 instances provide 8x NVIDIA Blackwell GPUs with 1,440 GB of high-bandwidth GPU memory, 5th Generation Intel Xeon Scalable processors (Emerald Rapids), 2 TiB of system memory, and 30 TB of local NVMe storage. Compared to P5en instances, these Blackwell-based instances deliver up to a 125% increase in GPU TFLOPs, a 27% increase in GPU memory size, and a 60% increase in GPU memory bandwidth.
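As a back-of-envelope check, the stated memory uplift can be reproduced from published P5en specifications. The P5en baseline figure below (8x NVIDIA H200 at 141 GB each) is an assumption drawn from public P5en documentation, not from this page:

```python
# Sanity-check the ~27% GPU memory uplift claim against an assumed P5en baseline.
p6_gpu_memory_gb = 1440          # stated: total HBM3e across 8 Blackwell GPUs
p5en_gpu_memory_gb = 8 * 141     # assumption: 8x H200 at 141 GB each = 1,128 GB

memory_uplift = p6_gpu_memory_gb / p5en_gpu_memory_gb - 1
print(f"GPU memory uplift vs. P5en: {memory_uplift:.0%}")  # ~28%, in line with the ~27% claim
```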

P6-B200 instances are powered by the AWS Nitro System with specialized hardware and firmware designed to enforce restrictions so that nobody, including anyone in AWS, can access your sensitive AI workloads and data. Nitro live update can deploy firmware updates, bug fixes, and optimizations to Nitro Cards while the system remains operational. This increases stability and reduces downtime, critical to meeting training timelines and running AI applications in production.

To enable efficient distributed training, P6-B200 instances provide 3.2 terabits per second of fourth-generation Elastic Fabric Adapter (EFAv4) networking. These instances are deployed in Amazon EC2 UltraClusters that enable scaling up to tens of thousands of GPUs within a petabit-scale nonblocking network.

P6-B200 instances improve time to train and cost to train, enabling model providers to accelerate time-to-market for bigger and more performant models. P6-B200 instances support a broad range of AI and HPC workloads, from deep learning training and inference to scientific simulations and computer vision applications. They are ideal for medium- to large-scale training and inference workloads.

Features

P6-B200 instances provide up to 8x NVIDIA Blackwell GPUs with 1,440 GB of high-bandwidth GPU memory. They offer up to a 125% improvement in GPU TFLOPs, a 27% increase in GPU memory size, and a 60% increase in GPU memory bandwidth compared to P5en instances.

P6-B200 instances deliver up to 3.2 terabits per second of EFAv4 networking and 1,800 GB/s of GPU-to-GPU interconnect via NVLink.
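To put the two interconnects in perspective, here is an illustrative comparison of how long a 10 GB tensor would take to move over each link, assuming the peak figures quoted above were fully achievable (real-world throughput is lower):

```python
# Rough transfer-time comparison: intra-node NVLink vs. inter-node EFAv4.
tensor_gb = 10                       # hypothetical tensor size for illustration
nvlink_gb_per_s = 1800               # intra-node GPU-to-GPU (NVLink), per the page
efa_gbit_per_s = 3200                # inter-node instance bandwidth (EFAv4)
efa_gb_per_s = efa_gbit_per_s / 8    # 3.2 Tbps = 400 GB/s

print(f"NVLink: {tensor_gb / nvlink_gb_per_s * 1e3:.1f} ms")  # ~5.6 ms
print(f"EFAv4:  {tensor_gb / efa_gb_per_s * 1e3:.1f} ms")     # ~25.0 ms
```

The roughly 4.5x gap is why distributed training frameworks keep as much communication as possible inside a node and reserve the inter-node fabric for less frequent, larger collectives.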

P6-B200 instances support Amazon FSx for Lustre file systems, so you can access data with the hundreds of GB/s of throughput and millions of IOPS required for large-scale DL and HPC workloads. P6-B200 instances support up to 30 TB of local NVMe SSD storage for fast access to large datasets. You can also use virtually unlimited, cost-effective storage with Amazon Simple Storage Service (Amazon S3).
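A quick sketch of how storage throughput bounds data-ingest time. The 100 GB/s figure is an assumption for illustration; actual FSx for Lustre throughput depends on file system size and configuration:

```python
# Lower bound on time to read a full dataset at an assumed aggregate throughput.
dataset_tb = 30                  # e.g., a dataset filling the local NVMe capacity
throughput_gb_per_s = 100        # assumed aggregate read throughput ("hundreds of GB/s")

seconds = dataset_tb * 1000 / throughput_gb_per_s
print(f"Full sequential read: {seconds:.0f} s (~{seconds / 60:.0f} min)")  # 300 s (~5 min)
```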

Product Details

Instance size: p6-b200.48xlarge
Available in EC2 UltraServers: No
Blackwell GPUs: 8
GPU memory: 1,440 GB HBM3e
Memory: 2 TiB
vCPUs: 192
Instance storage: 8 x 3.84 TB
Network bandwidth: 8 x 400 Gbps
EBS bandwidth: 100 Gbps

Getting started with ML use cases

Amazon SageMaker is a fully managed service for building, training, and deploying ML models. With Amazon SageMaker HyperPod (P6-B200 support coming soon), you can more easily scale to tens, hundreds, or thousands of GPUs to train a model quickly at any scale without worrying about setting up and managing resilient training clusters.

AWS Deep Learning AMIs (DLAMI) provides ML practitioners and researchers with the infrastructure and tools to accelerate DL in the cloud, at any scale. AWS Deep Learning Containers are Docker images preinstalled with DL frameworks to streamline the deployment of custom ML environments by letting you skip the complicated process of building and optimizing your environments from scratch.

If you prefer to manage your own containerized workloads through container orchestration services, you can deploy P6-B200 instances with Amazon Elastic Kubernetes Service (Amazon EKS) or Amazon Elastic Container Service (Amazon ECS).

Getting started with HPC use cases

P6-B200 instances are ideal for running engineering simulations, computational finance, seismic analysis, molecular modeling, genomics, and other GPU-based HPC workloads. HPC applications often require high network performance, fast storage, large amounts of memory, high compute capability, or all of the above. P6-B200 instances support Elastic Fabric Adapter (EFA), which enables HPC applications using the Message Passing Interface (MPI) to scale to thousands of GPUs. AWS Batch and AWS ParallelCluster help HPC developers quickly build and scale distributed HPC applications.
