Amazon EC2 G7e instances

High-performance NVIDIA GPU-based instances for AI inference, scientific computing, and spatial computing workloads

Why Amazon EC2 G7e instances?

Amazon Elastic Compute Cloud (Amazon EC2) G7e instances, accelerated by NVIDIA RTX PRO™ 6000 Blackwell Server Edition GPUs, deliver cost-effective performance for generative AI inference workloads and the highest performance for spatial computing workloads. Compared to G6e instances, they offer 2x the GPU memory (96 GB), 1.85x the GPU memory bandwidth, up to 4x the inter-GPU communication bandwidth, and 4x the Elastic Fabric Adapter (EFA) networking bandwidth. G7e instances offer up to 2.3x the inference performance of G6e instances.

Customers can use G7e instances to deploy large language models (LLMs), agentic AI models, multimodal generative AI models, and physical AI models. Additionally, G7e instances can accelerate a broad range of other workloads, including spatial computing and scientific computing.

G7e instances feature up to 8 NVIDIA RTX PRO 6000 Blackwell Server Edition GPUs with 768 GB of total GPU memory (96 GB of memory per GPU) and 5th generation Intel Xeon Scalable (Emerald Rapids) processors. G7e instances support up to 192 vCPUs, up to 1600 Gbps of networking bandwidth with EFA, up to 2 TiB of system memory, and up to 15.2 TB of local NVMe SSD storage.

Benefits

G7e instances offer up to 2.3x the inference performance of G6e instances. NVIDIA RTX PRO 6000 Blackwell Server Edition GPUs offer 1.85x the GPU memory bandwidth of G6e instances, enabling customers to deploy real-time agentic AI and multimodal AI inference workloads. These instances offer up to 4x the CPU-to-GPU bandwidth of G6e instances, improving inference performance for recommender and Retrieval-Augmented Generation (RAG) workloads. Additionally, the higher GPU-to-GPU bandwidth and support for NVIDIA GPUDirect P2P via PCIe enable G7e instances to run inference for larger models across multiple GPUs.
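As a quick illustration, peer-to-peer (P2P) connectivity between GPUs on a multi-GPU instance can be verified with PyTorch; this is a minimal sketch, not AWS-provided code, and assumes PyTorch with CUDA support is installed:

```python
import torch

# Enumerate visible GPUs and check pairwise peer-to-peer (P2P) access.
# With P2P over PCIe enabled, tensors can move between GPUs without
# staging through host memory.
count = torch.cuda.device_count()
for src in range(count):
    for dst in range(count):
        if src != dst:
            ok = torch.cuda.can_device_access_peer(src, dst)
            print(f"GPU {src} -> GPU {dst}: P2P {'enabled' if ok else 'disabled'}")

# Direct device-to-device copy; with P2P this avoids a host round trip.
if count >= 2:
    x = torch.randn(1024, 1024, device="cuda:0")
    y = x.to("cuda:1")
    print(y.device)
```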

NVIDIA RTX PRO 6000 Blackwell Server Edition GPUs feature fourth-generation NVIDIA ray tracing (RT) cores built to use neural graphics-based technologies and new streaming processors optimized for neural shaders. G7e instances offer 1.7x the RT core TFLOPS of G6e instances and deliver the highest performance for spatial computing workloads, as well as for workloads that combine graphics and AI, such as robotic simulation, avatar-based chat assistants, and digital twins. G7e instances offer cost-effective, high performance for customers whose applications need both graphics and AI.

G7e instances offer up to 1.27x the TFLOPS, 2x the GPU memory, and up to 4x the GPU-to-GPU bandwidth of G6e instances. This makes them well suited for cost-efficient single-node fine-tuning or training of natural language processing (NLP), computer vision, and smaller generative AI models. Additionally, these instances offer 4x the EFA networking bandwidth (1600 Gbps) of G6e instances and support NVIDIA GPUDirect RDMA on multi-GPU instance sizes, enabling customers to use G7e instances for small-scale multi-node fine-tuning and training, as sketched below.
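For orientation, the following is a minimal sketch (not AWS-provided code) of the distributed setup such fine-tuning jobs typically use, assuming PyTorch with the NCCL backend and launch via torchrun:

```python
import os

import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

# Minimal DistributedDataParallel (DDP) skeleton for fine-tuning across
# GPUs and nodes. NCCL uses GPUDirect P2P within a node and can use EFA
# across nodes via the aws-ofi-nccl plugin. Launch with, for example:
#   torchrun --nnodes=2 --nproc-per-node=8 train.py
dist.init_process_group(backend="nccl")
local_rank = int(os.environ["LOCAL_RANK"])
torch.cuda.set_device(local_rank)

model = torch.nn.Linear(4096, 4096).cuda(local_rank)  # stand-in for a real model
model = DDP(model, device_ids=[local_rank])

# ... standard optimizer/training loop; DDP all-reduces gradients over NCCL ...

dist.destroy_process_group()
```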

G7e instances are built on the AWS Nitro System, a combination of dedicated hardware and a lightweight hypervisor that delivers practically all of the compute and memory resources of the host hardware to your instances for better overall performance. With G7e instances, the Nitro System provisions the GPUs in pass-through mode, providing performance comparable to bare metal. The AWS Nitro System includes specialized hardware and firmware designed to enforce restrictions so that no one, including anyone at AWS, can access your sensitive AI workloads and data. Additionally, the Nitro System, which handles networking, storage, and other I/O functions, can deploy firmware updates, bug fixes, and optimizations while it remains operational. This increases stability and reduces downtime, which is important for deploying AI applications in production.

Features

G7e instances feature up to 8 NVIDIA RTX PRO 6000 Blackwell Server Edition GPUs. Each GPU offers 96 GB of GDDR7 memory with 1597 GB/s of memory bandwidth. These GPUs come with fifth-generation NVIDIA Tensor Cores that support FP4 precision for faster AI performance and reduced GPU memory usage, fourth-generation NVIDIA ray tracing cores built to leverage neural graphics-based technologies such as RTX Mega Geometry, and new streaming processors that integrate neural networks inside programmable shaders. Additionally, each GPU features 4 ninth-generation NVENC and 4 sixth-generation NVDEC engines with support for 4:2:2 encoding and decoding. The GPUs also support DLSS 4 multi-frame generation technology.

G7e instances support up to 1600 Gbps of network bandwidth with EFA and cluster placement groups, 4x the networking bandwidth of G6e instances. Multi-GPU G7e instances support NVIDIA GPUDirect RDMA with EFAv4 in EC2 UltraClusters, reducing latency for small-scale multi-node workloads compared to G6e instances. G7e instances also support NVIDIA GPUDirect P2P via PCIe, meeting the low-latency needs of machine learning inference and graphics-intensive applications that require multiple GPUs.

G7e instances offer up to 2048 GiB of system memory and up to 15.2 TB of local NVMe SSD storage. This enables local storage of large models and datasets for machine learning inference, fine-tuning, and training. G7e instances offer up to 4x the bandwidth between the GPU and local storage compared to G6e instances, enabling customers to load or swap models quickly from the local storage to the GPU memory. Additionally, multi-GPU G7e instances support NVIDIA GPUDirect Storage with FSx for Lustre, increasing throughput to the instances compared to G6e instances.
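As a simple illustration, model weights staged on the local NVMe volumes can be loaded directly into GPU memory; this sketch assumes PyTorch is installed and that an instance store volume has been formatted and mounted at /opt/nvme (a placeholder path):

```python
import time

import torch

# Sketch: load a checkpoint from instance-store NVMe straight into GPU
# memory. The mount point and file name below are placeholders; instance
# store volumes must be formatted and mounted before use.
CKPT = "/opt/nvme/checkpoints/model.pt"

t0 = time.time()
state_dict = torch.load(CKPT, map_location="cuda:0")
print(f"Loaded {CKPT} in {time.time() - t0:.1f}s")
```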

G7e instances offer NVIDIA RTX Enterprise and gaming drivers to customers at no additional cost. NVIDIA RTX Enterprise drivers can be used to provide high quality virtual workstations for a wide range of graphics-intensive workloads. NVIDIA gaming drivers provide unparalleled graphics and compute support for game development. G7e instances also support CUDA, cuDNN, NVENC, TensorRT, cuBLAS, OpenCL, DirectX 11/12, Vulkan 1.3, and OpenGL 4.6 libraries.
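As a sanity check after launching an instance, the driver and CUDA stack can be verified from Python; this is a minimal sketch assuming PyTorch is installed:

```python
import torch

# Quick check that the NVIDIA driver and CUDA toolkit are visible.
print(torch.version.cuda)           # CUDA version PyTorch was built against
print(torch.cuda.is_available())    # True if the driver is working
props = torch.cuda.get_device_properties(0)
print(props.name, f"{props.total_memory / 1e9:.0f} GB")  # roughly 96 GB per GPU
```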

Product Details

Instance types

| Instance size | GPUs | GPU memory (GB) | vCPUs | System memory (GiB) | Instance storage (TB) | EBS bandwidth (Gbps) | Network bandwidth (Gbps) |
|---|---|---|---|---|---|---|---|
| g7e.2xlarge | 1 | 96 | 8 | 64 | 1.9 x 1 | Up to 5 | 50 |
| g7e.4xlarge | 1 | 96 | 16 | 128 | 1.9 x 1 | 8 | 50 |
| g7e.8xlarge | 1 | 96 | 32 | 256 | 1.9 x 1 | 16 | 100 |
| g7e.12xlarge | 2 | 192 | 48 | 512 | 3.8 x 1 | 25 | 400 |
| g7e.24xlarge | 4 | 384 | 96 | 1024 | 3.8 x 2 | 50 | 800 |
| g7e.48xlarge | 8 | 768 | 192 | 2048 | 3.8 x 4 | 100 | 1600 |

Customer testimonials

Agility Robotics

"Agility Robotics’ mission is to build robot partners that augment the human workforce, ultimately enabling humans to be more human. Agility’s groundbreaking bi-pedal humanoid, Digit, is the first general- purpose, human-centric robot that is made for work. Amazon G7e with NVIDIA RTX PRO 6000 Blackwell GPUs combine the speed and memory to let us train higher performance whole-body controls more quickly for our humanoid robots. These are the best machines out there for doing sim-to-real reinforcement learning for robotics at scale."

Pras Velagapudi, CTO, Agility Robotics

Synopsys

Synopsys is the leader in engineering solutions from silicon to systems, enabling customers to rapidly innovate AI-powered products. They deliver industry-leading silicon design, IP, and simulation and analysis solutions.

“Synopsys' simulation and analysis software is used to predict how products will perform in the real world for a wide range of simulations including structural analysis, fluid dynamics, and electronics. Engineers must balance a multitude of competing design objectives, requiring not only accurate design tools but also shorter simulation times. Amazon EC2 G7e instances featuring NVIDIA RTX PRO 6000 Blackwell Server Edition GPUs promise significantly enhanced performance for Ansys Fluent and Ansys Mechanical simulations compared to previous generation instances. We're excited to recommend Amazon G7e EC2 instances to our customers as a more cost-effective option that delivers superior performance compared to previous GPU generations."

Richard Mitchell, Senior Director of Product Management, Synopsys

ARI

“ARI is building super-intelligent humanoid robots at scale to deliver abundant physical labor to humanity. We empower robots with foundation models to perceive, reason, and act safely in the real world. For training our humanoid robot foundation models, Amazon EC2 G7e instances have been a huge accelerator. The cluster is stable, large-scale training ‘just works,’ and we can iterate on complex models much faster while keeping costs under control. G7e gives us the reliability and performance we need to push the boundaries of embodied AI.”

Xiaolong Wang, Co-Founder, Assured Robot Intelligence (ARI)

Brave

"Brave is an independent browser and search engine, with over 100 million users, offering leading privacy and security protections that make the Web safer and easier to use. Brave Search needs high performance GPU based instances across its independent search stack that includes a large ensemble of embeddings, re-rankers, and LLMs. The massive performance boost we observed from G7e instances accelerated by NVIDIA RTX PRO™ 6000 GPUs compared to the previous generation is a game changer for us. It enables improving the quality of our responses while significantly reducing the latency.”

Rémi Berson, Principal Engineer, Brave Search

Getting started with AI use cases

Amazon SageMaker AI is a fully managed service for building, training, and deploying ML models. With Amazon SageMaker HyperPod, G7e instances can scale to dozens of GPUs to train a model quickly without worrying about setting up and managing resilient training clusters.
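A sketch of what launching a training job on a G7e-backed instance might look like with the SageMaker Python SDK follows; the ml.g7e.12xlarge instance type string, train.py script, IAM role, and S3 path are all placeholders or assumptions, so check the SageMaker documentation for supported instance types and framework versions:

```python
from sagemaker.pytorch import PyTorch

# Sketch of a SageMaker training job on a G7e-backed instance.
estimator = PyTorch(
    entry_point="train.py",          # your training script (hypothetical)
    role="arn:aws:iam::123456789012:role/SageMakerRole",  # placeholder role
    instance_count=1,
    instance_type="ml.g7e.12xlarge", # assumption: verify the supported type
    framework_version="2.4",         # assumption: pick a supported version
    py_version="py311",
)
estimator.fit({"training": "s3://my-bucket/train-data"})  # placeholder S3 path
```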

AWS Deep Learning AMIs (DLAMI) provides ML practitioners and researchers with the infrastructure and tools to accelerate DL in the cloud, at any scale. AWS Deep Learning Containers are Docker images preinstalled with DL frameworks to streamline the deployment of custom ML environments by letting you skip the complicated process of building and optimizing your environments from scratch.
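For example, a DLAMI-based G7e instance could be launched programmatically with boto3; this is a minimal sketch, and the AMI ID, key pair, and Region are placeholders to replace with your own values:

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Sketch: launch a single-GPU G7e instance from a Deep Learning AMI.
# Look up the current DLAMI ID for your Region before running.
response = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",   # placeholder DLAMI ID
    InstanceType="g7e.2xlarge",
    MinCount=1,
    MaxCount=1,
    KeyName="my-key-pair",             # placeholder key pair
)
print(response["Instances"][0]["InstanceId"])
```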

If you prefer to manage your own containerized workloads through container orchestration services, you can deploy G7e instances with Amazon Elastic Kubernetes Service (Amazon EKS) or Amazon Elastic Container Service (Amazon ECS).
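As a sketch of the EKS path, a G7e-backed managed node group could be added to an existing cluster with boto3; the cluster name, subnet, node role, and AMI type below are assumptions to adapt to your environment:

```python
import boto3

eks = boto3.client("eks", region_name="us-east-1")

# Sketch: add a G7e-backed managed node group to an existing EKS cluster.
eks.create_nodegroup(
    clusterName="my-cluster",                        # placeholder cluster
    nodegroupName="g7e-gpu-nodes",
    instanceTypes=["g7e.12xlarge"],
    amiType="AL2023_x86_64_NVIDIA",                  # assumption: GPU EKS AMI type
    scalingConfig={"minSize": 1, "maxSize": 4, "desiredSize": 2},
    subnets=["subnet-0123456789abcdef0"],            # placeholder subnet
    nodeRole="arn:aws:iam::123456789012:role/EKSNodeRole",  # placeholder role
)
```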

You can use various Amazon Machine Images (AMIs) offered by AWS and NVIDIA that come with the NVIDIA drivers installed.
