
NVIDIA DGX Cloud on AWS

Accelerating generative and agentic AI

Scale AI model development

In the rapidly evolving AI landscape, organizations are looking to accelerate the deployment of generative and agentic AI solutions to unlock business value faster. Since 2010, AWS and NVIDIA have collaborated to deliver comprehensive AI infrastructure, software, and services. To further help organizations reach their AI goals, NVIDIA DGX Cloud on AWS offers a fully managed, high-performance AI training platform that provides flexible, short-term access to large-scale GPU clusters. Available through AWS Marketplace Private Offers with simplified procurement, the platform is designed to streamline and scale the development of advanced AI models, making it ideal for both established enterprises and startups seeking faster time to value.

Benefits

DGX Cloud on AWS offers:

Enterprise-grade performance through direct access to NVIDIA's most advanced GPU clusters, state-of-the-art training and orchestration software, and AI expertise, all delivered as a managed service. The platform is optimized for large-scale multi-node training, offering contiguous clusters, low latency, and high GPU utilization through built-in job scheduling and workload management. Built on NVIDIA GPU-accelerated Amazon Elastic Compute Cloud (Amazon EC2) instances powered by the AWS Nitro System, the platform maintains continuous operation through live updates and intelligent hardware monitoring, delivering 99.99% infrastructure uptime. The NVIDIA enterprise-grade software stack and NVIDIA AI Enterprise are included with DGX Cloud on AWS.

Built on the latest NVIDIA GPU architectures, Blackwell and Hopper, DGX Cloud on AWS accelerates training for large language models (LLMs) and generative AI workloads. Benefit from faster model training, reduced time-to-solution, and higher productivity from day one. Amazon EC2 instances, accelerated by NVIDIA Grace Blackwell Superchips and the NVIDIA-optimized software stack, deliver unprecedented AI training and inference performance.

Security is paramount: AWS provides comprehensive security features, including encrypted networking and secure data storage. The AWS Nitro System provides hardware-based security isolation and protection for data and model weights.

The platform integrates with the AWS generative AI stack, enabling organizations to build sophisticated AI systems, from chatbots and code generators to autonomous AI agents. Customers can deploy their trained models on Amazon Bedrock, Amazon SageMaker AI, or Amazon Elastic Kubernetes Service (Amazon EKS), while using NVIDIA NIM microservices for rapid deployment. As the world's most comprehensive and broadly adopted cloud, AWS offers large-capacity NVIDIA GPU-powered AI accelerators, enabling customers to run their most demanding AI workloads at scale.

Features

Highly portable

Train on DGX Cloud and take your artificial intelligence and machine learning pipelines to any service in the AWS environment. AWS customers can use their committed cloud spend agreements to purchase DGX Cloud and bring their models into Amazon Bedrock, Amazon SageMaker AI, or Amazon Elastic Kubernetes Service (Amazon EKS) for inference.

Increased productivity

Use a fully managed service to maximize GPU utilization and increase return on investment. NVIDIA delivers ready-to-go clusters. Customers on average see 86-100 percent GPU utilization.

Faster training

Train faster with enterprise-grade NVIDIA software and expertise. DGX Cloud comes preconfigured with accelerated libraries, GPU operators, and network operators that shorten time to train.

Full stack

DGX Cloud provides more than just compute. It comes with an entire stack of optimized networking, storage, high-performance compute, cloud-native Kubernetes, and enterprise-grade software and support. DGX Cloud combines the best of NVIDIA AI and brings it to AWS.

NVIDIA experts

Each customer has access to NVIDIA experts with a designated technical account manager (TAM). DGX Cloud also comes with 24/7 business-critical support.

Drive AI Innovation

DGX Cloud on AWS represents a significant leap forward in democratizing access to high-performance AI infrastructure. By combining NVIDIA's GPU expertise with AWS's scalable cloud services, organizations can accelerate their time-to-train, reduce operational complexity, and unlock new business opportunities. The platform's performance, security, and flexibility position it as a foundational element for those seeking to stay at the forefront of AI innovation.
