Overview
dstack is a streamlined alternative to Kubernetes and Slurm, specifically designed for AI. It simplifies container orchestration for AI workloads both in the cloud and on-prem, speeding up the development, training, and deployment of AI models.
dstack is easy to use with any cloud provider as well as with on-prem servers.
dstack supports NVIDIA GPU, AMD GPU, and Google Cloud TPU out of the box.
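To illustrate (a minimal sketch, not taken from the listing itself; the task name and command are placeholders), a dstack task requesting a single NVIDIA GPU could look like:

```yaml
type: task
# Hypothetical name and command, for illustration only
name: smoke-test
commands:
  - nvidia-smi
resources:
  # vendor:count:memory spec, mirroring the syntax used in the release notes
  gpu: nvidia:1:16GB
```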
Highlights
- dstack is a streamlined alternative to Kubernetes and Slurm, designed to simplify the development and deployment of AI.
- It simplifies container orchestration for AI workloads across multiple clouds and on-prem, speeding up the development, training, and deployment of AI models.
- dstack enables AI teams to work with any tools, frameworks, and hardware across multiple cloud platforms and on-premises.
Details
Pricing
- $3,000.00/month
Vendor refund policy
No refund
Legal
Vendor terms and conditions
Content disclaimer
Delivery details
Container image
- Amazon EKS Anywhere
- Amazon ECS
- Amazon EKS
- Amazon ECS Anywhere
Containers are lightweight, portable execution environments that wrap server application software in a filesystem that includes everything it needs to run. Container applications run on supported container runtimes and orchestration services, such as Amazon Elastic Container Service (Amazon ECS) or Amazon Elastic Kubernetes Service (Amazon EKS). Both eliminate the need for you to install and operate your own container orchestration software by managing and scheduling containers on a scalable cluster of virtual machines.
Version release notes
Clusters
Simplified use of MPI
startup_order and stop_criteria
New run configuration properties are introduced:
- startup_order: any/master-first/workers-first specifies the order in which the master and worker jobs are started.
- stop_criteria: all-done/master-done specifies when a multi-node run should be considered finished.
These properties simplify running certain multi-node workloads. For example, MPI requires that the workers are up and running when the master runs mpirun, so you'd use startup_order: workers-first. An MPI workload can be considered done when the master is done, so you'd use stop_criteria: master-done, and dstack won't wait for the workers to exit.
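Put together, an MPI-style multi-node task using both properties might be sketched as follows (the name and command are placeholders; only the two property values come from the notes above):

```yaml
type: task
# Hypothetical name and command, for illustration only
name: mpi-job
nodes: 2
# Workers must be up before the master runs mpirun
startup_order: workers-first
# The run is finished as soon as the master job exits
stop_criteria: master-done
commands:
  - echo "run mpirun on the master; workers just need to be up"
```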
DSTACK_MPI_HOSTFILE
dstack now automatically creates an MPI hostfile and exposes the DSTACK_MPI_HOSTFILE environment variable with the hostfile path. The variable can be passed directly to mpirun: mpirun --hostfile $DSTACK_MPI_HOSTFILE.
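Since mpirun consumes the file directly, the hostfile presumably follows the standard Open MPI format: one host per line with a slots count. A quick sanity check, sketched here with a hand-written stand-in file (the hostnames and slot counts are illustrative; the real path only exists inside a run):

```shell
# Stand-in for the file dstack would generate at $DSTACK_MPI_HOSTFILE
cat > /tmp/hostfile <<'EOF'
10.0.0.1 slots=4
10.0.0.2 slots=4
EOF

# mpirun would consume it as: mpirun --hostfile $DSTACK_MPI_HOSTFILE ...
# Sum the declared slots to confirm the expected total process count:
awk '{for (i = 1; i <= NF; i++) if ($i ~ /^slots=/) { split($i, a, "="); total += a[2] }} END { print total }' /tmp/hostfile
# prints 8
```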
CLI
We've also updated how the CLI displays run and job statuses. Previously, the CLI displayed an internal status code that was hard to interpret. Now, the STATUS column in dstack ps and dstack apply shows a status that makes it easy to understand why a run or job was terminated.
$ dstack ps -n 10
 NAME               BACKEND             RESOURCES                            PRICE    STATUS        SUBMITTED
 oom-task                                                                            no offers     yesterday
 oom-task           nebius (eu-north1)  cpu=2 mem=8GB disk=100GB             $0.0496  exited (127)  yesterday
 oom-task           nebius (eu-north1)  cpu=2 mem=8GB disk=100GB             $0.0496  exited (127)  yesterday
 heavy-wolverine-1                                                                   done          yesterday
   replica=0 job=0  aws (us-east-1)     cpu=4 mem=16GB disk=100GB T4:16GB:1  $0.526   exited (0)    yesterday
   replica=0 job=1  aws (us-east-1)     cpu=4 mem=16GB disk=100GB T4:16GB:1  $0.526   exited (0)    yesterday
 cursor             nebius (eu-north1)  cpu=2 mem=8GB disk=100GB             $0.0496  stopped       yesterday
 cursor             nebius (eu-north1)  cpu=2 mem=8GB disk=100GB             $0.0496  error         yesterday
 cursor             nebius (eu-north1)  cpu=2 mem=8GB disk=100GB             $0.0496  interrupted   yesterday
 cursor             nebius (eu-north1)  cpu=2 mem=8GB disk=100GB             $0.0496  aborted       yesterday

Examples
Simplified NCCL tests
With these improvements, running MPI workloads with dstack became much easier. This includes NCCL tests, which can now be run using the following configuration:
type: task
name: nccl-tests
nodes: 2
startup_order: workers-first
stop_criteria: master-done
image: dstackai/efa
env:
  - NCCL_DEBUG=INFO
commands:
  - cd /root/nccl-tests/build
  - |
    if [ ${DSTACK_NODE_RANK} -eq 0 ]; then
      mpirun \
        --allow-run-as-root \
        --hostfile $DSTACK_MPI_HOSTFILE \
        -n ${DSTACK_GPUS_NUM} \
        -N ${DSTACK_GPUS_PER_NODE} \
        --mca btl_tcp_if_exclude lo,docker0 \
        --bind-to none \
        ./all_reduce_perf -b 8 -e 8G -f 2 -g 1
    else
      sleep infinity
    fi
resources:
  gpu: nvidia:4:16GB
  shm_size: 16GB

See the updated NCCL tests example for more details.
Distributed training
TRL
The new TRL example