Overview
dstack is a streamlined alternative to Kubernetes and Slurm, specifically designed for AI. It simplifies container orchestration for AI workloads both in the cloud and on-prem, speeding up the development, training, and deployment of AI models.
dstack is easy to use with any cloud provider as well as with on-prem servers.
dstack supports NVIDIA GPU, AMD GPU, and Google Cloud TPU out of the box.
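To illustrate (a minimal sketch, not taken from the listing itself; the task name and command are placeholders), a dstack task requesting a single NVIDIA GPU could look like:

```yaml
type: task
# Hypothetical name and command, for illustration only
name: smoke-test
commands:
  - nvidia-smi
resources:
  # vendor:count:memory spec, mirroring the syntax used in the release notes
  gpu: nvidia:1:16GB
```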
Highlights
- dstack is a streamlined alternative to Kubernetes and Slurm, designed to simplify the development and deployment of AI.
- It simplifies container orchestration for AI workloads across multiple clouds and on-prem, speeding up the development, training, and deployment of AI models.
- dstack enables AI teams to work with any tools, frameworks, and hardware across multiple cloud platforms and on-premises.
Details
Pricing
- $3,000.00/month
Vendor refund policy
No refund
Legal
Vendor terms and conditions
Content disclaimer
Delivery details
Container image
- Amazon EKS Anywhere
- Amazon ECS
- Amazon EKS
- Amazon ECS Anywhere
Containers are lightweight, portable execution environments that wrap server application software in a filesystem that includes everything it needs to run. Container applications run on supported container runtimes and orchestration services, such as Amazon Elastic Container Service (Amazon ECS) or Amazon Elastic Kubernetes Service (Amazon EKS). Both eliminate the need for you to install and operate your own container orchestration software by managing and scheduling containers on a scalable cluster of virtual machines.
Version release notes
Clusters
Simplified use of MPI
startup_order and stop_criteria
New run configuration properties are introduced:
- startup_order: any/master-first/workers-first specifies the order in which the master and worker jobs are started.
- stop_criteria: all-done/master-done specifies when a multi-node run should be considered finished.
These properties simplify running certain multi-node workloads. For example, MPI requires that the workers are up and running when the master runs mpirun, so you'd use startup_order: workers-first. An MPI workload can be considered done when the master is done, so you'd use stop_criteria: master-done, and dstack won't wait for the workers to exit.
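Put together, an MPI-style multi-node task using both properties might be sketched as follows (the name and command are placeholders; only the two property values come from the notes above):

```yaml
type: task
# Hypothetical name and command, for illustration only
name: mpi-job
nodes: 2
# Workers must be up before the master runs mpirun
startup_order: workers-first
# The run is finished as soon as the master job exits
stop_criteria: master-done
commands:
  - echo "run mpirun on the master; workers just need to be up"
```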
DSTACK_MPI_HOSTFILE
dstack now automatically creates an MPI hostfile and exposes the DSTACK_MPI_HOSTFILE environment variable with the hostfile path. The variable can be passed directly to mpirun: mpirun --hostfile $DSTACK_MPI_HOSTFILE.
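Since mpirun consumes the file directly, the hostfile presumably follows the standard Open MPI format: one host per line with a slots count. A quick sanity check, sketched here with a hand-written stand-in file (the hostnames and slot counts are illustrative; the real path only exists inside a run):

```shell
# Stand-in for the file dstack would generate at $DSTACK_MPI_HOSTFILE
cat > /tmp/hostfile <<'EOF'
10.0.0.1 slots=4
10.0.0.2 slots=4
EOF

# mpirun would consume it as: mpirun --hostfile $DSTACK_MPI_HOSTFILE ...
# Sum the declared slots to confirm the expected total process count:
awk '{for (i = 1; i <= NF; i++) if ($i ~ /^slots=/) { split($i, a, "="); total += a[2] }} END { print total }' /tmp/hostfile
# prints 8
```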
CLI
We've also updated how the CLI displays run and job statuses. Previously, the CLI displayed an internal status code that was hard to interpret. Now, the STATUS column in dstack ps and dstack apply shows a status that makes it easy to understand why a run or job was terminated.
$ dstack ps -n 10
 NAME               BACKEND             RESOURCES                            PRICE    STATUS        SUBMITTED
 oom-task                                                                            no offers     yesterday
 oom-task           nebius (eu-north1)  cpu=2 mem=8GB disk=100GB             $0.0496  exited (127)  yesterday
 oom-task           nebius (eu-north1)  cpu=2 mem=8GB disk=100GB             $0.0496  exited (127)  yesterday
 heavy-wolverine-1                                                                   done          yesterday
   replica=0 job=0  aws (us-east-1)     cpu=4 mem=16GB disk=100GB T4:16GB:1  $0.526   exited (0)    yesterday
   replica=0 job=1  aws (us-east-1)     cpu=4 mem=16GB disk=100GB T4:16GB:1  $0.526   exited (0)    yesterday
 cursor             nebius (eu-north1)  cpu=2 mem=8GB disk=100GB             $0.0496  stopped       yesterday
 cursor             nebius (eu-north1)  cpu=2 mem=8GB disk=100GB             $0.0496  error         yesterday
 cursor             nebius (eu-north1)  cpu=2 mem=8GB disk=100GB             $0.0496  interrupted   yesterday
 cursor             nebius (eu-north1)  cpu=2 mem=8GB disk=100GB             $0.0496  aborted       yesterday

Examples
Simplified NCCL tests
With these improvements, running MPI workloads with dstack became much easier. This includes NCCL tests, which can now be run using the following configuration:
type: task
name: nccl-tests
nodes: 2
startup_order: workers-first
stop_criteria: master-done
image: dstackai/efa
env:
  - NCCL_DEBUG=INFO
commands:
  - cd /root/nccl-tests/build
  - |
    if [ ${DSTACK_NODE_RANK} -eq 0 ]; then
      mpirun \
        --allow-run-as-root \
        --hostfile $DSTACK_MPI_HOSTFILE \
        -n ${DSTACK_GPUS_NUM} \
        -N ${DSTACK_GPUS_PER_NODE} \
        --mca btl_tcp_if_exclude lo,docker0 \
        --bind-to none \
        ./all_reduce_perf -b 8 -e 8G -f 2 -g 1
    else
      sleep infinity
    fi
resources:
  gpu: nvidia:4:16GB
  shm_size: 16GB

See the updated NCCL tests example for more details.
Distributed training
TRL
The new TRL example