Overview
This service offers a hosted version of the DeepSeek-R1-Distill-Qwen-1.5B model (https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B), which operates within your private cloud. After you subscribe to the listing, a CloudFormation deployment launches in your AWS account, setting up an EKS cluster that runs an inference service for the DeepSeek-R1-Distill-Qwen-1.5B model. Once installation is complete, an API endpoint is made available for querying the service.
The DeepSeek-R1-Distill-Qwen-1.5B model is fine-tuned from the open-source Qwen model using samples generated by DeepSeek-R1. The DeepSeek team showed that the reasoning patterns discovered through reinforcement learning in the 671B-parameter DeepSeek-R1 model can be distilled into small dense models with little loss in capability. This 1.5B checkpoint is the smallest of those distillations.
DeepSeek-R1-Distill-Qwen-1.5B punches well above its weight in math- and code-heavy reasoning while still fitting on a single laptop GPU (~4 GB in 8-bit). Use it whenever you need solid chain-of-thought performance under tight VRAM and latency budgets.
Architecture: 1.78B-parameter decoder-only Transformer (Qwen2.5-Math-1.5B base) distilled from the 671B-parameter DeepSeek-R1 reasoning model
Context length: 32,768 tokens (inherits Qwen2.5 long-context support)
Highlights
- Privately hosted version of the Qwen-based DeepSeek-R1 1.5B distilled model running securely in your cloud. Never worry about data leaving your cloud.
- Performance of DeepSeek-R1-Distill-Qwen-1.5B on key benchmarks:
  - AIME 2024 pass@1: 28.9
  - AIME 2024 cons@64: 52.7
  - MATH-500 pass@1: 83.9
  - GPQA Diamond pass@1: 33.8
  - LiveCodeBench pass@1: 16.9
  - CodeForces rating: 954
- DeepSeek-R1-Distill-Qwen-1.5B is a rare mix of tiny footprint and serious analytical power. When your problem looks more like an Olympiad question or a LeetCode hard than a casual conversation, and you only have laptop-grade hardware, this is the model to load.
Details

Pricing
Free trial
Dimension (EC2 instance type) | Cost/hour |
---|---|
g5.8xlarge | $0.10 |
g5.2xlarge | $0.10 |
p5e.48xlarge | $0.10 |
g5.4xlarge | $0.10 |
p5.48xlarge | $0.10 |
p4d.24xlarge | $0.10 |
g5.16xlarge | $0.10 |
g5.24xlarge | $0.10 |
g5.12xlarge | $0.10 |
p5en.48xlarge | $0.10 |
Vendor refund policy
Contact support@aumlabs.ai.
Legal
Vendor terms and conditions
Content disclaimer
Delivery details
Install DeepSeek as a Service stack via the CloudFormation template
Launch a production-ready DeepSeek-as-a-Service in minutes with this turnkey CloudFormation template. It automatically provisions an EKS cluster (with optional VPC creation), GPU-powered node group, secure ACM-validated domain, and a fully-configured Helm deployment of the DeepSeek model—no manual Kubernetes tinkering required. Simply deploy the stack and start serving blazing-fast, scalable generative-AI endpoints from your own AWS account.
Key capabilities
- One-click deployment – spin up the entire stack in ~15 minutes without touching kubectl, Helm charts, or ACM.
- GPU-optimized – node group is pre-sized for latency-critical inference and ships with NVIDIA’s device plugin.
- Automatic HTTPS – a Lambda workflow requests an ACM certificate, validates DNS, and wires TLS to the load balancer; the issued customer domain is stored in SSM and pushed to your SaaS account via SNS.
- Bring-your-own or new VPC – the template can search for a compatible private-subnet pair with NAT gateways, or build an isolated /16 VPC complete with public & private subnets, IGWs, NAT gateways, and route tables.
- Self-cleaning bootstrap – a short-lived EC2 builder instance handles cluster configuration, Helm install, and signals CloudFormation, then terminates itself to avoid idle costs.
- Full control – after launch you have full control of the cluster just like any EKS environment: scale nodes, roll images, or extend with additional micro-services.
What the template creates
- (Optional) New IPv4 /16 VPC with two AZ-balanced public & private subnets, route tables, IGW, and redundant NAT gateways.
- Amazon EKS cluster (v1.32) with dedicated control-plane security group and API server endpoints opened for HTTPS.
- GPU node group (A10, A100 or H100 GPUs, AL2_x86_64_GPU AMI, 100 GB EBS).
- Amazon ACM certificate for a unique sub-domain (e.g., <your-unique-identifier>.aws.aumlabs.ai) and an NLB configured with HTTP → HTTPS redirect and TLS offload.
- Helm-based deployment (llm-inference namespace) of the DeepSeek container image (a quick verification sketch follows this list).
- AWS SSM Parameter /eks/<ClusterName>/customer-domain containing the final service URL, for easy CI/CD integration.
- Helper Lambdas and IAM roles to automate VPC discovery, certificate issuance, secret propagation, and SNS notifications.
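To sanity-check these resources once the stack finishes, a minimal verification from your workstation might look like the sketch below. It assumes the AWS CLI and kubectl are installed; the region is a placeholder and the cluster name is the ClusterName parameter (default llm-inference).

```bash
# Point kubectl at the newly created cluster (region/name are placeholders).
aws eks update-kubeconfig --region <region> --name llm-inference

# The GPU node registered by the managed node group should report Ready.
kubectl get nodes

# The DeepSeek inference pods deployed by the Helm chart live in llm-inference.
kubectl get pods -n llm-inference
```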
CloudFormation Template (CFT)
AWS CloudFormation templates are JSON or YAML-formatted text files that simplify provisioning and management on AWS. The templates describe the service or application architecture you want to deploy, and AWS CloudFormation uses those templates to provision and configure the required services (such as Amazon EC2 instances or Amazon RDS DB instances). The deployed application and associated resources are called a "stack."
Version release notes
DeepSeek R1 Distill Qwen 1.5B as a service:
Privately hosted version of DeepSeek R1 Distill Qwen 1.5B (https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B) running in your cloud! Explore the full potential of DeepSeek R1 models without ever worrying about data leaving your cloud!
Additional details
Usage instructions
Follow the steps below to launch, configure, and run the DeepSeek LLM-Inference stack in your own AWS account.
- Prerequisites
- AWS account with permissions to create VPC, EKS, EC2 (g5/p4/p5), ACM, IAM, SSM, and SNS resources.
- Sufficient g5/p4/p5 GPU quota in two AZs of your chosen region. You need a quota of at least 8 vCPUs to launch the stack on the cheapest available A10 GPU instance (g5.xlarge). You can request a quota increase at https://us-east-1.console.aws.amazon.com/servicequotas/home/services/ec2/quotas/L-DB2E81BA (a CLI check is sketched after this list).
- (Optional) Existing VPC with >= 2 private subnets that have NAT-gateway egress if you plan to reuse your own network. Otherwise, the stack will automatically create a new VPC and subnets in your AWS account.
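If you are unsure whether your account already has enough G-instance vCPU quota, a quick CLI check might look like the sketch below; it uses the quota code from the Service Quotas link above, and the region is whichever one you plan to deploy in.

```bash
# Current applied quota for the EC2 quota code linked above; a value of 8 or
# more vCPUs is enough for a single g5.xlarge node.
aws service-quotas get-service-quota \
  --service-code ec2 \
  --quota-code L-DB2E81BA \
  --region us-east-1 \
  --query 'Quota.Value'
```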
- Subscribe & launch
- Click Continue to Subscribe on the AWS Marketplace page and accept the terms.
- Choose Continue to Configuration -> Continue to Launch.
- Select the CloudFormation delivery option and choose the region in which you want to deploy.
- Press Launch to open the stack in CloudFormation, then choose Next to reach the parameter wizard.
- (Optional) Configure stack parameters
- ClusterName - Friendly name for the EKS cluster (default llm-inference).
- CreateVPC - true to build a new /16 VPC, or false to reuse an existing, automatically discovered one (default: true). Leave the parameters at their defaults unless you have a specific reason to change them. Click Next, add optional tags, then choose Create stack (an equivalent CLI launch is sketched below).
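The Marketplace flow launches the template through the console, but if you prefer scripting, an equivalent CLI launch might look like the sketch below. The stack name and local template path are placeholders; the parameter names mirror the wizard above, and IAM capabilities are acknowledged because the template creates IAM roles.

```bash
# Hypothetical stack name and template file; parameters mirror the console wizard.
aws cloudformation create-stack \
  --stack-name deepseek-llm-inference \
  --template-body file://deepseek-as-a-service.yaml \
  --capabilities CAPABILITY_NAMED_IAM \
  --parameters \
      ParameterKey=ClusterName,ParameterValue=llm-inference \
      ParameterKey=CreateVPC,ParameterValue=true
```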
- Monitor deployment (~15-20 min)
- In the CloudFormation console, watch the Events tab; status will progress from CREATE_IN_PROGRESS to CREATE_COMPLETE (a CLI equivalent is sketched after this list).
- First, CloudFormation creates an EKS cluster.
- Next, CloudFormation starts an EC2 builder instance to configure the EKS cluster. The instance installs kubectl, Helm, Git, and the NVIDIA device plugin, and deploys the Helm chart onto the cluster. During this phase, a GPU node joins the cluster. Once the setup is finished, the EC2 instance signals success and self-terminates.
- Finally, a validated ACM certificate is issued, allowing the load balancer on the EKS cluster to be accessed via a friendly URL.
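If you would rather watch from a terminal than the console, a rough CLI equivalent is sketched below (stack name as in the earlier launch sketch).

```bash
# Show the ten most recent stack events, newest first.
aws cloudformation describe-stack-events \
  --stack-name deepseek-llm-inference \
  --query 'StackEvents[0:10].[ResourceStatus,LogicalResourceId]' \
  --output table

# Or simply block until the stack reaches CREATE_COMPLETE (or fails).
aws cloudformation wait stack-create-complete \
  --stack-name deepseek-llm-inference
```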
- Locate your endpoint when the stack deployment is complete:
- Open the Outputs tab; copy the value of CustomerDomain (e.g., https://<your-unique-identifier>.aws.aumlabs.ai/docs).
- The same URL is stored in SSM Parameter Store at /eks/<ClusterName>/customer-domain (both locations can be read from the CLI, as sketched below).
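A small sketch of reading either location from the CLI, assuming the default stack and cluster names used above:

```bash
# Read the service URL from the stack outputs...
aws cloudformation describe-stacks \
  --stack-name deepseek-llm-inference \
  --query 'Stacks[0].Outputs[?OutputKey==`CustomerDomain`].OutputValue' \
  --output text

# ...or from SSM Parameter Store (ClusterName defaults to llm-inference).
aws ssm get-parameter \
  --name /eks/llm-inference/customer-domain \
  --query 'Parameter.Value' \
  --output text
```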
- Test the model: Open https://<your-unique-identifier>.aws.aumlabs.ai/docs to access the auto-generated Swagger/OpenAPI UI, or invoke the endpoint directly:
The API endpoint also supports batch calls: you can send a list of prompts in a single request.
    curl -X POST https://<your-unique-hash>.aws.aumlabs.ai/generate \
      -H "Content-Type: application/json" \
      -d '{ "prompts": ["Solve for x: 4x-9= 15 <think>\n", "Find b, where 8^2 + b^2 = 17^2 <think>\n"] }'

Expect a JSON response containing the model's replies in the following format:

    { "responses": [ "<Reply to Query 1>", "<Reply to Query 2>" ] }
- Operate & scale: After launch you can manage the cluster just like any EKS environment: scale nodes or extend it with additional micro-services if needed (a scaling sketch follows below). The service is production-ready out of the box, so you don't need to make changes unless absolutely required.
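For example, resizing the GPU node group uses the standard EKS CLI calls; the sketch below assumes the default cluster name, and the node-group name is whatever the stack created, so list it first.

```bash
# Find the node group the stack created, then adjust its size.
aws eks list-nodegroups --cluster-name llm-inference
aws eks update-nodegroup-config \
  --cluster-name llm-inference \
  --nodegroup-name <gpu-nodegroup-name> \
  --scaling-config minSize=1,maxSize=3,desiredSize=2
```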
- Cleanup: Delete the CloudFormation stack to remove all resources: EKS, GPU instances, load balancer, VPC (if created), ACM certificate, IAM roles, and SSM parameters. There is no leftover cost once the stack is gone (a CLI equivalent is sketched below).
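The same cleanup can be done from the CLI, using the stack name from the earlier sketches:

```bash
# Tear everything down and wait for the deletion to finish.
aws cloudformation delete-stack --stack-name deepseek-llm-inference
aws cloudformation wait stack-delete-complete --stack-name deepseek-llm-inference
```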
This product does not collect or export customer data to external systems outside of your AWS account.
Need help or have feature requests? Use the Support tab on the AWS Marketplace listing or email support@aumlabs.ai.
Support
Vendor support
AWS infrastructure support
AWS Support is a one-on-one, fast-response support channel that is staffed 24x7x365 with experienced and technical support engineers. The service helps customers of all sizes and technical abilities to successfully utilize the products and features provided by Amazon Web Services.