Overview
Cerebras Inference Cloud on AWS Marketplace brings lightning-fast, on-demand performance to the latest open-source LLMs, including Llama, Qwen, DeepSeek, Mistral, and more.
Built for real-time interactivity, multi-step reasoning, and complex agentic workflows, Cerebras delivers the speed, scale, and simplicity needed to go from API key to production in under 30 seconds.
Powered by the worlds fastest AI accelerator, the Wafer-Scale Engine (WSE), and the CS-3 system, the Cerebras Inference Cloud offers ultra-low latency and high-throughput inferencing via a drop-in, OpenAI-compatible API.
For custom pricing or other questions please contact partners@cerebras.netÂ
Highlights
- Up to 70X faster than GPUs: With throughput exceeding 2,500 tokens per second, Cerebras eliminates lag, delivering near-instant responses, even from large models.
- Full reasoning in under 1 second: No more multi-step delays, Cerebras executes full reasoning chains and delivers final answers in real time.
- Instant API access to top open-source models: Skip GPU setup and launch models like Llama, Qwen, DeepSeek, and Mistral in seconds, just bring your prompt.
Details
Unlock automation with AI agent solutions

Features and programs
Financing for AWS Marketplace purchases
Pricing
Dimension | Cost/month |
---|---|
Developer Tier | $0.001 |
Vendor refund policy
Payment obligations are non-cancelable once incurred, and Fees paid are non-refundable.
How can we make this page better?
Legal
Vendor terms and conditions
Content disclaimer
Delivery details
Software as a Service (SaaS)
SaaS delivers cloud-based software applications directly to customers over the internet. You can access these applications through a subscription model. You will pay recurring monthly usage fees through your AWS bill, while AWS handles deployment and infrastructure management, ensuring scalability, reliability, and seamless integration with other AWS services.
Resources
Vendor resources
Support
Vendor support
Learn more about our supported models, rate limits, pricing, and more in our documentation (docs.cerebras.ai/cloud). For 24x7 technical support, contact support@cerebras.net or +1 (650) 933-4980.
AWS infrastructure support
AWS Support is a one-on-one, fast-response support channel that is staffed 24x7x365 with experienced and technical support engineers. The service helps customers of all sizes and technical abilities to successfully utilize the products and features provided by Amazon Web Services.