Overview
Friendli Container delivers FriendliAI's high-performance inference engine as a portable, production-ready Docker container that runs on your own infrastructure, whether in Amazon EKS, a private cloud, or an on-premises environment. It enables enterprises to securely serve generative AI models and custom fine-tunes with up to 3x faster output and 50-90% lower GPU usage compared to open-source stacks. Designed for organizations with strict data security, compliance, and performance requirements, Friendli Container supports full VPC isolation. It integrates natively with Prometheus and Grafana for real-time observability, allowing teams to track throughput, latency, time to first token (TTFT), and caching efficiency at production scale. Friendli's patented technologies, such as Continuous Batching™, Friendli TCache, and optimized quantization, enable extremely low-latency inference without sacrificing model quality.
Typical use cases include deploying multimodal generative AI models behind the firewall to meet regulatory or privacy constraints, running RAG-enhanced enterprise search systems, serving fine-tuned models for customer service or agent tasks, and powering AI APIs that demand sub-second response times under fluctuating load.
Friendli Container comes with a high-level Kubernetes CRD (Custom Resource Definition), so launching an inference endpoint is as easy as launching a Deployment in Kubernetes. If you need fast, flexible, and secure generative AI inference across text, image, and code without over-provisioning GPUs or stitching together infrastructure, Friendli Container is your drop-in solution for scalable, cost-effective AI deployment.
Highlights
- Blazing-Fast Output Speeds: Fastest token generation among GPU-based providers, powered by patented continuous batching and speculative decoding
- 50-90% GPU Reduction: Serve the same workload with a fraction of the GPU footprint, cutting cost and idle resource waste
- Integrated Observability: Friendli Container exposes real-time metrics in the Prometheus text format, providing insight into latency, throughput, cache hit ratios, and more
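Because the metrics are exposed in the Prometheus text format, they can be inspected with a few lines of code as well as through Prometheus and Grafana. The following is a minimal sketch of parsing such a payload and deriving a cache hit ratio. The metric names (`example_requests_total`, `example_cache_hits_total`) and the sample payload are hypothetical placeholders, not the actual names exported by Friendli Container; consult the product documentation for the real metric names.

```python
import re
from typing import Dict, List, Tuple

# Sample payload in the Prometheus text exposition format.
# ASSUMPTION: these metric names are illustrative only; they are NOT
# the names Friendli Container actually exports.
SAMPLE = """\
# HELP example_requests_total Requests served (hypothetical metric).
# TYPE example_requests_total counter
example_requests_total{model="demo"} 1200
# HELP example_cache_hits_total Cache hits (hypothetical metric).
# TYPE example_cache_hits_total counter
example_cache_hits_total{model="demo"} 300
"""

# One sample line: metric name, optional {labels}, then the value.
SAMPLE_LINE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
# One label pair inside the braces: key="value".
LABEL = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')

Sample = Tuple[str, Dict[str, str], float]


def parse_metrics(text: str) -> List[Sample]:
    """Parse sample lines, skipping blank lines and # HELP / # TYPE comments."""
    samples: List[Sample] = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = SAMPLE_LINE.match(line)
        if match is None:
            continue  # ignore lines this simplified parser does not understand
        name, raw_labels, value = match.groups()
        labels = dict(LABEL.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


def cache_hit_ratio(samples: List[Sample], hits: str, total: str) -> float:
    """Divide the summed `hits` metric by the summed `total` metric."""
    hit_sum = sum(v for n, _, v in samples if n == hits)
    total_sum = sum(v for n, _, v in samples if n == total)
    return hit_sum / total_sum if total_sum else 0.0


ratio = cache_hit_ratio(
    parse_metrics(SAMPLE), "example_cache_hits_total", "example_requests_total"
)
```

This parser is deliberately simplified (for example, it ignores optional timestamps and does not unescape label values). In production, a Prometheus server scraping the container's metrics endpoint would normally take its place.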
Pricing
| Dimension | Description | Cost per month (USD) |
|---|---|---|
| A100 | License to run Friendli Container on one NVIDIA A100 GPU. | $730.00 |
| H100 | License to run Friendli Container on one NVIDIA H100 GPU. | $730.00 |
| A10G | License to run Friendli Container on one NVIDIA A10G GPU. | $100.00 |
| L4 | License to run Friendli Container on one NVIDIA L4 GPU. | $100.00 |
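Since the license is priced per GPU per month, the monthly license cost of a cluster is the sum of GPU count times unit price across GPU types. The sketch below works through that arithmetic using the prices from the table above. The function name `monthly_license_cost` and the example fleet are illustrative assumptions, and the result covers only the license dimensions listed here, not any other charges.

```python
from decimal import Decimal
from typing import Dict

# Per-GPU monthly license prices in USD, copied from the pricing table.
PRICE_PER_GPU: Dict[str, Decimal] = {
    "A100": Decimal("730.00"),
    "H100": Decimal("730.00"),
    "A10G": Decimal("100.00"),
    "L4": Decimal("100.00"),
}


def monthly_license_cost(gpu_counts: Dict[str, int]) -> Decimal:
    """Sum count * unit price over each GPU type in the fleet.

    `gpu_counts` maps a GPU dimension (e.g. "H100") to how many of
    those GPUs run Friendli Container. This helper is hypothetical,
    written only to illustrate the per-GPU pricing arithmetic.
    """
    total = Decimal("0")
    for gpu, count in gpu_counts.items():
        if gpu not in PRICE_PER_GPU:
            raise ValueError(f"no listed price for GPU type: {gpu}")
        if count < 0:
            raise ValueError(f"GPU count must be non-negative: {gpu}")
        total += PRICE_PER_GPU[gpu] * count
    return total


# Example fleet (made up for illustration): 8 H100s and 4 L4s.
# 8 * 730.00 + 4 * 100.00 = 6240.00 USD per month.
example_cost = monthly_license_cost({"H100": 8, "L4": 4})
```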
Vendor refund policy
Refunds are not currently supported.
Delivery details
Friendli Container v0.2.0 as an Amazon EKS add-on
EKS add-on
An add-on is software that provides supporting operational capabilities to Kubernetes applications but isn't specific to the application. This includes software like observability agents or Kubernetes drivers that allow the cluster to interact with underlying AWS resources for networking, compute, and storage. Add-on software is typically built and maintained by the Kubernetes community, cloud providers like AWS, or third-party vendors. Amazon EKS add-ons provide installation and management of a curated set of add-ons for Amazon EKS clusters. All Amazon EKS add-ons include the latest security patches and bug fixes, and are validated by AWS to work with Amazon EKS. Amazon EKS add-ons allow you to consistently ensure that your Amazon EKS clusters are secure and stable and reduce the amount of work that you need to do to install, configure, and update add-ons.
Version release notes
The CRDs were updated to give more control over deployments while simplifying the initial setup process.
- FriendliConfig is now an optional resource when creating a FriendliDeployment.
- FriendliConfig: added field InferenceServicePort.
- FriendliDeployment: added fields ServiceAccountName and DeploymentStrategy.
Additional details
Usage instructions
Detailed instructions for configuring and using the add-on are available at: https://github.com/friendliai/examples/tree/main/aws/eks-addon
Support
AWS infrastructure support
AWS Support is a one-on-one, fast-response support channel that is staffed 24x7x365 with experienced and technical support engineers. The service helps customers of all sizes and technical abilities to successfully utilize the products and features provided by Amazon Web Services.