Containers

Beyond metrics: Extracting actionable insights from Amazon EKS with Amazon Q Business

In this post, we demonstrate a solution that uses Amazon Data Firehose to aggregate logs from the Amazon EKS control plane and data plane and send them to Amazon Simple Storage Service (Amazon S3). We then use Amazon Q Business and its Amazon S3 connector to synchronize and index the log data in Amazon S3, enabling a chat experience powered by the generative AI capabilities of Amazon Q Business.

Monitor Amazon ECS Events with Amazon EventBridge Filtering

In this post, we demonstrate how to capture specific Amazon ECS events using EventBridge rules for enhanced monitoring and troubleshooting of your containerized applications. We show you how to customize EventBridge filtering patterns to capture the specific Amazon ECS events that matter for your troubleshooting and monitoring needs.
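As a rough illustration of what such a filtering pattern can look like, the sketch below builds an event pattern that selects ECS task state changes where the task has stopped. It is not taken from the post itself: the cluster ARN is a placeholder, and the `matches` helper is a hypothetical, simplified local check that handles exact-value matching only, not the full EventBridge matching rules.

```python
import json

# Event pattern for stopped ECS tasks. The cluster ARN is a placeholder
# (assumption), not a value from the original post.
pattern = {
    "source": ["aws.ecs"],
    "detail-type": ["ECS Task State Change"],
    "detail": {
        "lastStatus": ["STOPPED"],
        "clusterArn": ["arn:aws:ecs:us-east-1:111122223333:cluster/example"],
    },
}


def matches(pattern, event):
    """Hypothetical local check: exact-value matching only.

    A pattern leaf is a list of allowed values; a nested dict is
    matched recursively against the same key in the event.
    """
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual not in expected:
            return False
    return True


# Serialized form, as an EventBridge rule expects the pattern as a JSON string.
pattern_json = json.dumps(pattern)

sample_event = {
    "source": "aws.ecs",
    "detail-type": "ECS Task State Change",
    "detail": {
        "lastStatus": "STOPPED",
        "clusterArn": "arn:aws:ecs:us-east-1:111122223333:cluster/example",
        "stoppedReason": "Essential container in task exited",
    },
}
```

Narrowing the `detail` block (for example, by cluster or status) is what keeps a rule from firing on every ECS event; the JSON string in `pattern_json` is the form that would be supplied as the rule's event pattern.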

Streamline your containerized CI/CD with GitLab Runners and Amazon EKS Auto Mode

In this post, we demonstrate how running GitLab Runners on EKS Auto Mode, combined with Amazon Elastic Compute Cloud (Amazon EC2) Spot Instances, can deliver enterprise-scale CI/CD capabilities while reducing costs by up to 90% compared with traditional deployment models. This approach not only optimizes operational expenses but also provides resilient, scalable pipeline execution.

Amazon EKS introduces enhanced network policy capabilities

Today, we are excited to announce the expansion of native network policy support in Amazon EKS to include both Admin Policies and Application Network Policies. With these additional policies, cluster administrators (e.g., platform or security teams) can set cluster-wide security rules to enhance the overall network security of their Kubernetes workloads. In […]

Amazon EKS introduces Provisioned Control Plane

Amazon EKS introduces Provisioned Control Plane, a new capability that allows you to pre-allocate control plane capacity for predictable, high-performance Kubernetes operations at scale. In this post, we explore how this enhanced option complements the Standard Control Plane by offering multiple scaling tiers (XL, 2XL, 4XL) with well-defined performance characteristics for API request concurrency, pod scheduling rates, and cluster database size—enabling you to handle demanding workloads like ultra-scale AI training, high-performance computing, and mission-critical applications with confidence.

Amazon EKS Blueprints for CDK: Now supporting Amazon EKS Auto Mode

Amazon EKS Blueprints for CDK now supports EKS Auto Mode, enabling developers to deploy fully managed Kubernetes clusters with minimal configuration while AWS automatically handles infrastructure provisioning, compute scaling, and core add-on management. In this post, we explore how this integration combines EKS Blueprints’ declarative infrastructure-as-code approach with EKS Auto Mode’s hands-off cluster operations, providing three practical deployment patterns (from basic clusters to specialized ARM-based and AI/ML workloads) that let teams focus on application development rather than infrastructure management.

Enhancing and monitoring network performance when running ML Inference on Amazon EKS

In this post, we explore how to enhance and monitor network performance for ML inference workloads running on Amazon EKS using the newly launched Container Network Observability feature. We demonstrate practical use cases through a sample Stable Diffusion image generation workload, showing how platform teams can visualize service communication, analyze traffic patterns, investigate latency issues, and identify network bottlenecks—ultimately improving metrics like inference latency and time to first token.

Data-driven Amazon EKS cost optimization: A practical guide to workload analysis

In this post, we introduce key considerations for optimizing Amazon EKS costs in production environments through detailed workload analysis and comprehensive monitoring. We demonstrate proven best practices to maximize cost savings while maintaining performance and resilience, supported by real-world examples showing how to eliminate resource waste from overprovisioned pods, excessive replica counts, and fragmented node pools.