AWS Cloud Operations Blog
Category: Compute
Salesforce Commerce Cloud migrates from Self-hosted Prometheus to Amazon Managed Service for Prometheus
Salesforce Commerce Cloud empowers thousands of retailers worldwide to create seamless shopping experiences. Behind these experiences lies a complex infrastructure that demands reliable monitoring at scale. As the platform evolved from static, first-party instances to dynamic cloud-based environments, the monitoring needs outgrew the self-managed Prometheus solution. This post details Salesforce’s Commerce Cloud journey from […]
Assess compliance and configuration of Kubernetes resources with AWS Config
Many customers today rely on AWS Config for recording configuration, tracking configuration history, and evaluating compliance of their AWS resources such as Amazon Elastic Compute Cloud (EC2) instances, Amazon Simple Storage Service (S3) buckets, and even Amazon Elastic Kubernetes Service (EKS) clusters. This provides them with a comprehensive view of their AWS infrastructure configuration state […]
Analyze Azure Audit Logs with CloudTrail Lake
In the ever-evolving world of cloud computing, maintaining robust security and compliance is paramount. As usage of multicloud environments grows, the need for comprehensive monitoring and logging solutions becomes more critical. Enter the synergy of Azure Audit Logs and AWS CloudTrail Lake, a powerful combination that provides comprehensive visibility across your cloud environments. Azure Audit […]
Operations transformation to navigate the VMware migration to AWS
IT operations are at the heart of every organization. Organizations leveraging VMware have built and adapted to an operating model over time, and migrating it to the cloud can seem daunting. Migrating to Amazon Web Services (AWS) changes your operations tooling, existing responsibility model, and operations processes tailored to your VMware environment. While AWS offers […]
Automate Systems Manager patching reports via email and Slack notifications in an AWS Organization
Effective patch management is essential for maintaining system security, reliability, and compliance across your IT infrastructure. AWS Systems Manager (SSM) provides a comprehensive patching solution, enabling you to automate the deployment of operating system updates to your nodes deployed on AWS, on premises, and in multicloud environments. However, as your organization scales, tracking and reporting on […]
Troubleshooting AWS Systems Manager patching made easy with Amazon Bedrock’s automated recommendations
Keeping your AWS infrastructure up to date and secure is a critical part of maintaining a robust and reliable cloud environment. AWS Systems Manager’s patching capabilities are a powerful tool in this effort, allowing you to automatically apply the latest security updates and bug fixes to your managed nodes, including Amazon Elastic Compute Cloud (EC2) instances, on-premises […]
Monitor EBS Detailed Performance Statistics with Amazon Managed Service for Prometheus
Today we are excited to announce that you can now easily ingest Amazon EBS detailed performance statistics from your Amazon Elastic Kubernetes Service (Amazon EKS) workloads into an Amazon Managed Service for Prometheus workspace. We recently announced the availability of EBS detailed performance statistics, which gives you real-time visibility into the performance of your EBS […]
Manage AMI updates for AWS Auto Scaling groups with AWS Lambda and AWS Systems Manager
Keeping Amazon Machine Images (AMIs) up to date with the latest patches and updates is a critical task for organizations using AWS Auto Scaling groups. However, manually patching AMIs and updating Auto Scaling groups can be time-consuming and error-prone for your teams. This blog post presents a solution to automate the process of updating AMIs for […]
Introducing AWS Fault Injection Service Actions to Inject Chaos in Lambda functions
Usage of serverless technology in regulated industries like financial services is growing, and this growth demands robust resilience validation. Chaos engineering for serverless has become crucial for ensuring reliable and available serverless applications. By purposefully injecting failures and stresses into serverless components, teams can uncover hidden weaknesses and validate the fault tolerance of their systems. Previously, […]
Enable cloud operations workflows with generative AI using Agents for Amazon Bedrock and Amazon CloudWatch Logs
Amazon Bedrock is a fully managed service that offers a choice of high-performing foundation models (FMs) from leading AI companies like AI21 Labs, Anthropic, Cohere, Meta, Mistral AI, Stability AI, and Amazon through a single API, along with a broad set of capabilities you need to build generative AI applications with security, privacy, and responsible […]