AWS Cloud Operations Blog
Simplify AWS Control Tower governance with enhanced AWS CloudFormation Hooks
Organizations using AWS Control Tower to govern their multi-account environments face a persistent challenge: when AWS CloudFormation deployments fail due to proactive control violations, teams receive minimal information about why the failure occurred or how to fix it. This lack of visibility leads to: Delayed deployments as developers struggle to understand cryptic error messages […]
Deploying custom Terraform to LZA-Managed Accounts with AFT
As organizations scale their AWS environments, managing infrastructure consistently while enabling team autonomy becomes increasingly challenging. Landing Zone Accelerator on AWS (LZA) and AWS Account Factory for Terraform (AFT) both extend AWS Control Tower to help customers manage AWS environments at scale, offering complementary strengths. Many AWS customers struggle to balance centralized security governance with […]
Optimize cost and automate security remediation with AMS Trusted Remediator
Organizations leveraging Amazon Web Services (AWS) receive thousands of security and optimization recommendations monthly, yet many remain unimplemented due to competing priorities and resource constraints. AWS Managed Services (AMS) Trusted Remediator addresses this challenge by automating remediation across AWS accounts, significantly reducing the time and effort required for manual remediation processes. The solution features a continuously expanding library of pre-built remediations […]
Innovation sandbox on AWS with real-time analytics dashboard
How do you deploy hundreds of AWS accounts for a large-scale hackathon? Provide real-time visibility to leadership? Enable participant self-service while monitoring spending across accounts? Enterprise innovation events often lack real-time visibility into participant engagement, resource utilization, and outcomes. Leaders can’t see engagement metrics; builders can’t access accounts and information on-demand. Without observability and governance, […]
Investigating Service Issues with Amazon CloudWatch Application Signals Custom Metrics
When a critical service fails, you need to know how much revenue you’re losing, not just that latency has increased. This post shows you how to integrate business metrics with CloudWatch Application Signals to see both technical performance and business impact in one unified view. With CloudWatch Application Signals, you can view metrics, traces, and […]
Cross-Region AWS PrivateLink monitoring with Amazon CloudWatch Network Synthetic Monitor
Global, distributed AWS architectures are the backbone for customers seeking high availability, resilience, and regulatory compliance. Workloads are commonly deployed across multiple AWS Regions and Availability Zones (AZs), often using AWS PrivateLink to connect services securely and privately across Amazon Virtual Private Cloud (Amazon VPC) networks. This approach enhances security and separation while requiring […]
Alerting Best Practices with Amazon Managed Service for Prometheus
Alerts connect telemetry to action. Effective alert management helps you detect problems quickly, maintain resilience, and build customer trust. So, what is the best way to manage alerts when storing metrics in Amazon Managed Service for Prometheus? In this blog post, you will learn how to create, route, and administer alerting rules in Amazon […]
Search and discover governance controls with Control Catalog in AWS Control Tower
As you scale your AWS environment from hundreds to thousands of AWS accounts, maintaining consistent governance standards across this expanded infrastructure requires a strategic approach. Governance controls—the automated policies and rules that enforce standards across your cloud infrastructure—are essential for managing this scale, but implementing them presents two fundamental challenges. First, without proper controls, a […]
Resolve application issues autonomously with AWS DevOps Agent (Preview) and Dynatrace
Application issues require fast resolution to maintain business continuity and customer satisfaction, but manual investigation creates delays that can cost organizations significantly in lost revenue and productivity. Last week, we launched AWS DevOps Agent (Preview), a frontier agent that resolves and proactively prevents incidents, continuously improving reliability and performance of applications in AWS, multicloud, and […]
Troubleshoot AWS Tagging Compliance with AWS Resource Explorer
With AWS Resource Explorer’s immediate resource discovery launch on October 13, 2025, customers can now discover resources from their very first search in Unified Search in the AWS Management Console or the Resource Explorer console. Operations like troubleshooting and problem resolution, making resource changes, investigating resource dependencies, identifying security risks, and optimizing costs are critical […]








