Operationalizing AWS security: A maturity roadmap

Enabling security tooling is the starting point. Making it operational—where findings drive decisions, response times are measurable, and your security posture improves week over week—is where most organizations struggle.

This blog post provides a phased maturity roadmap for organizations that have already enabled AWS Security Hub and Amazon GuardDuty. These two services form the foundation of a cloud-centered security operations capability on AWS. Security Hub provides centralized security posture management and aggregates findings from multiple AWS security services, while GuardDuty provides intelligent threat detection by continuously monitoring for malicious activity and unauthorized behavior. For any production or enterprise AWS environment, having both services enabled across all accounts and AWS Regions is a baseline expectation; not because they’re optional add-ons, but because effective security operations require both the ability to detect threats and the ability to understand your overall security posture. If you haven’t yet enabled them, the Security Hub documentation and GuardDuty documentation provide setup guidance, including multi-account deployment with AWS Organizations.

Customers consistently tell us that while individual AWS security service documentation is thorough, what’s missing is a consolidated operational playbook—one resource that ties the services together into a working security operations practice with clear phases, progression criteria, and an operational cadence. That’s the gap this post fills. Rather than covering how each feature works (the documentation does that well), this post focuses on when and why to use each capability, and how to build the organizational habits that make them effective.

What follows is a six-phase roadmap for moving from these services are active to these services are driving our security operations. Each phase builds on the previous one, and each is designed to deliver tangible, measurable improvement.

Phase 0: Assess your current state

Goal: Understand what’s working before changing anything.

Estimated timeline: 1–2 weeks

Move to Phase 1 when: You have a documented current-state assessment covering all the following items.

Before introducing new processes or automation, establish a clear picture of the current environment. This assessment informs every decision that follows.

Actions:

Findings inventory: Review existing active GuardDuty findings to determine how many there are, the severity distribution, and how old the oldest findings are. A large backlog of untouched HIGH or CRITICAL findings that have been sitting for weeks is a strong signal about where to focus first.
Security Hub score baseline: Determine your current compliance score against AWS Foundational Security Best Practices (FSBP) and The CIS AWS Foundations Benchmark. Check to see which standards are enabled; if multiple standards are enabled, review for overlapping standards (creating noise) or unused standards.
Multi-account and multi-Region check: Look to see if GuardDuty is enabled in every account and every Region, or only in Regions with active workloads. Threat actors frequently operate in Regions that organizations don’t actively monitor. Also check to see if Security Hub aggregation is configured with a delegated administrator account or if each account is being managed independently.
Integration check: Determine if GuardDuty findings are flowing into Security Hub and if Amazon Inspector and Amazon Macie are enabled and feeding findings in. Without integration, Security Hub might be only surfacing its own compliance checks.
Notification check: See if there’s an Amazon EventBridge rule configured for notifications and if so, how findings are being routed and to whom. Know if notifications are being sent using an Amazon Simple Notification Service (Amazon SNS) topic or a chat channel integration. Without a clear notification and response workflow, findings can accumulate silently in the console with no one looking at them.

Deliverable: A one-page current state assessment that identifies what’s enabled, what’s flowing where, who’s looking at it, and what’s in the existing backlog.

Phase 1: Reduce the noise

Goal: Make the signal meaningful before asking anyone to act on it.

Estimated timeline: 2–3 weeks

Move to Phase 2 when: Remaining findings represent items requiring real decisions, compliance scores reflect actual posture, and you can articulate why every suppression rule and disabled control exists.

This is the single most important phase. If this step is skipped in favor of jumping straight to automation, the result is automated chaos. Alert fatigue is the primary reason security tooling is ignored, and addressing it first is what makes everything that follows sustainable.

GuardDuty tuning:

Create suppression rules for known-benign findings. The goal is to suppress activity you’ve already evaluated and accepted—such as expected traffic from corporate egress IPs (based on trusted IP lists), internal tools that trigger DNS-based findings, or internet-facing resources that naturally receive port scanning. The principle: if you’ve investigated a finding and it’s expected, suppress it so your team can focus on what matters.
Triage every active HIGH and CRITICAL finding into three categories: needs immediate investigation (real threat, not yet reviewed), true positive, already addressed (archive using workflow status), or false positive or expected behavior (create a suppression rule). Every finding must be categorized into one of these three states.
Review GuardDuty protection plans and enable any that are relevant but not yet active. Organizations that enabled GuardDuty years ago might not have activated protection plans released since then (such as Runtime Monitoring, Malware Protection, RDS Protection, and Lambda Protection). Evaluate each against your workload profile and enable what applies.

Security Hub tuning:

Disable controls that aren’t relevant to the environment. This is the highest-value quick win. If a service isn’t in use, disable its controls. If a control is addressed by an alternative solution, disable it. A 47% compliance score where half the failures are irrelevant trains teams to ignore the dashboard entirely. See the Security Hub controls reference for the full list.
Choose a primary standard. AWS Foundational Security Best Practices is a strong default. The CIS AWS Foundations Benchmark adds value when there’s a specific compliance mandate. Avoid enabling PCI DSS or NIST 800-53 standards unless there’s a reporting requirement—they add significant volume without proportional signal for most organizations.
Configure cross-Region aggregation to the delegated administrator account if not already in place. A single aggregated view eliminates the need to check findings across multiple Regional consoles.
Use the workflow status field operationally. Findings should progress from NEW to NOTIFIED to RESOLVED or SUPPRESSED. If everything remains in NEW indefinitely, the system carries no operational meaning.

Deliverable: A tuned environment where remaining findings represent items that require real decisions. Compliance scores should now reflect your organization’s actual security posture rather than noise.

Phase 2: Build the notification and routing layer

Goal: Get the right findings to the right people at the right time.

Estimated timeline: 2–3 weeks

Move to Phase 3 when: CRITICAL and HIGH findings reach the security team within minutes, MEDIUM findings create tracked tickets, and notifications include enriched context. No action is taken until a person or an automation is informed that something needs attention.

Architecture: Security Hub to EventBridge rule to routing logic to destination

Tiered notification strategy:

CRITICAL	Page on-call immediately	PagerDuty or Opsgenie	15 minutes
HIGH	Alert security team channel	Slack or Teams channel and ticket creation	4 hours
MEDIUM	Create ticket for review	Jira or ServiceNow	48 hours
LOW or INFORMATIONAL	Batch digest	Weekly email summary or dashboard review	Next review cycle

Key design decisions:

Route from Security Hub, not individual services. Because findings from GuardDuty, Inspector, Macie, and Security Hub compliance checks all aggregate in Security Hub, build your EventBridge rules there for centralized management.
Create a fast path for the most dangerous finding types. Certain GuardDuty findings, particularly those involving credential exfiltration, cryptocurrency activity, trojans, and active compromises, warrant a separate, faster routing path that bypasses normal triage. Identify these based on your threat model and the GuardDuty finding types reference.
Enrich notifications before delivery. A raw JSON finding in a chat channel provides little actionable context. Use an AWS Lambda function to format notifications with the information responders need: account alias, Region, Amazon Resource Name (ARN), finding type, severity, a console deep link, and a plain-language description. The Security Hub CloudWatch Events integration guide describes the event format.

Deliverable: A working notification pipeline where CRITICAL and HIGH findings reach the security team within minutes, MEDIUM findings create tracked work items, and LOW and INFORMATIONAL findings are batched for periodic review.

Phase 3: Build automated remediation for high-confidence findings

Goal: For findings where the correct response is deterministic, remove the human from the loop.

Estimated timeline: 3–4 weeks

Move to Phase 4 when: At least 3–5 high-confidence finding types have automated responses deployed with audit trails, and the team has established a process for evaluating new auto-remediation candidates.

The guiding principle: Only auto-remediate when all three conditions are met: the finding is high-confidence, the response is deterministic, and the blast radius of the automated action is limited. Automated remediation must not create the risk of a production outage.

Decision framework:

Confidence level	High – no false positive risk	Medium – context-dependent	Low – requires investigation
Response complexity	Single, well-defined action	Multiple steps or judgment calls	Requires forensic analysis
Blast radius	Limited to one resource	Could affect dependent services	Production-wide impact
Rollback difficulty	Straightforward to reverse	Moderate effort to reverse	Difficult or impossible to reverse

Common auto-remediation categories:

Instance isolation for confirmed compromise findings (cryptocurrency mining, malware, and trojans): Replace the security group, snapshot volumes for forensics, and notify.
Credential revocation for confirmed credential compromise: Attach deny-all policies, revoke sessions, and deactivate access keys as appropriate to the credential type.
Compliance drift correction for deterministic misconfigurations: Re-enable Amazon Simple Storage Service (Amazon S3) Block Public Access, revoke overly permissive security group rules, and re-enable AWS CloudTrail logging.
Notification-only escalation for findings that require human judgment before action: Amazon Elastic Block Store (Amazon EBS) encryption gaps (require migration) and access key rotation (requires coordination with the key owner).

For implementation, AWS provides Security Hub Automated Response and Remediation (SHARR), a solution that includes pre-built remediation playbooks deployed as AWS Step Functions workflows triggered by EventBridge. This is a strong starting point—evaluate the provided playbooks, enable the ones that fit, and extend with custom remediations as needed.

Note: For findings that recur because the environment lacks preventive guardrails, the best long-term response is often a service control policy (SCP) that prevents the misconfiguration from occurring in the first place. Phase 5 covers this preventive controls layer.

Deliverable: A library of automated and semi-automated remediation runbooks with full audit trails, and a documented decision framework the team uses to evaluate new auto-remediation candidates.

Phase 4: Build the operational rhythm

Goal: Turn security findings management into a sustained organizational practice, not a one-time cleanup.

Estimated timeline: 4–6 weeks to establish, then ongoing

Move to Phase 5 when: The weekly cadence has been running consistently for at least 8 weeks, monthly metrics show positive trends, and the first quarterly review has been completed.

This is where many organizations stall, and it’s the most important phase in the entire roadmap. The technology is working, the notifications are flowing, automated remediations are firing, but there’s no organizational habit built around it. Without this phase, everything you’ve built in Phases 0–3 will gradually decay. Suppression rules will go stale, new team members won’t know the system exists, and findings will start accumulating again. The operational rhythm is what converts a security tooling deployment into a security operations practice.

Weekly security review (30 minutes)

Attendees: Security team lead, cloud platform team representative, rotating engineering lead from an application team

Why the rotating engineering lead matters: Security findings don’t exist in a vacuum; they’re generated by workloads that engineering teams own. Rotating an engineering representative through this meeting accomplishes three things: it builds security awareness across the organization, ensures findings are routed to people with the context to resolve them, and creates organizational accountability beyond the security team.

Agenda template:

5 minutes	Compliance score trend – Review Security Hub scores by account and standard. Is the trend improving, declining, or flat? If declining, why?	Security lead	Identified regression areas
5 minutes	Critical and high findings review – Walk through new HIGH and CRITICAL GuardDuty findings from the past week. Are there any that need immediate escalation?	Security lead	Escalation actions assigned
10 minutes	Top five failing controls – Identify the five Security Hub controls with the most failures. Assign an owner and a target date for each.	Platform lead	Owners and dates documented
5 minutes	Automation review – Did any auto-remediations fire this week? Did they work correctly? Were there any false triggers?	Security lead	Automation adjustments queued
5 minutes	Tuning decisions – Are new suppression rules needed based on this week’s findings? Are any new finding types candidates for auto-remediation?	All	Tuning backlog updated

Running the meeting effectively:

Keep a running document (such as a wiki page or shared document) that captures decisions and action items week over week. This becomes your institutional memory.
If the compliance score hasn’t moved in over 3 weeks, that’s a signal. Either the assigned work isn’t happening, or the remaining findings are genuinely difficult to address. Both need to be discussed.
Track action items from previous weeks. A review that generates action items but never follows up on them will lose credibility and attendance quickly.

Escalation procedures

Define clear escalation paths before they’re needed:

CRITICAL finding not acknowledged within the SLA	Auto-escalate to security team manager	15 minutes after SLA breach
HIGH finding not resolved within the SLA	Escalate to finding owner’s manager	4 hours after SLA breach
Compliance score drops more than 5 points in a week	Escalate to cloud platform team lead for investigation	Next business day
Auto-remediation failure	Page security on-call	Immediate
New finding type not covered by existing runbooks	Add to weekly review agenda for triage and runbook development	Next weekly review

Monthly metrics report

Compile these metrics monthly and review them with security and engineering leadership. The goal is to tell a story about whether the organization’s security posture is improving, stable, or degrading, and why.

Mean time to acknowledge (MTTA) for CRITICAL findings	Are findings being seen promptly?	Decreasing month over month
Mean time to resolve (MTTR) for CRITICAL and HIGH findings	Are findings being acted on?	Decreasing month over month
Security Hub compliance score by standard, by account	What is the posture trend over time?	Increasing month over month
Number of active GuardDuty findings by severity	Is the backlog growing or shrinking?	Decreasing for HIGH and CRITICAL
Findings auto-remediated compared to manually resolved	Is automation delivering value?	Auto-remediation ratio increasing
Number of suppressed findings (with quarterly justification review)	Is noise being managed, or are problems being hidden?	Stable or decreasing
New findings introduced compared to resolved this month	Is the organization getting ahead or falling behind?	More finding resolved than introduced
SLA adherence rate by severity	Are response commitments being met?	More than 95% for CRITICAL, and more than 90% for HIGH

Building the dashboard: Use Amazon CloudWatch dashboards for real-time operational visibility or Amazon QuickSight connected to Security Hub findings through Amazon Security Lake for historical trend analysis and executive reporting. The dashboard should be visible to—and regularly viewed by—everyone in the weekly review, not locked in a security team tool.

Quarterly reviews

The quarterly review is a deeper inspection of the system itself; not just the findings, but the machinery processing them.

Quarterly review checklist:

Suppression rules audit: Review every active suppression rule to determine if the underlying condition is still present and the suppression is still justified. Document the review outcome for each rule.
Disabled controls audit: Review every disabled Security Hub control. Check that the justification is still valid and if the environment changed (for example, a service that wasn’t in use is now in use).
Automation audit: Review AWS Identity and Access Management (IAM) roles used by remediation functions and verify least privilege. Review execution logs for any anomalies or failures that weren’t caught.
New capabilities review: Evaluate newly released GuardDuty protection plans and Security Hub controls from that quarter. AWS releases new detection and compliance capabilities regularly. If you’re not reviewing them quarterly, you’re accumulating blind spots.
Process effectiveness review: Determine if the weekly meeting is well-attended and if action items are being completed. Make sure SLAs are being met. If attendance, action item completion, and SLA compliance aren’t where they should be, explore structural changes to address the gaps.

Operational maturity scoring

Use this rubric to assess the maturity of your operational rhythm itself. Score each dimension 1–3 and use the total to track progress over time.

Review cadence	One time reviews when someone remembers	Weekly review happens, but attendance is inconsistent	Weekly review is consistently attended with documented outcomes
Metrics tracking	No metrics captured	Metrics are collected monthly but not acted on	Metrics drive decisions and declining trends trigger specific actions
Finding ownership	Findings sit in queue with no owner	Findings are assigned to teams but SLAs aren’t tracked	Every finding has an owner, SLAs are tracked, and breaches are escalated
Automation management	Set-and-forget automations	Automation logs are reviewed periodically	Automation is reviewed weekly, and new candidates are evaluated continuously
Tuning lifecycle	Suppression rules created but never reviewed	Annual review of suppressions and disabled controls	Quarterly reviews with documented justification for every rule
Cross-team engagement	Security team works in isolation	Platform team participates	Engineering teams actively participate and own remediation

Scoring (revisit quarterly):

Beginning: 6–9
Established: 10–14
Optimized: 15–18

Deliverable: A documented operational cadence with clear ownership (consider a RACI matrix), metrics dashboards, escalation procedures, and a continuous improvement loop. The cadence should survive team member turnover—if it depends on one person remembering to run it, it’s not yet operational.

Phase 5: Mature the architecture

Goal: Fill remaining gaps and build toward a comprehensive security operations capability. Estimated timeline: Ongoing. Prioritize based on organizational risk profile and compliance requirements.

Amazon Inspector integration: Enable Amazon Inspector for Amazon Elastic Compute Cloud (Amazon EC2) instances, Lambda functions, and Amazon Elastic Container Registry (Amazon ECR) container images. Findings flow into Security Hub automatically, adding vulnerability management alongside threat detection and posture management. Prioritize this if you have Amazon EC2 or container workloads without an existing vulnerability scanning solution.
Amazon Macie: Enable Amazon Macie for S3 buckets containing potentially sensitive data. Particularly important for organizations with compliance requirements around personally identifiable information (PII), protected health information (PHI), or Payment Card Industry (PCI) data. Configure automated sensitive data discovery and route findings to Security Hub.
Amazon Security Lake: Amazon Security Lake centralizes security-relevant logs in OCSF format for long-term retention, forensic investigation, and threat hunting. This is valuable when you need historical analysis beyond the Security Hub retention window, or when feeding a third-party Security Information and Event Management (SIEM) tool.
Preventive controls layer: Convert recurring detective findings into preventive policies. Use SCPs to prevent disabling GuardDuty, Security Hub, and CloudTrail, IAM permission boundaries on developer roles, AWS WAF on public endpoints, and AWS Network Firewall for VPC traffic inspection. The pattern is to make recurring misconfigurations impossible to introduce.
Detective controls expansion: Use AWS IAM Access Analyzer for external access and unused access findings, AWS CloudTrail Lake for long-term queryable audit logs, and AWS Config custom rules for organization-specific compliance checks.
Incident response readiness: Have incident response playbooks referencing specific GuardDuty finding types, pre-built forensics infrastructure (isolated VPC, forensic AMIs, and pre-configured IAM roles), regular tabletop exercises, and AWS CloudFormation templates to deploy isolation infrastructure on demand. See the AWS Security Incident Response Guide for a comprehensive framework.

Conclusion

In this post, I provided a six-phase roadmap for operationalizing Security Hub and GuardDuty and showed that it isn’t a single project, but a progression. Phase 0 and Phase 1 can typically be completed in 3–5 weeks and deliver immediate clarity. Phases 2 and 3 build the response infrastructure that turns findings into action over the following 5–7 weeks. Phase 4 is what makes everything sustainable and is where you should invest the most attention. And Phase 5 expands the aperture from Security Hub and GuardDuty into a comprehensive security operations capability.

If you walked away from this post and did one thing, run the Phase 0 assessment this week. That single deliverable tells you exactly where to focus next. Use the following self-assessment checklist to identify your current phase, then focus on the next one. A tuned environment with working notifications and a weekly review cadence is dramatically more effective than a fully featured but neglected deployment. Start where you are, reduce the noise, build the habits, and iterate. To learn more, explore the AWS Security Hub User Guide, the Amazon GuardDuty User Guide, and the AWS Security Incident Response Guide. If you’ve implemented a similar operational cadence, or have questions about any phase, share your experience in the comments.

Self-assessment checklist

Phase 0	We know how many active GuardDuty findings exist across all accounts	☐
	We know our current Security Hub compliance score	☐
	We know whether GuardDuty is enabled in every account and region	☐
	We know who (if anyone) is reviewing findings today	☐
Phase 1	GuardDuty suppression rules exist for known-benign activity	☐
	Irrelevant Security Hub controls have been disabled with documented justification	☐
	All active HIGH and CRITICAL findings have been triaged	☐
	Security Hub compliance scores reflect actual posture, not noise	☐
Phase 2	HIGH and CRITICAL findings generate real-time notifications to the security team	☐
	MEDIUM findings automatically create tracked work items	☐
	Notifications include enriched context (account alias, resource ARN, and console link)	☐
Phase 3	At least three high-confidence finding types trigger automated remediation	☐
	Auto-remediation actions have full audit trails	☐
	Remediation runbooks are documented and version-controlled	☐
Phase 4	A weekly security review meeting occurs with defined attendees and agenda	☐
	MTTA and MTTR are tracked monthly for CRITICAL and HIGH findings	☐
	Suppression rules and disabled controls are reviewed quarterly	☐
	Security metrics trend positively over the past 3 months	☐
Phase 5	Amazon Inspector, Macie, or Security Lake are integrated	☐
	Preventive controls (SCPs, permission boundaries) address recurring findings	☐
	Incident response playbooks exist and are tested through tabletop exercises	☐

If you have feedback about this post, submit comments in the Comments section below.

AWS Security Blog