AWS DevOps & Developer Productivity Blog

Streamline Operational Troubleshooting with Amazon Q Developer CLI

Amazon Q Developer is the most capable generative AI–powered assistant for software development, helping developers complete complex workflows. The Amazon Q Developer command-line interface (CLI) combines conversational AI with direct access to AWS services, helping you understand, build, and operate applications more effectively. The Amazon Q Developer CLI executes commands, analyzes their output, and provides contextual recommendations based on best practices, using the troubleshooting tools and platforms available on your local machine.

In today’s cloud-native environments, troubleshooting production issues often involves juggling multiple terminal windows, parsing through extensive log files, and navigating numerous AWS console pages. This constant context-switching delays problem resolution and adds cognitive burden to teams managing cloud infrastructure.

In this blog post, we explore how Amazon Q Developer CLI transforms the troubleshooting experience by turning challenging scenarios into conversational interactions.

The Traditional Troubleshooting Experience

When issues arise, engineers typically spend hours manually examining infrastructure configurations, reviewing logs across services, and analyzing error patterns. The process requires switching between multiple interfaces, correlating information from various sources, and applying deep AWS knowledge. This complex workflow often extends problem resolution from hours into days and increases the burden on infrastructure teams.

Solution: Amazon Q Developer CLI

Amazon Q Developer CLI streamlines the entire troubleshooting process, from initial investigation to problem resolution, making complex AWS troubleshooting accessible and efficient through simple conversations.

How Amazon Q Developer CLI works:

  • Natural Language Interface: Execute AWS CLI commands and interact with AWS services using conversational prompts
  • Automated Discovery: Map out infrastructure and analyze configurations
  • Intelligent Log Analysis: Parse, correlate, and analyze logs across services
  • Root Cause Identification: Pinpoint issues through AI-powered reasoning
  • Guided Remediation: Implement fixes with minimal human intervention
  • Validation: Test solutions and explain complex issues simply

One of the built-in tools within the Amazon Q Developer CLI, use_aws, enables natural language interaction with AWS services, as shown in Figure 1. This tool leverages the AWS CLI permissions configured on your local machine, allowing secure and authorized access to your AWS resources.

A command line interface showing a list of tools and their permissions. The display is titled "/tools" and shows several built-in tools including execute_bash, fs_read, fs_write, report_issue, and use_aws. Each tool has an associated permission level indicated by asterisks. The use_aws tool is highlighted with "trust read-only commands" permission. At the bottom, there's a note stating "Trusted tools will run without confirmation" and a tip to "Use /tools help to edit permissions".

Figure 1: Tools selection in Amazon Q Developer CLI
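
For example, after starting a chat session you can list the built-in tools and their permission levels; this is the view shown in Figure 1:

q chat
/tools

Running /tools help shows how to edit these permissions, for example to trust use_aws for read-only commands so they run without confirmation.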

Real-World Troubleshooting Scenario

Demonstration Environment Setup

This demonstration was performed with the following environment configuration: a local development machine with the necessary tools installed, appropriate AWS account permissions, and terminal access. Because Amazon Q Developer CLI is started in the project directory, it has immediate access to the relevant code and configuration files.

Scenario: Troubleshooting NGINX 5XX Errors

The scenario demonstrates troubleshooting a multi-tier application deployed on Amazon ECS Fargate, shown in Figure 2, with the following components (a minimal CDK sketch of this stack appears after the figure):

  • Application Load Balancer (ALB) distributing traffic across availability zones
  • NGINX reverse proxy service handling incoming requests
  • Node.js backend service processing business logic
  • Service discovery enabling internal communication
  • CloudWatch Logs providing centralized logging

An AWS cloud architecture diagram showing the flow of traffic from an Internet user through multiple components. The diagram includes: At the top: An Internet user connecting to an Internet Gateway Within a VPC (Virtual Private Cloud): Two public subnets containing a NAT Gateway and Application Load Balancer Two private subnets within an ECS Cluster containing: An NGINX service (Fargate) A Backend service (Fargate) A 10-second timeout between them A Cloud Map Service Discovery component at the bottom CloudWatch Logs integration on the right side The diagram includes a note about gateway timeouts: "504 Gateway Timeout - Backend takes 15s to respond, NGINX timeout is 10s" All components are connected with arrows showing the flow of traffic and data through the system. The infrastructure follows AWS best practices with public and private subnet separation for security.

Figure 2: AWS Architecture diagram for the app used in this blog post
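
For reference, here is a minimal AWS CDK sketch (TypeScript) of such a stack. This is an illustration, not the actual project code: construct IDs, the Cloud Map namespace name, and the container image are assumptions, and the backend service is omitted for brevity:

import * as cdk from 'aws-cdk-lib';
import * as ec2 from 'aws-cdk-lib/aws-ec2';
import * as ecs from 'aws-cdk-lib/aws-ecs';
import * as ecsPatterns from 'aws-cdk-lib/aws-ecs-patterns';

class NginxSimulationStack extends cdk.Stack {
  constructor(scope: cdk.App, id: string) {
    super(scope, id);

    // VPC with public and private subnets across two Availability Zones
    const vpc = new ec2.Vpc(this, 'Vpc', { maxAzs: 2 });

    // ECS cluster with a Cloud Map namespace for internal service discovery
    const cluster = new ecs.Cluster(this, 'NginxSimulationCluster', {
      vpc,
      defaultCloudMapNamespace: { name: 'local' }, // namespace name is an assumption
    });

    // Internet-facing ALB fronting the NGINX reverse proxy on Fargate;
    // the Node.js backend would run as a second Fargate service (omitted here)
    new ecsPatterns.ApplicationLoadBalancedFargateService(this, 'NginxService', {
      cluster,
      taskImageOptions: {
        image: ecs.ContainerImage.fromRegistry('nginx:latest'),
        containerPort: 80,
      },
    });
  }
}

const app = new cdk.App();
new NginxSimulationStack(app, 'NginxSimulationStack');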

Traditional Troubleshooting Steps

For the architecture in Figure 2, when 5XX gateway errors occur, traditional troubleshooting requires:

  1. Checking ALB target group health
  2. Examining ECS service status across multiple consoles
  3. Analyzing CloudWatch logs from different log groups
  4. Correlating error patterns between services
  5. Reviewing infrastructure code for configuration issues
  6. Implementing and deploying fixes

Amazon Q Developer CLI Approach

Instead, let’s see how Amazon Q Developer CLI handles this systematically, step by step:

Step 1: Initial Problem Report

Amazon Q Developer CLI is given the initial prompt as a problem statement from within the application project directory, as shown in Figure 3. Amazon Q Developer responds that it is going to investigate the 502 Gateway Timeout errors in the NGINX application.

Prompt:

Our production NGINX application is experiencing 502 Gateway Timeout errors. 
I have checked out the application and infrastructure code locally and the AWS CLI 
profile 'demo-profile' is configured with access to the AWS account where the 
infrastructure and application is deployed to. Can you help investigate and diagnose the issue?

A Visual Studio Code window showing a debugging session for an NGINX application. The interface has three main sections: a file explorer on the left showing project files including 'app.ts' and 'nginx-config-task.json', a terminal tab in the center displaying an "Amazon Q" ASCII art logo, and a conversation where a user is reporting 502 Gateway Timeout errors. The terminal shows AWS CLI command execution using a tool called "use_aws" with parameters including the service name "ecs" and region "us-west-2". The interface has red annotations highlighting key areas like "project files", "User provided initial prompt", and "Q CLI executing AWS CLI calls".

Figure 3: Amazon Q Developer CLI with initial prompt and problem statement

Step 2: Systematic Infrastructure Discovery

Amazon Q Developer CLI starts systematically discovering the infrastructure, as shown in Figure 4. Note that the initial prompt did not mention that the app is hosted on ECS, yet Amazon Q Developer CLI understood the context and executed AWS CLI calls to describe the cluster and the services within it, confirming that ECS tasks are running for both services. A key discovery is that both services show healthy status (1/1 desired count), indicating the issue isn't service availability.

A terminal window showing three sequential AWS CLI commands being executed through a "use_aws" tool: First command: "list-clusters" operation for ECS service in us-west-2 region using demo-profile, completing in 1.244 seconds Second command: "list-services" operation targeting the NginxSimulationCluster, completing in 0.877 seconds with confirmation of finding both nginx-service and backend-service Third command: "describe-services" operation examining both services in detail, completing in 0.968 seconds with confirmation that both services are running as expected (1/1 desired count) Each command includes execution details, parameters, and completion status, with the system preparing to check CloudWatch logs next.

Figure 4: AWS Infrastructure discovery by Amazon Q Developer CLI
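
The discovery sequence in Figure 4 corresponds to the following AWS CLI calls; the cluster and service names come from the screenshot:

aws ecs list-clusters --region us-west-2 --profile demo-profile
aws ecs list-services --cluster NginxSimulationCluster \
  --region us-west-2 --profile demo-profile
aws ecs describe-services --cluster NginxSimulationCluster \
  --services nginx-service backend-service \
  --region us-west-2 --profile demo-profile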

Step 3: Intelligent Log Analysis

Amazon Q Developer CLI retrieves and analyzes recent CloudWatch logs from the NGINX container, immediately identifying the critical error pattern, as shown in Figure 5, where Amazon Q Developer responds: “Perfect! I found the issue. The NGINX logs show clear 504 gateway timeout with upstream timeout messages.”

A terminal window showing two AWS CloudWatch Logs commands being executed: First command: "describe-log-streams" operation for the "/ecs/nginx-service" log group, limiting to 5 most recent entries, ordered by LastEventTime in descending order Second command: "get-log-events" operation retrieving 50 log entries from a specific NGINX container log stream The output reveals a critical error message highlighted at the bottom showing an upstream timeout (error 110) occurring while reading response headers. The error details include client IP 10.0.0.247, upstream server at http://10.0.3.18:3000/, and host 52.35.62.210.

Figure 5: CloudWatch Log analysis by Amazon Q Developer CLI
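
The log retrieval in Figure 5 maps to standard CloudWatch Logs commands; the log stream name is elided here because it is unique to each task:

aws logs describe-log-streams --log-group-name /ecs/nginx-service \
  --order-by LastEventTime --descending --limit 5 \
  --region us-west-2 --profile demo-profile
aws logs get-log-events --log-group-name /ecs/nginx-service \
  --log-stream-name <nginx-container-log-stream> --limit 50 \
  --region us-west-2 --profile demo-profile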

Step 4: Amazon Q Developer CLI Analysis and Root Cause Identification

Amazon Q Developer examines the backend service logs and discovers a mismatch between the backend service's response time and the NGINX timeout setting, as shown in Figure 6.

A terminal window showing AWS CloudWatch Logs commands and their output. The first command describes log streams for a backend container, and the second retrieves log events. The output reveals a debugging analysis showing that while health checks work fine, regular requests are being delayed by about 15 seconds, causing NGINX timeout issues. The log group is "/ecs/backend-service" in the us-west-2 region using a demo-profile.

Figure 6: Root cause identification by Amazon Q Developer CLI
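
The backend log checks in Figure 6 follow the same CloudWatch Logs pattern, pointed at the backend service's log group:

aws logs describe-log-streams --log-group-name /ecs/backend-service \
  --order-by LastEventTime --descending --limit 5 \
  --region us-west-2 --profile demo-profile
aws logs get-log-events --log-group-name /ecs/backend-service \
  --log-stream-name <backend-container-log-stream> --limit 50 \
  --region us-west-2 --profile demo-profile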

Step 5: Amazon Q Developer CLI Root Cause Analysis

Amazon Q Developer CLI examines the ECS task definitions to identify the exact configuration mismatch, as shown in Figure 7. Amazon Q Developer finds that:

  • The backend service is configured with response_delay=15000 (15 seconds)
  • The NGINX proxy is configured with proxy_read_timeout 10s

This mismatch causes 504 Gateway Timeout errors whenever the backend response time exceeds NGINX's timeout threshold; the equivalent describe calls appear after Figure 7.

A terminal window showing two AWS CLI commands to describe ECS task definitions in the us-west-2 region. Below the commands is a highlighted "Root Cause Analysis" section that explains a timeout mismatch: the backend service is configured with a 15-second response delay while NGINX has a 10-second proxy timeout, resulting in 502 Gateway Timeout errors. Both commands use a demo-profile and are labeled as checking timeout and response delay configurations.

Figure 7: Root cause analysis and issue detection by Amazon Q Developer CLI
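
The task definition lookups in Figure 7 are plain describe calls; the task definition family names used here are assumptions, since Figure 7 does not show them in full:

aws ecs describe-task-definition --task-definition nginx-service \
  --region us-west-2 --profile demo-profile
aws ecs describe-task-definition --task-definition backend-service \
  --region us-west-2 --profile demo-profile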

Step 6: Automated Code Fix

Here's where Amazon Q Developer CLI truly excels: it doesn't just diagnose the issue, it implements the fix. Because Amazon Q Developer CLI was started within the project that contains the CDK code defining the ECS task definitions, it located the relevant configuration and modified it, as shown in Figure 8.

A terminal window showing file operations using fs_read and fs_write tools. The code changes show an NGINX configuration update in ecs-nginx-cdk.ts, where the proxy_read_timeout is being modified from '10s' to '20s'. The file also shows additional timeout configurations being added, including proxy_connect_timeout and proxy_send_timeout. The update is confirmed with a user prompt and completed in 0.2 seconds.

Figure 8: CDK code fix by Amazon Q Developer CLI
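
The change itself is small. The following is an illustrative sketch of the relevant fragment of ecs-nginx-cdk.ts, not the actual project code; the directive names and values come from Figure 8, while the surrounding structure and the backend hostname are assumptions:

// NGINX configuration embedded in the CDK stack (structure is illustrative)
const nginxConfig = `
location / {
    proxy_pass http://backend.local:3000;  # backend hostname is an assumption
    proxy_read_timeout 20s;                # was 10s: the root cause
    proxy_connect_timeout 20s;             # added alongside the fix
    proxy_send_timeout 20s;                # added alongside the fix
}
`;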

Step 7: Deployment

Amazon Q Developer CLI builds and deploys the fix by executing cdk synth and cdk deploy with the 'demo-profile' AWS CLI profile that was provided in the initial prompt, as shown in Figure 9.

A terminal window showing two execute_bash commands running in sequence. The first command builds a CDK project using 'npm run build' in the nginx-app directory, completing in 4.102s. The second command deploys the updated CDK stack using 'cdk deploy' with the demo-profile, showing deployment progress including some warnings about minHealthyPercent configurations and CloudFormation stack updates in us-west-2 region.

Figure 9: CDK code build and deployment by Amazon Q Developer CLI
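
The deployment follows the standard CDK workflow, run from the project directory (nginx-app, per Figure 9):

npm run build
cdk synth --profile demo-profile
cdk deploy --profile demo-profile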

Step 8: Validation

Amazon Q Developer CLI validates the solution by sending a curl request to the ALB endpoint after the successful deployment, as shown in Figure 10.

A terminal window showing the execution of a curl command to test an NGINX application on AWS. The command targets an Elastic Load Balancer in the us-west-2 region. The response shows a successful HTTP 200 OK status after 14 seconds, with a JSON response containing the message "Hello from backend". The test completes in 15.100 seconds, indicating the fix for previous 502 errors was successful.

Figure 10: Fix validation by Amazon Q Developer CLI

In addition, Amazon Q Developer sends a request to the health check endpoint and validates that everything is working after the fix was deployed, as shown in Figure 11.

A terminal screenshot showing the results of a health check on an Nginx server using curl. The command executed shows a successful response with "healthy" status, completing in 0.65 seconds. The output displays various metrics including download speed (386 B/s), 100% completion rate, and timing statistics for real, user, and system processes.

Figure 11: Health endpoint validation by Amazon Q Developer CLI
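
Both validation calls can be reproduced with curl. The load balancer DNS name is elided here, and the /health path is an assumption based on Figure 11:

curl -i http://<alb-dns-name>/        # returns 200 OK after ~15 seconds
curl -i http://<alb-dns-name>/health  # returns "healthy"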

What Amazon Q Developer CLI Accomplished

Using just conversational commands, Amazon Q Developer CLI performed a complete troubleshooting cycle:

  • Infrastructure Discovery: Automatically mapped ECS clusters, services, and dependencies
  • Log Correlation: Analyzed thousands of log entries across multiple services
  • Root Cause Analysis: Identified exact configuration mismatch between NGINX’s timeout (10s) and the backend’s response delay (15s)
  • Code-Level Diagnosis: Located problematic timeout setting in CDK infrastructure code
  • Automated Implementation: Modified infrastructure code to increase the NGINX timeout
  • End-to-End Deployment: Built, deployed, and validated the complete solution
  • Comprehensive Testing: Verified both fix effectiveness and overall system health

Amazon Q Developer CLI handles troubleshooting tasks through a single, conversational interface, eliminating the need for multiple tools or AWS CLI commands.

Conclusion

Amazon Q Developer CLI represents a significant evolution in how we troubleshoot cloud infrastructure issues. By combining natural language understanding with powerful command execution capabilities, it transforms complex troubleshooting workflows into efficient, action-oriented dialogues. Whether you’re dealing with NGINX 5XX errors or similar issues across other AWS services, Amazon Q Developer CLI can help you diagnose issues, implement fixes, and validate solutions—all through a conversational interface that feels natural and intuitive.

Give Amazon Q Developer CLI a try the next time you encounter a troubleshooting challenge, and experience the difference it can make in your operational workflow.

To learn more about Amazon Q Developer’s features and pricing details, visit the Amazon Q Developer product page.

About the Author

Kirankumar Chandrashekar is a Generative AI Specialist Solutions Architect at AWS, focusing on Amazon Q Developer. Bringing deep expertise in AWS cloud services, DevOps, modernization, and infrastructure as code, he helps customers accelerate their development cycles and elevate developer productivity through innovative AI-powered solutions. By leveraging Amazon Q Developer, he enables teams to build applications faster, automate routine tasks, and streamline development workflows. Kirankumar is dedicated to enhancing developer efficiency while solving complex customer challenges, and enjoys music, cooking, and traveling.