Containers

Deploying Model Context Protocol (MCP) servers on Amazon ECS

Organizations are increasingly adopting AI agents to automate workflows across their operations. Common use cases include customer support, software development, business intelligence, supply chain management, and more. To be useful, these agents need access to internal tools, data sources, and business logic. Custom Model Context Protocol (MCP) servers have become one of the most popular ways to connect AI agents to tools and context. As organizations move from prototypes to production, hosting MCP servers on a reliable, scalable, and secure platform becomes critical.

There are several options for hosting MCP servers on AWS, and each has its own strengths. AWS Lambda works well for lightweight, stateless tool endpoints with bursty traffic patterns. Amazon Bedrock AgentCore provides a fully managed agent orchestration suite with built-in identity, memory, and tool discovery. If your workloads need more control over the runtime, networking, and connection lifecycle, then Amazon Elastic Container Service (Amazon ECS) on AWS Fargate, which we cover in this post, is a great option. Amazon ECS lets you run your MCP server as a long-lived service with warm caches, persistent streaming connections, sidecars, and any language or runtime you choose. It also integrates naturally with enterprise perimeter controls such as Application Load Balancers, AWS WAF, private subnets, and VPC endpoints. This makes Amazon ECS a strong fit for session-based or streaming-heavy servers, dependency-heavy workloads that bundle native libraries or large processing pipelines, high-throughput tool gateways that benefit from stable concurrency, and network-embedded servers that must reside inside a VPC alongside private data stores.

In this post, we will walk you through a three-tier MCP application deployed entirely on Amazon ECS, using Service Connect for service-to-service communication and Express Mode for automated load balancing, to show how to take an MCP-based workload from concept to production.

Architecture

We consider a containerized MCP application consisting of three services: a Gradio-based UI, an AI Agent powered by Amazon Bedrock, and a FastMCP server, all running on Amazon ECS with AWS Fargate. Figure 1 illustrates the architecture.

Architecture diagram showing Amazon VPC with public and private subnets, UI Service with ALB and Gradio UI, Agent Service with Strands Agent, and MCP Server on Amazon ECS connecting to Bedrock Nova Lite and S3 Product Catalog

Figure 1: Three-tier MCP application on Amazon ECS

Here are the high-level steps in the request flow:

  1. A user submits a natural language query (for example, “Find laptops under $1000”) through the Gradio UI, exposed to the internet via an Application Load Balancer provisioned by Amazon ECS Express Mode.
  2. The UI forwards the request over Amazon ECS Service Connect to the Agent Service running in a private subnet.
  3. The Agent invokes Amazon Bedrock (Amazon Nova 2 Lite model) to interpret the query and determine which MCP tools to call.
  4. The Agent connects to the MCP Server over Amazon ECS Service Connect using the Streamable HTTP transport. Amazon ECS Service Connect handles the underlying network routing.
  5. The MCP Server executes the tool call, such as searching, filtering, or retrieving product data from an Amazon Simple Storage Service (Amazon S3) bucket.
  6. The MCP Server returns results through the chain: MCP Server → Agent → UI → User.

All inter-service communication stays within the Amazon Virtual Private Cloud (Amazon VPC). Only the UI service is publicly accessible. The key components of this architecture are:

| Component | Technology | Role |
| --- | --- | --- |
| UI Service | Gradio on Amazon ECS | Web chat interface, public-facing via ALB |
| Agent Service | Strands Agents SDK + Amazon Bedrock | Orchestrates AI reasoning and MCP tool calls |
| MCP Server | FastMCP on Amazon ECS | Exposes product catalog as MCP tools over Streamable HTTP |
| Service Connect | AWS Cloud Map + Envoy proxy | Service-to-service discovery and routing |
| ECS Express Mode | Managed ALB + auto-scaling | Automated public endpoint with HTTPS |
| Product Catalog | Amazon S3 | Stores product data as JSON |
| Infrastructure | AWS CloudFormation | Amazon VPC, subnets, IAM roles, Amazon ECS cluster |

The MCP Server uses the Streamable HTTP transport, which operates here in stateless mode. Each tool call is a self-contained HTTP request with no server-side session state, which allows the MCP Server to scale horizontally across multiple replicas without session affinity. For workloads that require long-lived sessions, such as multi-step workflows with server-side context, Streamable HTTP also supports a stateful mode using the Mcp-Session-Id header; see the MCP specification for details.
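To make the stateless model concrete, here is a minimal sketch of the JSON-RPC 2.0 body that a Streamable HTTP tools/call request carries. The tool name and arguments are illustrative placeholders, not the sample repository's actual schema:

```python
import json

# A Streamable HTTP tool call in stateless mode: each POST carries a
# complete JSON-RPC 2.0 request, so any server replica can handle it.
def build_tool_call(request_id: int, tool_name: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request body (illustrative values)."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    })

body = build_tool_call(1, "search_products", {"category": "laptops", "max_price": 1000})
parsed = json.loads(body)
# No session identifier is required in stateless mode; a stateful server
# would instead issue an Mcp-Session-Id header that clients echo back.
```

Because every request is self-contained, a load balancer or Service Connect proxy can route consecutive calls from the same agent to different replicas without breaking anything.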

Walkthrough

This walkthrough takes approximately 30–40 minutes to complete. It assumes familiarity with the AWS CLI, Docker, and basic networking concepts.

Prerequisites

Before you begin, make sure you have:

  • An AWS account. If you don’t have one, you can create a new AWS account.
  • AWS Command Line Interface (AWS CLI) v2 version 2.32.0 or later. Run aws --version to verify your installation. For installation instructions, see Installing or updating the AWS CLI.
  • Docker version 20.10 or later with buildx support. Run docker --version to verify. For installation instructions, see the Docker documentation.
  • Git to clone the sample repository. For installation instructions, see Git downloads.
  • jq for parsing JSON output from the AWS CLI. Run jq --version to verify. For installation instructions, see the jq documentation.
  • A Bash shell, such as a macOS/Linux terminal or Windows Subsystem for Linux (WSL) on Windows.
  • Amazon Bedrock model access for the Amazon Nova 2 Lite model in your AWS account. To enable access, follow the instructions in Manage access to Amazon Bedrock foundation models.

Step 1: Clone repository and set variables

Start by cloning the sample repository and configuring the environment variables used throughout this walkthrough. Open a terminal and run the following commands:

git clone https://github.com/aws-samples/sample-mcp-server-on-ecs.git

You may need to use a GitHub Personal Access Token for the preceding step.

cd sample-mcp-server-on-ecs

# Set your variables (modify these for your environment)
export STACK_NAME=ecs-mcp-blog
export AWS_REGION=us-west-2
export AWS_PROFILE=default

Step 2: Deploy Infrastructure

Deploy all required infrastructure using the provided AWS CloudFormation template. The template provisions the Amazon VPC, subnets, security groups, IAM roles, Amazon ECR repositories, an Amazon S3 bucket, CloudWatch log groups, and the Amazon ECS cluster in a single stack.

Run the following command to deploy the stack:

aws cloudformation deploy \
--template-file cloudformation/infrastructure.yaml \
--stack-name $STACK_NAME \
--capabilities CAPABILITY_NAMED_IAM \
--region $AWS_REGION \
--profile $AWS_PROFILE

Check that the stack deployed successfully:

# Should output CREATE_COMPLETE
aws cloudformation describe-stacks \
--stack-name $STACK_NAME \
--region $AWS_REGION \
--profile $AWS_PROFILE \
--query 'Stacks[0].StackStatus' \
--output text

The output should read CREATE_COMPLETE.

Step 3: Retrieve stack outputs

Next, run the provided setup script to export all CloudFormation stack outputs as environment variables. These variables are referenced in subsequent steps to deploy and configure the Amazon ECS services.

source scripts/setup-env.sh

You should see output confirming all variables are set. If any value shows empty or you see an error, verify the stack completed successfully in Step 2.

Step 4: Authenticate with Amazon ECR

Authenticate your local Docker client with Amazon ECR:

aws ecr get-login-password --region $AWS_REGION --profile $AWS_PROFILE | \
docker login --username AWS --password-stdin $ECR_REGISTRY

You should see Login Succeeded in the output. If you receive an authorization error, verify that your AWS CLI credentials have ecr:GetAuthorizationToken permission and that the ECR_REGISTRY variable from Step 3 is set correctly.

Step 5: Upload the product catalog

Upload the sample product catalog to the Amazon S3 bucket provisioned by the CloudFormation stack. The MCP Server reads this file at runtime to respond to product search queries.

aws s3 cp sample-data/product-catalog.json s3://$S3_BUCKET/product-catalog.json \
--region $AWS_REGION --profile $AWS_PROFILE

Confirm the upload worked:

# Should return product-catalog.json with size ~5 KiB
aws s3 ls s3://$S3_BUCKET/ --region $AWS_REGION --profile $AWS_PROFILE

Step 6: Build and push Docker images

Build and push container images for all three services to Amazon ECR. The --platform linux/amd64 flag is required for AWS Fargate compatibility. See Pushing a Docker image to Amazon ECR for reference.

# MCP Server
docker buildx build --platform linux/amd64 \
-t $ECR_REGISTRY/${STACK_NAME}-mcp-server:latest \
./mcp-server --push

# Agent
docker buildx build --platform linux/amd64 \
-t $ECR_REGISTRY/${STACK_NAME}-agent:latest \
./agent --push

# UI
docker buildx build --platform linux/amd64 \
-t $ECR_REGISTRY/${STACK_NAME}-ui:latest \
./ui --push

Make sure all images pushed correctly:

# Each should return an image tag and size; if the output is empty, the push failed
for repo in mcp-server agent ui; do
echo "--- ${STACK_NAME}-${repo} ---"
aws ecr describe-images \
--repository-name ${STACK_NAME}-${repo} \
--region $AWS_REGION \
--profile $AWS_PROFILE \
--query 'imageDetails[0].[imageTags[0],imageSizeInBytes]' \
--output text
done

Step 7: Configure Amazon ECS Service Connect

Amazon ECS Service Connect lets services communicate by name. It uses AWS Cloud Map for discovery and an Envoy sidecar proxy for routing. Each service needs a JSON configuration that defines its discovery name, port mapping, and log destination. The MCP Server and Agent register as discoverable endpoints so other services can reach them by name (such as http://mcp-server:8080). The UI acts as a client: it doesn't register itself, but uses the Envoy proxy to find the Agent.

For background information on networking approaches, see Networking between Amazon ECS services in a VPC.

Run the following script to generate the Service Connect configuration files for all three services:

./scripts/generate-service-connect-configs.sh

You should see output listing the three generated config files in the config/ directory.
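The generated files follow the serviceConnectConfiguration shape that the create-service call in the next step expects. As a hedged sketch, the MCP Server's file might look like the following, built here in Python; the namespace, log group, and region values are placeholders for this walkthrough, not the script's exact output:

```python
import json

# Sketch of a Service Connect configuration for the MCP Server. The
# portName must match a port mapping name in the task definition, and the
# dnsName is what clients dial (http://mcp-server:8080). Values assumed.
config = {
    "enabled": True,
    "namespace": "ecs-mcp-blog",  # AWS Cloud Map namespace (assumed name)
    "services": [
        {
            "portName": "mcp-server",       # matches the task definition port mapping
            "discoveryName": "mcp-server",  # name registered in Cloud Map
            "clientAliases": [{"port": 8080, "dnsName": "mcp-server"}],
        }
    ],
    "logConfiguration": {
        "logDriver": "awslogs",
        "options": {
            "awslogs-group": "/ecs/ecs-mcp-blog/service-connect",  # assumed
            "awslogs-region": "us-west-2",
            "awslogs-stream-prefix": "envoy",
        },
    },
}

print(json.dumps(config, indent=2))
```

The Agent's file is analogous, while the UI's configuration enables Service Connect in client-only mode by omitting the services list.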

Step 8: Deploy Amazon ECS services

Deploy services in dependency order: MCP Server first, then Agent, then UI. This ensures that each service can find the ones it depends on at startup. The MCP Server and Agent run as standard Amazon ECS services in private subnets without public access. Amazon ECS Service Connect handles all communication between them. The UI uses Amazon ECS Express Mode, which automatically creates an Application Load Balancer, target group, and auto-scaling policy.

For a complete CLI walkthrough, see Creating an Amazon ECS Linux task for Fargate with the AWS CLI.

MCP server:

aws ecs create-service \
--cluster $CLUSTER_NAME \
--service-name mcp-server-service \
--task-definition ${STACK_NAME}-mcp-server \
--desired-count 2 \
--launch-type FARGATE \
--network-configuration "awsvpcConfiguration={subnets=[$PRIVATE_SUBNETS],securityGroups=[$MCP_SG],assignPublicIp=DISABLED}" \
--service-connect-configuration file://config/${STACK_NAME}-mcp-server-service-connect.json \
--region $AWS_REGION \
--profile $AWS_PROFILE

Agent:

aws ecs create-service \
--cluster $CLUSTER_NAME \
--service-name agent-service \
--task-definition ${STACK_NAME}-agent \
--desired-count 1 \
--launch-type FARGATE \
--network-configuration "awsvpcConfiguration={subnets=[$PRIVATE_SUBNETS],securityGroups=[$AGENT_SG],assignPublicIp=DISABLED}" \
--service-connect-configuration file://config/${STACK_NAME}-agent-service-connect.json \
--region $AWS_REGION \
--profile $AWS_PROFILE

Verify both services are running before proceeding:

echo "Waiting 60 seconds for tasks to start..."
sleep 60

aws ecs describe-services \
--cluster $CLUSTER_NAME \
--services mcp-server-service agent-service \
--region $AWS_REGION \
--profile $AWS_PROFILE \
--query 'services[].[serviceName,status,runningCount,desiredCount]' \
--output table

Each service's runningCount should match its desiredCount (2 for the MCP Server, 1 for the Agent). If runningCount is 0, check for task failures:

TASK_ARN=$(aws ecs list-tasks --cluster $CLUSTER_NAME --service-name mcp-server-service \
--desired-status STOPPED --region $AWS_REGION --profile $AWS_PROFILE \
--query 'taskArns[0]' --output text)

if [ "$TASK_ARN" != "None" ] && [ -n "$TASK_ARN" ]; then
aws ecs describe-tasks --cluster $CLUSTER_NAME --tasks $TASK_ARN \
--region $AWS_REGION --profile $AWS_PROFILE \
--query 'tasks[0].[stoppedReason,containers[].reason]' --output text
else
echo "No stopped tasks found — services are running normally."
fi

UI service (Express Mode):

The UI service uses Amazon ECS Express Mode for automated load balancing and public access. Express Mode automatically creates Application Load Balancers, target groups, security groups, and auto-scaling policies. See Build production-ready applications without infrastructure complexity using Amazon ECS Express Mode and the API reference.

Create the UI service:

aws ecs create-express-gateway-service \
--cluster $CLUSTER_NAME \
--service-name ui-service \
--execution-role-arn $EXECUTION_ROLE \
--infrastructure-role-arn $INFRA_ROLE \
--primary-container "{
\"image\": \"${UI_ECR}:latest\",
\"containerPort\": 7860,
\"awsLogsConfiguration\": {
\"logGroup\": \"${UI_LOG_GROUP}\",
\"logStreamPrefix\": \"ecs\"
},
\"environment\": [
{\"name\": \"AGENT_ENDPOINT\", \"value\": \"http://agent:3000\"}
]
}" \
--task-role-arn $UI_TASK_ROLE \
--network-configuration subnets=$PUBLIC_SUBNETS,securityGroups=$UI_SG \
--cpu "256" \
--memory "512" \
--scaling-target minTaskCount=1,maxTaskCount=4,autoScalingMetric=AVERAGE_CPU,autoScalingTargetValue=70 \
--tags key=Project,value=ECS-MCP-Blog \
--region $AWS_REGION \
--profile $AWS_PROFILE

Wait for the service to stabilize (around four minutes):

aws ecs wait services-stable \
--cluster $CLUSTER_NAME \
--services ui-service \
--region $AWS_REGION \
--profile $AWS_PROFILE

Then attach Service Connect to the UI service and force a new deployment:

aws ecs update-service \
--cluster $CLUSTER_NAME \
--service ui-service \
--service-connect-configuration file://config/${STACK_NAME}-ui-service-connect.json \
--force-new-deployment \
--region $AWS_REGION \
--profile $AWS_PROFILE

Note: Allow six to seven minutes after this update for Express Mode's canary deployment to complete. Run the following command to wait for the deployment to finish before proceeding:

aws ecs wait services-stable \
--cluster $CLUSTER_NAME \
--services ui-service \
--region $AWS_REGION \
--profile $AWS_PROFILE

Step 9: Retrieve the UI endpoint

Confirm all services are healthy, then retrieve the public URL for the UI. For the Boto3 SDK reference on describe_express_gateway_service, see the Boto3 ECS documentation.

# Check service status
aws ecs describe-services \
--cluster $CLUSTER_NAME \
--services mcp-server-service agent-service ui-service \
--region $AWS_REGION \
--profile $AWS_PROFILE \
--query 'services[].[serviceName,status,runningCount,desiredCount]' \
--output table

# Get UI public URL via Express Mode API
UI_SERVICE_ARN=$(aws ecs describe-services --cluster $CLUSTER_NAME --services ui-service \
--region $AWS_REGION --profile $AWS_PROFILE \
--query 'services[0].serviceArn' --output text)

UI_ENDPOINT=$(aws ecs describe-express-gateway-service \
--service-arn $UI_SERVICE_ARN \
--region $AWS_REGION --profile $AWS_PROFILE \
--query 'service.activeConfigurations[0].ingressPaths[0].endpoint' --output text)

echo "UI URL: https://${UI_ENDPOINT}/"
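If you prefer to script this lookup, the same extraction can be done in Python against the describe-express-gateway-service response. The response document below is fabricated for illustration and trimmed to the fields the walkthrough reads; the real response contains more keys:

```python
import json

# Fabricated, trimmed describe-express-gateway-service response used only
# to illustrate the extraction; the real API returns many more fields.
sample_response = json.loads("""
{
  "service": {
    "activeConfigurations": [
      {"ingressPaths": [{"endpoint": "ui-service-abc123.us-west-2.elb.amazonaws.com"}]}
    ]
  }
}
""")

def ui_url(response: dict) -> str:
    # Mirrors the CLI's JMESPath query:
    # service.activeConfigurations[0].ingressPaths[0].endpoint
    endpoint = response["service"]["activeConfigurations"][0]["ingressPaths"][0]["endpoint"]
    return f"https://{endpoint}/"

print(ui_url(sample_response))
```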

Step 10: Test the application

Open the UI URL in your browser and run a few sample queries to validate end-to-end connectivity:

  • “Show me electronics under $100”
  • “What laptops do you have?”
  • “Find running shoes in stock”

Each query travels the full chain: the UI sends it to the Agent over Amazon ECS Service Connect, the Agent calls Amazon Bedrock to decide which MCP tools to invoke, the MCP Server searches the product catalog in Amazon S3, and the results flow back to the user. Expect a 3- to 5-second response time on the first query as the MCP Server loads the product catalog from S3. If the UI loads but queries return errors, check the Agent and MCP Server logs:

# Agent logs - look for Amazon Bedrock or MCP connection errors
aws logs tail /ecs/${STACK_NAME}/agent --since 10m \
--region $AWS_REGION --profile $AWS_PROFILE

# MCP Server logs - look for Amazon S3 or startup errors
aws logs tail /ecs/${STACK_NAME}/mcp-server --since 10m \
--region $AWS_REGION --profile $AWS_PROFILE

Operational considerations for running MCP servers on Amazon ECS

This section covers operational considerations for running MCP servers on Amazon ECS, focusing on two key areas:

Security and compliance:

AI agents need secure access to AWS resources through MCP, with proper access control and permission enforcement for agent-initiated requests. Today, all AWS services authorize requests using AWS-native Signature Version 4 (SigV4), which requires you to sign requests using a symmetric access key and secret. The Amazon ECS MCP server addresses this challenge through a defense-in-depth security architecture that bridges MCP’s OAuth 2.1 authentication standard with AWS’s native IAM/SigV4 access control model.

This approach provides several security layers:

  • Centralized authentication through AWS Identity and Access Management (AWS IAM)
  • Least-privilege authorization that mirrors calling user permissions
  • Comprehensive audit logs via AWS CloudTrail
  • Sandboxed execution on Amazon ECS with AWS Fargate with minimal task role permissions
  • Input validation for all tool parameters

To further protect your data, use SSL/TLS (minimum TLS 1.2, TLS 1.3 recommended) for all communication with AWS resources, apply AWS encryption solutions across all services, and never embed confidential or sensitive information, such as customer email addresses, in tags or free-form text fields, as this data may appear in billing or diagnostic logs. See Data protection in AWS MCP Server for additional guidance.

Monitoring and Observability:

Maintaining reliability, availability, and performance of your MCP server deployment requires a layered observability strategy.

AWS CloudTrail captures all API calls made by or on behalf of your ECS tasks, including Amazon Bedrock InvokeModel requests, Amazon S3 GetObject operations, and AWS Secrets Manager GetSecretValue calls, and delivers log files to an Amazon S3 bucket for audit and forensic analysis. You can identify which users and accounts called AWS, the source IP address, and when each call occurred.

Complement this with Amazon CloudWatch Logs using JSON-structured logging to enable advanced querying with CloudWatch Logs Insights. For example, you can track average tool invocation latency by tool name across the MCP server. Enable Container Insights for cluster- and task-level metrics including CPU and memory utilization, network throughput, and task availability, and configure CloudWatch alarms to alert proactively when thresholds are breached.

For distributed tracing across the three-tier architecture, add the AWS X-Ray daemon as a sidecar container and instrument application code using the X-Ray SDK to trace requests end-to-end from the Gradio UI through the Agent to the MCP server.

Finally, Amazon ECS Service Connect automatically publishes Envoy proxy metrics, including ActiveConnectionCount, NewConnectionCount, ProcessedBytes, and TargetResponseTime, to Amazon CloudWatch, giving you visibility into inter-service communication health and latency distribution across the deployment.
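The JSON-structured logging mentioned above can be as simple as a custom logging.Formatter that emits one JSON object per line. A sketch follows; the field names are illustrative choices that would support a CloudWatch Logs Insights query such as stats avg(latency_ms) by tool_name:

```python
import json
import logging

class JsonFormatter(logging.Formatter):
    """Emit one JSON object per log line so CloudWatch Logs Insights can
    query fields directly without parsing free-form text."""
    def format(self, record):
        payload = {
            "level": record.levelname,
            "message": record.getMessage(),
            # Extra fields attached via the `extra=` kwarg, if present.
            "tool_name": getattr(record, "tool_name", None),
            "latency_ms": getattr(record, "latency_ms", None),
        }
        return json.dumps(payload)

logger = logging.getLogger("mcp-server")
handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logger.addHandler(handler)
logger.setLevel(logging.INFO)

# Each log line is now a queryable JSON document in CloudWatch Logs.
logger.info("tool invoked", extra={"tool_name": "search_products", "latency_ms": 42})
```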

Cleanup

This deployment creates billable AWS resources. To avoid ongoing charges, run the provided cleanup script, which removes all resources in the correct dependency order: Amazon ECS services, Express Mode resources, Amazon S3 buckets, Amazon ECR repositories, the AWS CloudFormation stack, and retained CloudWatch log groups.

# Verify your deployment variables are set
export STACK_NAME=ecs-mcp-blog
export AWS_REGION=us-west-2
export AWS_PROFILE=default

# Run cleanup
./scripts/cleanup.sh

The script will:

  1. Delete all Amazon ECS services in parallel
  2. Wait for services to drain and stop
  3. Remove the Express Mode Application Load Balancer and orphaned security groups
  4. Empty Amazon S3 buckets including versioned objects
  5. Delete Amazon ECR repositories and all container images
  6. Delete the AWS CloudFormation stack
  7. Remove retained Amazon CloudWatch log groups

If any step fails, the script continues and reports failures at the end with links to the AWS Management Console for manual remediation.

If the CloudFormation stack deletion fails on the VPC resource because of lingering elastic network interfaces or other dependencies, open the Amazon VPC console, select the VPC tagged with your stack name (ecs-mcp-blog), and choose Actions > Delete VPC to remove the VPC and its associated subnets, route tables, internet gateways, and network address translation (NAT) gateways. See Delete your VPC for troubleshooting guidance.

After the script completes, verify that the CloudFormation stack returns DELETE_COMPLETE, the S3 bucket returns a NoSuchBucket error, and the ECR repository query returns an empty list.

Conclusion

Amazon ECS with AWS Fargate gives you the operational control, networking flexibility, and connection management you need to run MCP servers in production. By combining ECS Service Connect for service-to-service communication, Express Mode for automated load balancing and public endpoints, and a defense-in-depth security architecture that bridges MCP’s OAuth 2.1 standard with AWS’s native IAM/SigV4 model, you can build AI agent infrastructure that is secure, observable, and ready for regulated industries and multi-tenant environments.

AWS is expanding its catalog of pre-built, production-ready MCP servers, including the Amazon ECS MCP server, which provides secure, least-privilege access to AWS services through standardized tools, and a growing library of community and AWS-managed servers that accelerate integration across databases, APIs, and enterprise data sources. This ecosystem means teams can move faster by composing agents from pre-built building blocks rather than building every integration from scratch.

Ready to try it? Clone the sample repository and adapt the MCP server to your own tools and data sources. The patterns you’ve seen here will work for most AI agent architectures on ECS. For additional guidance, see the Amazon ECS Best Practices Guide and the AWS MCP Server documentation.


About the authors

Piyush Mattoo

Piyush is a Solutions Architect at Amazon Web Services, where he specializes in enterprise-scale distributed systems and modern software architecture. He works closely with customers to design and implement cloud-native solutions using containers and AWS services.

Sudheer Manubolu

Sudheer is a Solutions Architect at Amazon Web Services, where he helps organizations modernize and scale their infrastructure on AWS. With a focus on cloud-native and containerized architectures, AI integrations, and developer platforms, he works with customers to turn emerging technologies into practical, production-ready solutions.

Stacey Hou

Stacey is a Senior Product Manager – Technical at AWS, where she focuses on generative AI and observability at Amazon Elastic Container Service (ECS). She works closely with customers and engineering teams to drive innovations that improve the experience of building, operating, and troubleshooting containerized applications at AWS.