Centralized Amazon ECS task logging with Amazon OpenSearch

As enterprises continue to adopt containerized workloads, the need for robust and scalable logging solutions has become increasingly important. Logging is a crucial element in monitoring and troubleshooting distributed applications, especially in modern containerized environments such as those deployed on Amazon Elastic Container Service (Amazon ECS). As microservices architectures grow in complexity, managing logs across multiple Amazon ECS tasks raises challenges around log consolidation, performance monitoring, operational complexity, visibility, resource management, and security. With a centralized logging system, you can monitor, search, and analyze logs across different containers and tasks efficiently.

Centralized logging can be achieved through Amazon CloudWatch, but Amazon OpenSearch is gaining popularity due to its advanced querying capabilities. These include support for more powerful query languages (SQL and Piped Processing Language, or PPL), support for complex operations such as JOIN queries across different log groups, and better text search through full-text search and fuzzy matching.

In this post, we explore how to efficiently collect and manage logs from Amazon ECS tasks and centralize them using Amazon OpenSearch Serverless (AOSS). This approach is especially helpful for users running containerized workloads on Amazon ECS and facing challenges with managing and analyzing their log data. We provide a step-by-step guide on configuring FireLens for Amazon ECS with Fluent Bit to route Amazon ECS task logs to a centralized OpenSearch instance in a separate Amazon Web Services (AWS) account, to provide scalability, security, and cost-efficiency.

The challenges with decentralized logging

Many users run hundreds of workloads on containers using Amazon ECS, each generating its own set of logs. In a decentralized logging setup, logs are often stored locally on individual containers or instances, and this comes with several challenges:

  • Log visibility: Logs are scattered across multiple instances and containers, making it difficult for DevOps and development teams to get a holistic view of system health.
  • Complex troubleshooting: When an issue arises, engineers need to manually SSH into individual instances or containers to pull log files. This makes the debugging process time-consuming and error-prone.
  • Difficult correlation of logs: In a distributed system, a single request might traverse multiple services, with each generating its own logs. Without a centralized logging solution, correlating these logs to trace the request’s journey can be difficult.
  • Inefficient search capabilities: Searching logs across multiple instances or containers necessitates logging into each machine separately. This is inefficient and time-consuming, particularly when dealing with high-traffic applications generating large volumes of logs.
  • Retention and archival: In a decentralized environment, managing log retention and archival policies for each instance becomes complex and may lead to inconsistencies, with some logs kept for too long and driving up storage costs, and others deleted prematurely.

The advantages of centralized logging

To address the challenges posed by decentralized logging, users are increasingly adopting centralized logging solutions such as OpenSearch, which offers the following:

  • Log searchability: OpenSearch provides SQL-like query syntax, which makes it easier to filter and analyze logs in real-time, regardless of the volume.
  • Cost-efficiency: Centralizing log storage in OpenSearch and controlling log retention in one place helps users optimize storage costs while keeping their logs searchable.
  • Real-time analytics and dashboards: OpenSearch Dashboards allow users to build visualizations and monitor log data in real-time, providing deeper insights into the health of their applications.
  • Seamless multi-account logging: Logs from multiple AWS accounts and environments can be centralized into a single OpenSearch domain or OpenSearch Serverless instance, improving visibility and streamlining operations.

This post outlines a practical solution that users can implement to route Amazon ECS container logs to a centralized AOSS instance, enabling effective log management and cost control.

Key considerations for centralized logging to OpenSearch

Centralized logging is an important aspect of monitoring and observability under the AWS Well-Architected Framework. It provides insight into what is happening within an application, container runtime, host, or cluster.

Amazon OpenSearch Service is a service managed by AWS that lets you run and scale OpenSearch clusters without having to worry about managing, monitoring, and maintaining your infrastructure, or having to build in-depth expertise in operating OpenSearch clusters. When it is used as a centralized log store, some key use cases include the following:

  1. Microservices monitoring and troubleshooting
    • Scenario: In a microservices architecture, multiple Amazon ECS tasks (containers) run various services that work together to serve an application that could be running across multiple AWS Regions or across multiple AWS accounts. Logs from these tasks provide insights into how the services interact, identify performance bottlenecks, and troubleshoot errors.
    • Benefit: Centralizing logs in OpenSearch allows teams to correlate logs across services, identify the root cause of an issue quickly, and reduce downtime by efficiently navigating through logs of various microservices.
  2. Security and compliance auditing
    • Scenario: Organizations in regulated industries must maintain detailed logs for compliance purposes, including access logs, error logs, and audit trails across all services.
    • Benefit: The fine-grained access control and audit logging features of OpenSearch allow organizations to centralize logs securely, streamlining compliance with industry regulations. Logs can be indexed and queried for specific compliance checks or investigations.
  3. Real-time application monitoring
    • Scenario: DevOps teams need real-time insights into application performance, to make sure that Service Level Agreements (SLAs) are met, and to preemptively detect issues such as high latency, errors, or unexpected load.
    • Benefit: Sending Amazon ECS task logs to OpenSearch allows teams to create real-time dashboards that visualize key metrics, such as request rates, error rates, and response times, enabling them to take immediate action when something goes wrong.
  4. Incident response and forensics
    • Scenario: When a security incident or a failure occurs, it’s crucial to have access to comprehensive logs that can be analyzed to understand what happened and why.
    • Benefit: The powerful search capabilities of OpenSearch allow incident response teams to query and analyze logs from all Amazon ECS tasks involved in the incident, helping to reconstruct events and identify the root cause.
  5. Capacity planning and optimization
    • Scenario: Understanding resource utilization and identifying trends is crucial for optimizing costs and appropriate scaling of ECS clusters.
    • Benefit: Logs centralized in OpenSearch can be used to track resource usage patterns (CPU, memory, I/O) across different Amazon ECS tasks. This data can inform decisions on scaling, resource allocation, and optimizing cloud costs.
  6. Business analytics
    • Scenario: Logs can contain valuable business data, such as user behavior, transaction details, and other events that are critical for business intelligence.
    • Benefit: Centralizing logs into OpenSearch allows business analysts to query logs to extract business metrics, build custom reports, and analyze trends without needing to access the raw data from different sources.
  7. Application lifecycle management
    • Scenario: During the development, testing, and production phases of an application’s lifecycle, logs are crucial for debugging, performance tuning, and maintaining the stability of deployments.
    • Benefit: Developers can use OpenSearch to analyze logs from different stages of the application lifecycle, thus comparing performance between versions or identifying regressions. This provides a smooth transition from development to production.
  8. Automated alerting and notification
    • Scenario: Organizations often need automated systems that notify teams when something goes wrong, such as an increase in error rates, unauthorized access attempts, or service downtimes.
    • Benefit: Integrating OpenSearch with alerting systems (for example, Amazon Simple Notification Service (Amazon SNS) or Slack) allows teams to set up automated alerts based on log patterns or thresholds. This also allows for proactive responses to issues before they escalate.

Solution overview

The solution involves:

  • Amazon ECS tasks: These are the containers that run your workload.
  • FireLens sidecar: An AWS log router that is deployed alongside your application containers to route logs.
  • Fluent Bit configuration: Fluent Bit is used for formatting and sending logs to the AOSS instance.
  • AOSS: The centralized log repository where logs from all Amazon ECS tasks are ingested, stored, and analyzed.
  • AWS Identity and Access Management (IAM) roles: Secure cross-account access through IAM roles enables sending logs securely to AOSS.

Key architecture elements:

  • Amazon ECS tasks: Each container in Amazon ECS generates a log as it runs your application code. To effectively collect and centralize these logs, in this guide we are using FireLens to forward logs to an external service, in this case, OpenSearch.
  • FireLens: FireLens is a log router for Amazon ECS and AWS Fargate tasks. It works as a sidecar that can collect logs from application containers and route them to a variety of destinations, such as CloudWatch, Amazon S3, or in this case, OpenSearch. For this guide, we are configuring FireLens to use Fluent Bit to route the Amazon ECS container logs to OpenSearch. The logs of the FireLens agent itself are routed to CloudWatch, which is needed to troubleshoot any startup issues with the FireLens sidecar container.
  • Fluent Bit configuration: Fluent Bit is a lightweight log processor and forwarder that works with FireLens to send logs to OpenSearch. For this solution, we are using Fluent Bit configuration to:
    • Parse and format log data.
    • Filter logs based on custom rules.
    • Aggregate logs from multiple Amazon ECS tasks before sending them to a remote destination.
  • AOSS: OpenSearch provides indexing, search, and real-time analytics capabilities. OpenSearch has powerful full-text search capabilities and support for complex queries, such as lexical and vector search, which allows it to quickly find and analyze data across log entries. In this guide, we use the built-in OpenSearch Dashboards to monitor and visualize logs in real-time.
  • IAM roles: Secure access to the AOSS instance is managed using IAM roles, so that only authorized Amazon ECS tasks are able to send logs to the OpenSearch domain in a different AWS account. We configure IAM roles with cross-account permissions to allow secure log forwarding.

Figure 1: Flow of logs from ECS Task to Amazon OpenSearch

Detailed step-by-step solution

In this section, we go through a step-by-step process to configure FireLens with Fluent Bit, set up AOSS, and secure the solution using IAM roles.

Step 1: Configuring FireLens with Fluent Bit

The first step is to configure FireLens to route logs from Amazon ECS tasks to OpenSearch. FireLens is incorporated into the Amazon ECS task definition and runs as a sidecar container.

Amazon ECS task definition example:

[
  {
    "name": "fluent-bit",
    "image": "amazon/aws-for-fluent-bit:latest",
    "essential": true,
    "firelensConfiguration": {
      "type": "fluentbit",
      "options": {
        "enable-ecs-log-metadata": "true"
      }
    },
    "logConfiguration": {
      "logDriver": "awslogs",
      "options": {
        "awslogs-group": "/ecs/your-log-group",
        "awslogs-region": "your-region",
        "awslogs-stream-prefix": "ecs"
      }
    }
  },
  {
    "name": "app",
    "image": "your-app-image",
    "logConfiguration": {
      "logDriver": "awsfirelens",
      "options": {
        "Name": "es",
        "Host": "your-aoss-endpoint",
        "Port": "443",
        "aws_auth": "On"
      }
    }
  }
]

This configuration routes logs from the application container to OpenSearch using FireLens. The firelensConfiguration and logDriver elements enable the handling of logs from the application container by FireLens and forwarding to the OpenSearch endpoint.
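
The wiring between the two containers can be checked before the task definition is registered. The following is a minimal sketch in Python using only the standard library. It rebuilds the container definitions from the example above and verifies that one container declares firelensConfiguration and that the application container uses the awsfirelens log driver. The function names build_container_definitions and check_firelens_wiring are hypothetical helpers introduced for illustration and are not part of any AWS SDK.

```python
import json


def build_container_definitions(aoss_host, region, log_group):
    """Return container definitions mirroring the example task definition."""
    log_router = {
        "name": "fluent-bit",
        "image": "amazon/aws-for-fluent-bit:latest",
        "essential": True,
        "firelensConfiguration": {
            "type": "fluentbit",
            "options": {"enable-ecs-log-metadata": "true"},
        },
        # The router's own logs go to CloudWatch for troubleshooting.
        "logConfiguration": {
            "logDriver": "awslogs",
            "options": {
                "awslogs-group": log_group,
                "awslogs-region": region,
                "awslogs-stream-prefix": "ecs",
            },
        },
    }
    app = {
        "name": "app",
        "image": "your-app-image",
        # Application logs are handed to FireLens, which forwards them.
        "logConfiguration": {
            "logDriver": "awsfirelens",
            "options": {
                "Name": "es",
                "Host": aoss_host,
                "Port": "443",
                "aws_auth": "On",
            },
        },
    }
    return [log_router, app]


def log_driver(container):
    """Return the container's log driver, or None if it has no logConfiguration."""
    return container.get("logConfiguration", {}).get("logDriver")


def check_firelens_wiring(containers):
    """Return a list of problems; an empty list means the wiring is consistent."""
    problems = []
    routers = [c for c in containers if "firelensConfiguration" in c]
    if not routers:
        problems.append("no container declares firelensConfiguration")
    for router in routers:
        if log_driver(router) == "awsfirelens":
            problems.append(f"{router['name']}: the log router should not use awsfirelens for its own logs")
    apps = [c for c in containers if c not in routers]
    if not any(log_driver(c) == "awsfirelens" for c in apps):
        problems.append("no application container uses the awsfirelens log driver")
    return problems


containers = build_container_definitions(
    "your-aoss-endpoint", "your-region", "/ecs/your-log-group"
)
print(json.dumps(containers, indent=2))
print(check_firelens_wiring(containers))
```

A check like this could run in a deployment pipeline so that a task definition without a log router is caught before any task starts.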

Step 2: Fluent Bit configuration

You must configure Fluent Bit to format and route the logs to OpenSearch. This configuration lives within the task definition resource block and tells Amazon ECS how to set up logging for application containers using FireLens (which uses Fluent Bit under the hood). The following example demonstrates how to configure Fluent Bit within FireLens:

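[INPUT]
    # Receive the logs that FireLens forwards from the application containers
    Name              forward
    Listen            0.0.0.0
    Port              24224

[OUTPUT]
    # Send every record to the AOSS endpoint, authenticating with IAM credentials
    Name              es
    Match             *
    Host              your-aoss-endpoint
    Port              443
    AWS_Region        your-region
    AWS_Auth          On
    Index             ecs-logs
    Type              _doc
    Replace_Dots      On
    # False removes the retry limit, so failed chunks are retried indefinitely
    Retry_Limit       False
    # Assumption: TLS must be enabled to reach the HTTPS endpoint on port 443
    tls               On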

In this configuration, Host is the AOSS endpoint where logs are sent. AWS_Auth enables IAM authentication and secure communication between Amazon ECS tasks and the OpenSearch instance. Index is the OpenSearch index where Amazon ECS logs are stored.
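
When the same settings must be produced for several environments, the configuration can be rendered from data instead of edited by hand. The following is a minimal sketch in Python that writes the sections in Fluent Bit's classic format from dictionaries. The helpers render_section and build_output are hypothetical names, and the rule that Host must be a bare hostname without a scheme is an assumption of this sketch.

```python
def render_section(name, settings):
    """Render one classic-format Fluent Bit section such as [INPUT] or [OUTPUT]."""
    width = max(len(key) for key in settings) + 2
    lines = [f"[{name}]"]
    lines += [f"    {key.ljust(width)}{value}" for key, value in settings.items()]
    return "\n".join(lines)


def build_output(host, region, index="ecs-logs"):
    """Return the output settings described above for one endpoint."""
    if "://" in host:
        # Assumption: Host takes a hostname only, not a URL.
        raise ValueError("Host should be a bare hostname, without a scheme such as https://")
    return {
        "Name": "es",
        "Match": "*",
        "Host": host,
        "Port": 443,
        "AWS_Region": region,
        "AWS_Auth": "On",
        "Index": index,
    }


config = "\n\n".join(
    [
        render_section("INPUT", {"Name": "forward", "Listen": "0.0.0.0", "Port": 24224}),
        render_section("OUTPUT", build_output("your-aoss-endpoint", "your-region")),
    ]
)
print(config)
```

Only the endpoint, Region, and index name vary between environments here, so those are the only parameters exposed.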

Step 3: Set up an AOSS instance in a centralized shared services AWS account

To follow a centralized logging architecture and keep costs low, you should create the AOSS instance in a centralized shared services AWS account. The following steps show how you can set up an AOSS instance in a centralized shared services account:

  • Navigate to AOSS in the AWS Management Console.
  • Choose Serverless and create a new collection (the OpenSearch Serverless counterpart of a domain) in the shared services AWS account.
  • Configure an index pattern, such as ecs-logs-*, to capture logs from the Amazon ECS tasks across various AWS accounts.
  • To facilitate log ingestion from multiple accounts (such as Amazon ECS workloads running in separate AWS accounts), configure cross-account access by setting up IAM roles with appropriate permissions.
  • Centralizing logging into one shared AOSS instance allows you to not only streamline log storage and search capabilities but also control the OpenSearch storage and search costs so that they remain in one account for ease of management and cost efficiency.
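
One detail is worth checking when choosing the index pattern: a wildcard pattern such as ecs-logs-* only matches index names that carry a suffix, so a fixed index named ecs-logs would not be picked up by it. The following minimal Python sketch illustrates this with shell-style wildcard matching from the standard library, which behaves like the index pattern for this simple case. The helper daily_index is hypothetical, and the dated naming scheme is an assumption. To our understanding, Fluent Bit can produce names of this shape through its Logstash_Format and Logstash_Prefix options, but verify this against the Fluent Bit documentation for your version.

```python
from datetime import date
from fnmatch import fnmatch


def daily_index(prefix, day):
    """Build a dated index name such as ecs-logs-2025.01.31."""
    return f"{prefix}-{day:%Y.%m.%d}"


pattern = "ecs-logs-*"

# A fixed index name has no suffix, so the wildcard pattern does not match it.
print(fnmatch("ecs-logs", pattern))  # False

# A dated index name matches the pattern.
print(fnmatch(daily_index("ecs-logs", date(2025, 1, 31)), pattern))  # True
```

Dated indexes also make retention simpler, because whole indexes can be removed once they age out.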

Step 4: Secure log ingestion with IAM roles

To securely route logs from Amazon ECS tasks to AOSS, we need to configure cross-account IAM roles.

  • IAM role in Amazon ECS account: This role allows Amazon ECS tasks to write logs to AOSS in a remote AWS account.

Example IAM policy:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "es:ESHttpPost",
        "es:ESHttpPut",
        "es:ESHttpPatch"
      ],
      "Resource": "arn:aws:es:your-region:shared-services-account-id:domain/your-aoss-domain/*"
    }
  ]
}

  • IAM role in shared services account: This role grants permission to the Amazon ECS account to ingest logs into the AOSS instance.

Trust relationship:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::ecs-account-id:role/your-ecs-task-role"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
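
When several workload accounts send logs to the shared services account, the trust relationship needs one principal per account. The following is a minimal Python sketch that generates the trust policy document from a list of account IDs. The account IDs shown are placeholders, the helper names task_role_arn and build_trust_policy are hypothetical, and the sketch assumes that every workload account uses the same task role name.

```python
import json
import re

ACCOUNT_ID = re.compile(r"\d{12}")


def task_role_arn(account_id, role_name):
    """Build the IAM role ARN for a task role in a workload account."""
    if not ACCOUNT_ID.fullmatch(account_id):
        raise ValueError(f"not a 12-digit AWS account ID: {account_id!r}")
    return f"arn:aws:iam::{account_id}:role/{role_name}"


def build_trust_policy(principal_arns):
    """Return a trust policy that lets the given roles call sts:AssumeRole."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": sorted(principal_arns)},
                "Action": "sts:AssumeRole",
            }
        ],
    }


# Placeholder account IDs, for illustration only.
workload_accounts = ["222222222222", "111111111111"]
policy = build_trust_policy(
    task_role_arn(account, "your-ecs-task-role") for account in workload_accounts
)
print(json.dumps(policy, indent=2))
```

Validating the account ID up front catches an unreplaced placeholder such as ecs-account-id before the policy is applied.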

Monitoring logs with OpenSearch Dashboards

When logs are ingested into OpenSearch, follow these steps to use OpenSearch Dashboards to visualize and analyze the log data.

Steps to configure OpenSearch Dashboards:

  • Navigate to the OpenSearch Dashboards interface, then browse to Management > Stack Management > Index Patterns > Create index pattern.
  • Create an index pattern for Amazon ECS logs (for example ecs-logs-*)

Figure 2: Index pattern creation screen in OpenSearch Dashboards

  • Build custom visualizations, such as line graphs, histograms, or pie charts, to monitor log trends.
  • Set up alerts for specific conditions (for example high error rates) using the OpenSearch alerting feature.
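
A visualization or alert for high error rates ultimately rests on a search query. The following minimal Python sketch builds a query body in the OpenSearch query DSL that counts recent log lines containing a keyword, grouped by container. The field names are assumptions: log is where FireLens places the raw log line, container_name is added when ECS log metadata is enabled, @timestamp is the time field, and the .keyword subfield exists only under default dynamic mappings. Adjust them to match your index. The function error_count_query is a hypothetical helper.

```python
import json


def error_count_query(minutes=5, keyword="ERROR"):
    """Build a query body that counts matching log lines per container."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return {
        "size": 0,  # only the aggregation is needed, not the matching documents
        "query": {
            "bool": {
                "filter": [
                    {"range": {"@timestamp": {"gte": f"now-{minutes}m"}}},
                    {"match_phrase": {"log": keyword}},
                ]
            }
        },
        "aggs": {
            "per_container": {
                "terms": {"field": "container_name.keyword", "size": 10}
            }
        },
    }


print(json.dumps(error_count_query(minutes=15), indent=2))
```

The printed body can be pasted into the Dev Tools console in OpenSearch Dashboards against the ecs-logs-* pattern to check the counts before an alert threshold is chosen.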

Best practices for centralized logging in Amazon ECS

  • Use structured logging: Structured logs (for example JSON format) are easier to parse, search, and analyze. They also improve the efficiency of log indexing in OpenSearch.
  • Implement log rotation: To prevent high storage costs, set up log rotation policies in OpenSearch to archive or delete old logs.
  • Monitor log ingestion performance: Regularly monitor the performance of Fluent Bit and OpenSearch so that logs are ingested without delays or data loss.
  • Optimize index patterns: Create tailored index patterns in OpenSearch to avoid complexity and improve query performance.
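
As an illustration of the structured logging practice, the following minimal Python sketch emits one JSON object per log line to standard output, which is where FireLens collects container logs. The JsonFormatter class, the orders logger name, and the request_id field are hypothetical choices for this example. Note that FireLens delivers each line as a string in the log field, so expanding it into separate fields is assumed to require a JSON parser in the Fluent Bit configuration.

```python
import json
import logging
import sys
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Format each log record as a single-line JSON object."""

    def format(self, record):
        entry = {
            "timestamp": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Include a correlation ID when the caller passes one through `extra`.
        request_id = getattr(record, "request_id", None)
        if request_id is not None:
            entry["request_id"] = request_id
        if record.exc_info:
            entry["exception"] = self.formatException(record.exc_info)
        return json.dumps(entry)


handler = logging.StreamHandler(sys.stdout)
handler.setFormatter(JsonFormatter())

logger = logging.getLogger("orders")
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False  # avoid duplicate lines from the root logger

logger.info("order placed", extra={"request_id": "req-123"})
```

A shared field such as request_id across services is what makes it possible to correlate the logs of a single request once they are centralized.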

Conclusion

In this post, we demonstrated how to build a centralized logging solution for Amazon ECS workloads using Amazon OpenSearch Serverless (AOSS) and AWS FireLens. This architecture streamlines log aggregation across services and accounts, enhances observability, and reduces the complexity and cost associated with decentralized logging systems. Using a FireLens sidecar container with Fluent Bit allows Amazon ECS task logs to be efficiently streamed to OpenSearch, enabling rich, real-time querying and analysis. The provided Terraform module includes all the necessary resources, such as IAM policies, Fluent Bit configuration, and deployment automation, to help you implement this solution in your environment.

To get started, clone the GitHub repository, review the configuration, and deploy it in your environment. Although AWS endeavors to apply best practices for security within this example, each organization has its own policies. Make sure to use the specific policies of your organization when deploying this solution as a starting point for implementing centralized logging using OpenSearch.

To gain further insights into OpenSearch, refer to the Create your first OpenSearch Dashboard video and Overview of Amazon OpenSearch Ingestion documentation.

About the authors

Saurabh Verma is a Sr Cloud Architect at AWS. With extensive knowledge across cloud technologies, he designs and implements scalable, reliable infrastructures. His approach is tailored to address the unique demands of each user, making sure of optimal results that align precisely with the client needs and industry best practices.

Ravindra Agrawal is a DevOps Consultant at AWS Professional Services. His passion for automation shines through as he guides users in embracing DevOps culture, revolutionizing their development processes. Beyond the professional realm, he finds joy in capturing moments through photography, exploring new horizons through travel, and unwinding with a good film.