AWS Compute Blog
Category: Technical How-to
Serverless generative AI architectural patterns – Part 2
This post explores two complementary approaches for non-real-time scenarios: buffered asynchronous processing for time-intensive individual requests, and batch processing for scheduled or event-driven workflows.
Serverless generative AI architectural patterns – Part 1
This two-part series explores the architectural patterns, best practices, code implementations, and design considerations essential for integrating generative AI solutions into both new and existing applications. In this post, we focus on patterns for architecting real-time generative AI applications.
Under the hood: how AWS Lambda SnapStart optimizes function startup latency
AWS Lambda cold start latency can impact performance for latency-sensitive applications, with function initialization being the primary contributor to startup delays. Lambda SnapStart addresses this challenge by reducing cold start times from several seconds to under a second for Java, Python, and .NET runtimes with minimal code changes. This post explains SnapStart’s underlying mechanisms and provides performance optimization recommendations for applications using this feature.
Effectively building AI agents on AWS Serverless
Imagine an AI assistant that doesn’t just respond to prompts – it reasons through goals, acts, and integrates with real-time systems. This is the promise of agentic AI. According to Gartner, by 2028 over 33% of enterprise applications will embed agentic capabilities – up from less than 1% today. While early generative AI efforts focused […]
Implementing message prioritization with quorum queues on Amazon MQ for RabbitMQ
Quorum queues are now available on Amazon MQ for RabbitMQ starting with version 3.13. Quorum queues are a replicated First-In, First-Out (FIFO) queue type that uses the Raft consensus algorithm to maintain data consistency. Quorum queues on RabbitMQ version 3.13 lack one key feature compared to classic queues: message prioritization. However, RabbitMQ version 4.0 introduced support […]
Building resilient multi-tenant systems with Amazon SQS fair queues
Today, AWS introduced Amazon Simple Queue Service (Amazon SQS) fair queues, a new feature that mitigates noisy neighbor impact in multi-tenant systems. With fair queues, your applications become more resilient and easier to operate, reducing operational overhead while improving quality of service for your customers. In distributed architectures, message queues have become the backbone of […]
Deploying external boot volumes with AWS Outposts
Building on our previous announcement, AWS Outposts third-party storage integration for data volumes, AWS is expanding its collaboration with third-party storage solutions by introducing support for boot volumes backed by external storage arrays. In this post, we show you how to boot Amazon Elastic Compute Cloud (Amazon EC2) instances on Outposts directly from NetApp on-premises […]
Infrastructure as code translation for serverless using AI code assistants
Serverless applications commonly use infrastructure as code (IaC) frameworks to define and manage their cloud resources. Teams choose different IaC tools based on their skills, existing tooling, or compliance needs. As applications grow, the need to shift between IaC formats may arise to adopt new features or align with evolving standards. Developers are rapidly adopting AI-powered […]
Modernizing SOAP applications using Amazon API Gateway and AWS Lambda
This post demonstrates how you can modernize legacy SOAP applications using Amazon API Gateway and AWS Lambda to create bidirectional proxy architectures that enable integration between SOAP and REST systems without disrupting existing business operations. Many organizations today face the challenge of maintaining critical business systems that were built decades ago. These legacy applications power […]
Orchestrating document processing with AWS AppSync Events and Amazon Bedrock
Many organizations implement intelligent document processing pipelines to extract meaningful insights from an increasing volume of unstructured content (such as insurance claims, loan applications, and more). Traditionally, these pipelines require significant engineering effort, as the implementation often involves using several machine learning (ML) models and orchestrating complex workflows. As organizations integrate these pipelines […]