Guidance for AI-Generated Images with Stable Diffusion on AWS

Build and scale generative AI applications with high-efficiency image generation capabilities

Overview

This Guidance demonstrates how you can integrate Stable Diffusion from Stability AI with Amazon SageMaker to build and scale generative artificial intelligence (AI) applications. It shows how to decouple interdependent, monolithic generative AI applications, which are often restrictive and time-consuming to modify, and how to implement automatic scaling for tasks like image inference and model training. With enhanced platform management functions, such as resource access control, and API support for backend integration, you can adapt this Guidance to the specific needs of your organization.

How it works

This architecture diagram shows how to use Stable Diffusion APIs to decouple applications into training and inference components that are hosted on Amazon SageMaker.
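To illustrate the decoupled inference path, here is a minimal sketch that submits an image-generation request to a SageMaker asynchronous inference endpoint. The endpoint name, bucket, and object key are hypothetical placeholders; it assumes the request payload has already been staged in Amazon S3.

```python
# A minimal sketch, assuming a deployed SageMaker async inference endpoint
# named "sd-inference" (hypothetical) and a request payload staged in S3.
import boto3

runtime = boto3.client("sagemaker-runtime")

response = runtime.invoke_endpoint_async(
    EndpointName="sd-inference",  # hypothetical endpoint name
    ContentType="application/json",
    InputLocation="s3://my-bucket/requests/txt2img-001.json",  # staged prompt/params
)

# Async inference returns immediately; the generated image is written to
# OutputLocation once the endpoint finishes processing the request.
print(response["OutputLocation"])
```

Because the call is asynchronous, the frontend stays decoupled from inference latency, and the endpoint can scale independently of the rest of the application.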

Well-Architected Pillars

The architecture diagram above is an example of a Solution created with Well-Architected best practices in mind. To be fully Well-Architected, you should follow as many Well-Architected best practices as possible.

Operational Excellence

AWS CloudFormation helps you automate, test, and deploy infrastructure as code templates with continuous integration and continuous delivery (CI/CD) automations. Amazon CloudWatch, through CloudWatch Logs, lets you monitor, store, and access log files from Lambda and SageMaker, helping you record requests and visualize the state of your underlying services. Monitoring and storing log files with CloudWatch Logs helps you analyze and troubleshoot requests quickly.
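As one way to troubleshoot requests from those logs, the sketch below runs a CloudWatch Logs Insights query for recent errors. The log group name is a hypothetical placeholder.

```python
# A minimal sketch, assuming the Lambda log group below (hypothetical) exists,
# that uses CloudWatch Logs Insights to surface recent error messages.
import time
import boto3

logs = boto3.client("logs")

query = logs.start_query(
    logGroupName="/aws/lambda/sd-inference-handler",  # hypothetical log group
    startTime=int(time.time()) - 3600,                # last hour
    endTime=int(time.time()),
    queryString="fields @timestamp, @message | filter @message like /ERROR/ | sort @timestamp desc | limit 20",
)

# Poll until the query completes, then print the matching log events.
while True:
    results = logs.get_query_results(queryId=query["queryId"])
    if results["status"] in ("Complete", "Failed", "Cancelled"):
        break
    time.sleep(1)

for row in results["results"]:
    print({field["field"]: field["value"] for field in row})
```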

You can use versioning in Lambda to save your function's code and configuration as you develop it. Together with aliases, you can use versioning to perform blue/green and rolling deployments. Additionally, CloudFormation templates give you separate sandbox, test, and production environments for increasing levels of operational control.
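The sketch below shows one way such a weighted blue/green shift could look, assuming a hypothetical function named "sd-inference-handler" whose "live" alias already points at a published version (routing weights cannot be used with $LATEST).

```python
# A minimal sketch of a weighted alias shift using Lambda versions and aliases;
# the function and alias names are hypothetical.
import boto3

lam = boto3.client("lambda")

# Publish the current code and configuration as an immutable version.
new_version = lam.publish_version(FunctionName="sd-inference-handler")["Version"]

# Route 10% of traffic to the new version while the alias keeps serving the
# previous one; raise the weight (or roll back) after validating the release.
lam.update_alias(
    FunctionName="sd-inference-handler",
    Name="live",
    RoutingConfig={"AdditionalVersionWeights": {new_version: 0.1}},
)
```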

Read the Operational Excellence whitepaper 

Security

API Gateway uses a resource policy to control whether a specified principal, typically an AWS Identity and Access Management (IAM) role or group, can invoke the API. All IAM policies are scoped down to the minimum permissions required for Lambda and SageMaker to function properly. By scoping API Gateway resources and IAM policies to the minimum permissions required, you limit unauthorized access to applications and resources.
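As an illustration, the sketch below attaches a resource policy that allows only one IAM role to invoke the API. The API ID, account ID, Region, and role ARN are hypothetical placeholders.

```python
# A minimal sketch of scoping API invocation to a single IAM principal;
# all identifiers below are hypothetical placeholders.
import json
import boto3

apigw = boto3.client("apigateway")
api_id = "a1b2c3d4e5"  # hypothetical REST API ID

policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"AWS": "arn:aws:iam::123456789012:role/sd-frontend-role"},
        "Action": "execute-api:Invoke",
        "Resource": f"arn:aws:execute-api:us-east-1:123456789012:{api_id}/*",
    }],
}

# Attach the policy by patching the API's /policy attribute; redeploy the
# stage afterward for the new policy to take effect.
apigw.update_rest_api(
    restApiId=api_id,
    patchOperations=[{"op": "replace", "path": "/policy", "value": json.dumps(policy)}],
)
```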

Read the Security whitepaper 

Reliability

Lambda runs functions in multiple Availability Zones to ensure that it is available to process events even during a service interruption in a single zone. Lambda also automatically retries failed asynchronous invocations, with delays between attempts.
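If you want explicit control over that retry behavior, the sketch below tunes it for asynchronous invocations; the function name is a hypothetical placeholder.

```python
# A minimal sketch of tuning Lambda's built-in async retry behavior;
# the function name is hypothetical.
import boto3

lam = boto3.client("lambda")

lam.put_function_event_invoke_config(
    FunctionName="sd-inference-handler",  # hypothetical
    MaximumRetryAttempts=2,               # Lambda retries async errors up to twice
    MaximumEventAgeInSeconds=3600,        # discard events older than one hour
)
```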

Amazon S3 provides 99.999999999% (11 nines) durability and 99.99% availability of objects over a given year, which can help you store model and data resources with high reliability.

API Gateway sets a limit on a steady-state rate and a burst of request submissions against all APIs in your account. You can configure custom throttling for your APIs. By limiting the number of requests per second or per minute, you can prevent your backend systems from being overwhelmed and maintain the reliability of your API.
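One way to configure such custom throttling is through a usage plan, sketched below; the API ID and stage name are hypothetical placeholders.

```python
# A minimal sketch of custom API throttling via a usage plan;
# the API ID and stage are hypothetical placeholders.
import boto3

apigw = boto3.client("apigateway")

apigw.create_usage_plan(
    name="sd-api-throttling",
    apiStages=[{"apiId": "a1b2c3d4e5", "stage": "prod"}],  # hypothetical
    throttle={"rateLimit": 100.0, "burstLimit": 200},       # steady-state and burst caps
)
```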

Lastly, SageMaker combined with Amazon S3 helps support your data resiliency and backup needs. SageMaker manages the underlying infrastructure required for training and deploying machine learning (ML) models, including the compute instances, storage, and networking components, helping ensure high availability and resilience.
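The sketch below shows how a training job can read its dataset from S3 and write model artifacts back to S3, where the durability guarantees above apply. The job name, image URI, role ARN, and bucket paths are hypothetical placeholders.

```python
# A minimal sketch of a SageMaker training job backed by S3 for inputs and
# model artifacts; all names and ARNs are hypothetical placeholders.
import boto3

sm = boto3.client("sagemaker")

sm.create_training_job(
    TrainingJobName="sd-finetune-001",
    AlgorithmSpecification={
        "TrainingImage": "123456789012.dkr.ecr.us-east-1.amazonaws.com/sd-train:latest",
        "TrainingInputMode": "File",
    },
    RoleArn="arn:aws:iam::123456789012:role/sd-training-role",
    InputDataConfig=[{
        "ChannelName": "training",
        "DataSource": {"S3DataSource": {
            "S3DataType": "S3Prefix",
            "S3Uri": "s3://my-bucket/datasets/finetune/",
        }},
    }],
    # Model artifacts are written back to S3 for durable storage and backup.
    OutputDataConfig={"S3OutputPath": "s3://my-bucket/models/"},
    ResourceConfig={"InstanceType": "ml.g5.2xlarge", "InstanceCount": 1, "VolumeSizeInGB": 100},
    StoppingCondition={"MaxRuntimeInSeconds": 86400},
)
```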

Read the Reliability whitepaper 

Performance Efficiency

Lambda is engineered to provide managed scaling automatically. When your function receives a request while it's processing a previous request, Lambda launches another instance of your function to handle the increased load. As traffic increases, Lambda increases the number of concurrent executions of your functions.
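That scaling is automatic, but you can put a ceiling on it per function, as sketched below; the function name is a hypothetical placeholder.

```python
# A minimal sketch of reserving concurrency so automatic scaling stays
# within a predictable ceiling; the function name is hypothetical.
import boto3

lam = boto3.client("lambda")

# Lambda scales concurrent executions automatically; reserving concurrency
# caps how far that scaling can go for this one function.
lam.put_function_concurrency(
    FunctionName="sd-inference-handler",
    ReservedConcurrentExecutions=50,
)
```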

Read the Performance Efficiency whitepaper 

Cost Optimization

Lambda uses a pay-per-use billing model, where you are billed only for the time your functions are running.
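A back-of-the-envelope sketch of that model is below. The per-request and per-GB-second prices are illustrative assumptions, not current published rates; check the Lambda pricing page for actual figures.

```python
# An illustrative cost estimate for Lambda's pay-per-use model; the prices
# below are assumptions for the sake of the arithmetic, not published rates.
requests_per_month = 1_000_000
avg_duration_s = 0.5
memory_gb = 1.0

price_per_request = 0.20 / 1_000_000  # assumed price per request
price_per_gb_second = 0.0000166667    # assumed price per GB-second

gb_seconds = requests_per_month * avg_duration_s * memory_gb
monthly_cost = requests_per_month * price_per_request + gb_seconds * price_per_gb_second
print(f"~${monthly_cost:.2f}/month")  # cost tracks usage, not idle servers
```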

SageMaker manages your ML infrastructure by automatically provisioning and scaling compute resources according to workload requirements.
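For real-time endpoints, that scaling is configured through Application Auto Scaling, as sketched below; the endpoint and variant names are hypothetical placeholders.

```python
# A minimal sketch of scaling a SageMaker endpoint variant with demand;
# the endpoint and variant names are hypothetical.
import boto3

aas = boto3.client("application-autoscaling")
resource_id = "endpoint/sd-inference/variant/AllTraffic"  # hypothetical

aas.register_scalable_target(
    ServiceNamespace="sagemaker",
    ResourceId=resource_id,
    ScalableDimension="sagemaker:variant:DesiredInstanceCount",
    MinCapacity=1,
    MaxCapacity=4,
)

# Track invocations per instance so capacity follows the inference workload.
aas.put_scaling_policy(
    PolicyName="sd-invocations-target",
    ServiceNamespace="sagemaker",
    ResourceId=resource_id,
    ScalableDimension="sagemaker:variant:DesiredInstanceCount",
    PolicyType="TargetTrackingScaling",
    TargetTrackingScalingPolicyConfiguration={
        "TargetValue": 70.0,
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "SageMakerVariantInvocationsPerInstance"
        },
    },
)
```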

Read the Cost Optimization whitepaper 

Sustainability

Lambda is a serverless computing service, which means you don't have to provision or manage servers. It automatically scales your code in response to incoming events, and you only pay for the compute time used. This serverless architecture eliminates the need for idle servers, resulting in reduced energy consumption compared to traditional server-based architectures.

With Lambda, you can optimize the utilization of computing resources, allowing you to break down your application into individual functions that can be independently scaled. This fine-grained scaling enables efficient resource allocation, as you only allocate resources to specific functions when they are actively processing requests. It eliminates the need to over-provision resources, leading to better resource utilization and reduced waste.

Read the Sustainability whitepaper 

Disclaimer

The sample code; software libraries; command line tools; proofs of concept; templates; or other related technology (including any of the foregoing that are provided by our personnel) is provided to you as AWS Content under the AWS Customer Agreement, or the relevant written agreement between you and AWS (whichever applies). You should not use this AWS Content in your production accounts, or on production or other critical data. You are responsible for testing, securing, and optimizing the AWS Content, such as sample code, as appropriate for production grade use based on your specific quality control practices and standards. Deploying AWS Content may incur AWS charges for creating or using AWS chargeable resources, such as running Amazon EC2 instances or using Amazon S3 storage.