Artificial Intelligence

Category: Management Tools

Implement a secure MLOps platform based on Terraform and GitHub

Machine learning operations (MLOps) is the combination of people, processes, and technology to productionize ML use cases efficiently. To achieve this, enterprise customers must develop MLOps platforms to support reproducibility, robustness, and end-to-end observability of the ML use case’s lifecycle. Those platforms are based on a multi-account setup by adopting strict security constraints, development best […]

Monitor Amazon Bedrock batch inference using Amazon CloudWatch metrics

In this post, we explore how to monitor and manage Amazon Bedrock batch inference jobs using Amazon CloudWatch metrics, alarms, and dashboards to optimize performance, cost, and operational efficiency.

Create a private workforce on Amazon SageMaker Ground Truth with the AWS CDK

In this post, we present a complete solution for programmatically creating private workforces on Amazon SageMaker AI using the AWS Cloud Development Kit (AWS CDK), including the setup of a dedicated, fully configured Amazon Cognito user pool.

Accelerate foundation model development with one-click observability in Amazon SageMaker HyperPod

With a one-click installation of the Amazon Elastic Kubernetes Service (Amazon EKS) add-on for SageMaker HyperPod observability, you can consolidate health and performance data from NVIDIA DCGM, instance-level Kubernetes node exporters, Elastic Fabric Adapter (EFA), integrated file systems, Kubernetes APIs, Kueue, and SageMaker HyperPod task operators. In this post, we walk you through installing and using the unified dashboards of the out-of-the-box observability feature in SageMaker HyperPod. We cover the one-click installation from the Amazon SageMaker AI console, navigating the dashboard and metrics it consolidates, and advanced topics such as setting up custom alerts.

Advancing AI agent governance with Boomi and AWS: A unified approach to observability and compliance

In this post, we share how Boomi partnered with AWS to help enterprises accelerate and scale AI adoption with confidence using Agent Control Tower.

Get faster and actionable AWS Trusted Advisor insights to make data-driven decisions using Amazon Q Business

In this post, we show how to create an application using Amazon Q Business with Jira integration that uses a dataset containing a detailed Trusted Advisor report. This solution demonstrates how to use new generative AI services like Amazon Q Business to get data insights faster and make them actionable.

Build a FinOps agent using Amazon Bedrock with multi-agent capability and Amazon Nova as the foundation model

In this post, we use the multi-agent feature of Amazon Bedrock to demonstrate a powerful and innovative approach to AWS cost management. By using the advanced capabilities of Amazon Nova FMs, we’ve developed a solution that showcases how AI-driven agents can revolutionize the way organizations analyze, optimize, and manage their AWS costs.

Enable Amazon Bedrock cross-Region inference in multi-account environments

In this post, we explore how to modify your Regional access controls to specifically allow Amazon Bedrock cross-Region inference while maintaining broader Regional restrictions for other AWS services. We provide practical examples for both SCP modifications and AWS Control Tower implementations.
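
As a rough illustration of the kind of SCP modification described above, the following Python sketch builds a Region-restriction policy that carves out Amazon Bedrock inference calls. It is a minimal sketch under stated assumptions, not the policy from the post: the function name, the statement IDs, the example Regions, and the list of exempted global services are all hypothetical, and the two-statement layout is just one way to keep broader Regional restrictions in place while allowing inference in additional Regions.

```python
import json


def build_region_restriction_scp(allowed_regions, inference_regions):
    """Return an SCP document (as a dict) that restricts Regions.

    Hypothetical helper for illustration only. Statement 1 denies requests
    outside `allowed_regions`, except for a few global services and Bedrock
    inference actions. Statement 2 denies Bedrock inference outside the
    union of `allowed_regions` and `inference_regions`.
    """
    # Bedrock runtime actions used to invoke models.
    bedrock_actions = [
        "bedrock:InvokeModel",
        "bedrock:InvokeModelWithResponseStream",
    ]
    # Illustrative exemptions for global services; adjust for a real setup.
    global_service_actions = ["iam:*", "organizations:*", "sts:*", "support:*"]

    bedrock_regions = sorted(set(allowed_regions) | set(inference_regions))

    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyOutsideAllowedRegions",
                "Effect": "Deny",
                "NotAction": global_service_actions + bedrock_actions,
                "Resource": "*",
                "Condition": {
                    "StringNotEquals": {
                        "aws:RequestedRegion": sorted(allowed_regions)
                    }
                },
            },
            {
                "Sid": "DenyBedrockOutsideInferenceRegions",
                "Effect": "Deny",
                "Action": bedrock_actions,
                "Resource": "*",
                "Condition": {
                    "StringNotEquals": {"aws:RequestedRegion": bedrock_regions}
                },
            },
        ],
    }


if __name__ == "__main__":
    # Example Regions are assumptions chosen only for demonstration.
    scp = build_region_restriction_scp(
        allowed_regions=["eu-west-1"],
        inference_regions=["eu-central-1", "eu-west-3"],
    )
    print(json.dumps(scp, indent=2))
```

Splitting the policy into two Deny statements keeps the general Region restriction unchanged for other services while giving Bedrock inference its own, wider Region list.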

Innovating at speed: BMW’s generative AI solution for cloud incident analysis

In this post, we explain how BMW uses generative AI to speed up the root cause analysis of incidents in complex, distributed cloud systems, such as BMW’s Connected Vehicle backend serving 23 million vehicles. Read on to learn how the solution, collaboratively pioneered by AWS and BMW, uses Amazon Bedrock Agents and Amazon CloudWatch logs and metrics to find root causes more quickly. This post is intended for cloud solution architects and developers interested in speeding up their incident workflows.

Accelerate IaC troubleshooting with Amazon Bedrock Agents

This post demonstrates how Amazon Bedrock Agents, combined with action groups and generative AI models, streamlines and accelerates the resolution of Terraform errors while maintaining compliance with environment security and operational guidelines.