AWS Machine Learning Blog

Category: Amazon SageMaker

RAG - Retrieval Augmented Generation

Build a contextual chatbot for financial services using Amazon SageMaker JumpStart, Llama 2 and Amazon OpenSearch Serverless with Vector Engine

The financial services (FinServ) industry has unique generative AI requirements related to domain-specific data, data security, regulatory controls, and industry compliance standards. In addition, customers want to be able to choose the most performant and cost-effective machine learning (ML) model and to customize it (fine-tuning) to fit their business use cases. Amazon […]

Build well-architected IDP solutions with a custom lens – Part 1: Operational excellence

The IDP Well-Architected Lens is intended for all AWS customers who use AWS to run intelligent document processing (IDP) solutions and are searching for guidance on how to build secure, efficient, and reliable IDP solutions on AWS. Building a production-ready solution in the cloud involves a series of trade-offs between resources, time, customer expectation, and […]

Build well-architected IDP solutions with a custom lens – Part 3: Reliability

The IDP Well-Architected Custom Lens is intended for all AWS customers who use AWS to run intelligent document processing (IDP) solutions and are searching for guidance on how to build a secure, efficient, and reliable IDP solution on AWS. Building a production-ready solution in the cloud involves a series of trade-offs between resources, time, customer […]

Build well-architected IDP solutions with a custom lens – Part 4: Performance efficiency

When a customer has a production-ready intelligent document processing (IDP) workload, we often receive requests for a Well-Architected review. To build an enterprise solution, developer resources, cost, time, and user experience have to be balanced to achieve the desired business outcome. The AWS Well-Architected Framework provides a systematic way for organizations to learn operational and architectural […]

Build well-architected IDP solutions with a custom lens – Part 5: Cost optimization

Building a production-ready solution in the cloud involves a series of trade-offs between resources, time, customer expectation, and business outcome. The AWS Well-Architected Framework helps you understand the benefits and risks of decisions you make while building workloads on AWS. An intelligent document processing (IDP) project usually combines optical character recognition (OCR) and natural language […]

How Amazon Search M5 saved 30% for LLM training cost by using AWS Trainium

For decades, Amazon has pioneered and innovated machine learning (ML), bringing delightful experiences to its customers. From the earliest days, Amazon has used ML for various use cases such as book recommendations, search, and fraud detection. Similar to the rest of the industry, the advancements of accelerated hardware have allowed Amazon teams to pursue model […]

How Amazon Music uses SageMaker with NVIDIA to optimize ML training and inference performance and cost

In the dynamic world of streaming on Amazon Music, every search for a song, podcast, or playlist holds a story, a mood, or a flood of emotions waiting to be unveiled. These searches serve as a gateway to new discoveries, cherished experiences, and lasting memories. The search bar is not just about finding a song; […]

Machine Learning with MATLAB and Amazon SageMaker

This post is written in collaboration with Brad Duncan, Rachel Johnson and Richard Alcock from MathWorks. MATLAB is a popular programming tool for a wide range of applications, such as data processing, parallel computing, automation, simulation, machine learning, and artificial intelligence. It’s heavily used in many industries such as automotive, aerospace, communication, and manufacturing. In […]

Text embedding and sentence similarity retrieval at scale with Amazon SageMaker JumpStart

In this post, we demonstrate how to use the SageMaker Python SDK for text embedding and sentence similarity. Sentence similarity involves assessing the likeness between two pieces of text after the LLM converts them into embeddings, a foundational step for applications like Retrieval Augmented Generation (RAG).
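
To make the idea of sentence similarity concrete, here is a minimal sketch of comparing embeddings with cosine similarity, one common similarity measure. It is not taken from the post and does not call the SageMaker Python SDK: the three-dimensional vectors are made-up stand-ins for real model embeddings, and the helper names `cosine_similarity` and `most_similar` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # A zero vector has no direction; treat it as dissimilar to everything.
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(query, corpus):
    """Return the index of the corpus vector closest to the query."""
    return max(range(len(corpus)), key=lambda i: cosine_similarity(query, corpus[i]))


# Toy embeddings (assumed values, for illustration only). In practice these
# would come from an embedding model endpoint.
corpus_embeddings = [
    [0.9, 0.1, 0.0],  # hypothetical sentence 0
    [0.0, 1.0, 0.0],  # hypothetical sentence 1
    [0.1, 0.0, 0.9],  # hypothetical sentence 2
]
query_embedding = [0.8, 0.2, 0.1]

best = most_similar(query_embedding, corpus_embeddings)
print("closest sentence index:", best)
```

In a RAG setup, the sentence at the returned index would be the passage handed to the LLM as context. At scale, a vector index usually replaces this linear scan.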

Use Amazon SageMaker Studio to build a RAG question answering solution with Llama 2, LangChain, and Pinecone for fast experimentation

Retrieval Augmented Generation (RAG) allows you to provide a large language model (LLM) with access to data from external knowledge sources such as repositories, databases, and APIs without the need to fine-tune it. When using generative AI for question answering, RAG enables LLMs to answer questions with the most relevant, up-to-date information and optionally cite […]