Amazon SageMaker Catalog
Discover, govern, and collaborate on data and AI securely
Overview
Amazon SageMaker Catalog simplifies discovery, governance, and collaboration for data and AI across your structured and unstructured data, AI models, business intelligence dashboards, and applications. You can securely discover and access approved data and models using semantic search with generative AI-created metadata, or ask Amazon Q Developer in natural language to find your data. You can define and enforce access policies consistently using a single permission model with fine-grained access controls, managed centrally in Amazon SageMaker Unified Studio. Share and collaborate on data and AI assets through publishing and subscribing workflows. Build trust throughout your organization with data quality monitoring, data classification, and end-to-end automated column-level lineage for data and AI assets.
Benefits
Discover your data and AI assets at scale with SageMaker Catalog, built on Amazon DataZone. Enhance data discovery with generative AI that automatically enriches your data and metadata with business context, making it easier for all users to find, understand, and use data. Share your data, AI models, prompts, and generative AI assets, filtering by table and column names or business glossary terms. Get automatic recommendations of valuable columns and relevant analytical applications for each dataset, so you can use the right data to quickly build the right models. Support both centralized and decentralized governance models with data and AI sharing through publishing and subscribing workflows, in a single experience through Projects.
Features
Data and AI Catalog
Discover, govern, and collaborate on structured data, unstructured data, AI models, BI dashboards, and applications from a single catalog.
Business Glossary
Standardize terminology with shared business definitions and customizable metadata forms. Support restricted classification terms to enforce consistent tagging of sensitive data and enable downstream governance workflows.
Data Lineage
Track how data moves and changes across systems. OpenLineage-compatible lineage helps users understand origins, transformations, and consumption patterns to improve trust, debugging, and governance.
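As a rough illustration of what an OpenLineage-style lineage record looks like, the Python sketch below builds a simplified run event as a plain dictionary. The field names (eventType, eventTime, run, job, inputs, outputs, producer) follow the public OpenLineage event model, but the helper build_run_event, the namespaces, the dataset names, and the producer URL are hypothetical, and the sketch leaves out fields the full specification requires, such as schemaURL. How SageMaker Catalog ingests such events is not described on this page, so this shows only the event shape.

```python
import json
import uuid
from datetime import datetime, timezone


def build_run_event(event_type, job_namespace, job_name, inputs, outputs, run_id=None):
    """Build a simplified OpenLineage-style run event as a plain dict.

    Hypothetical helper for illustration only; it is not part of any AWS SDK.
    """
    def dataset(namespace, name):
        # A dataset is identified by a namespace plus a name.
        return {"namespace": namespace, "name": name}

    return {
        "eventType": event_type,  # for example "START" or "COMPLETE"
        "eventTime": datetime.now(timezone.utc).isoformat(),
        "run": {"runId": run_id or str(uuid.uuid4())},
        "job": {"namespace": job_namespace, "name": job_name},
        "inputs": [dataset(ns, n) for ns, n in inputs],
        "outputs": [dataset(ns, n) for ns, n in outputs],
        # Placeholder producer URI (assumption, not a real service).
        "producer": "https://example.com/hypothetical-producer",
    }


# Example: a job that reads one table and writes an aggregated table.
event = build_run_event(
    "COMPLETE",
    job_namespace="example-pipeline",
    job_name="daily_orders_rollup",
    inputs=[("example-warehouse", "sales.orders")],
    outputs=[("example-warehouse", "sales.orders_daily")],
)
print(json.dumps(event, indent=2))
```

Linking input and output datasets through a job and a run is what lets a lineage view trace origins, transformations, and downstream consumption.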
Data Quality Monitoring
View data quality metrics from AWS and third-party tools. Consumers gain trust and context when searching, while data teams can integrate external quality signals through APIs into a unified portal.
Data Discovery
Enrich technical metadata with business context so users can quickly find, understand, and trust the data they use.
Automated Metadata Recommendations
Use LLM-powered automation to generate business-friendly names and descriptions, improving context, consistency, and clarity of technical assets.
Semantic Search
Find data and models using natural language queries. Semantic search understands user intent, context, and relationships - not just keywords - to return more relevant results.
BI Dashboards
Go from data to insights by combining data in SageMaker with Amazon Quick Suite capabilities such as interactive dashboards, pixel-perfect reports, and generative business intelligence (BI), all in a governed and automated manner.
Data Products
Package related assets into business-focused data products with shared metadata. Improve discovery, unify access requests, and reduce administrative overhead while enabling governance teams to track product-level consumption.
Customers
Natera, Inc.
“By integrating Amazon QuickSight with Amazon SageMaker, our lab operations teams and scientists can now monitor clinical test performance across all sites in real time. We’ve developed unified dashboards that consolidate throughput, quality control metrics, and turnaround times, enabling detailed trend analysis and ongoing performance optimization. Scientists can now perform comprehensive data analysis – from exploratory review to model development – all within a single, integrated environment.”
Mirko Buholzer, VP of Software Engineering, Natera, Inc.
Cisco
"You want to discover, share, and govern your data. Whether you call it a data mesh or a data fabric, data exists across different teams in multiple silos, and you need a way to bring it together. Amazon SageMaker Catalog connects data producers and consumers, enabling producers to share data with built-in controls and data contracts while allowing consumers to access the data using the tools of their choice."
Shaja Arul Selvamani, Sr. Director AI/ML, Cisco
NatWest
"Our Data Platform Engineering team has been deploying multiple end-user tools for data engineering, ML, SQL, and gen AI tasks. As we look to simplify processes across the bank, we’ve been looking at streamlining user authentication and data access authorization. Amazon SageMaker delivers a ready-made user experience to help us deploy one single environment across the organization, reducing the time required for our data users to access new tools by around 50%."
Zachery Anderson, CDAO, NatWest Group