Amazon SageMaker FAQs
General
What is the next generation of Amazon SageMaker?
The next generation of SageMaker is a unified platform for data, analytics, and AI. Bringing together widely adopted AWS machine learning (ML) and analytics capabilities, the next generation of SageMaker delivers an integrated experience for analytics and AI with unified access to all your data. SageMaker allows you to collaborate and build faster from a unified studio using familiar AWS services for model development, generative AI, data processing, and SQL analytics, accelerated by Amazon Q Developer, the most capable generative AI assistant for software development. Additionally, you can access all your data whether it’s stored in data lakes, data warehouses, or third-party or federated data sources, with governance built in to address enterprise security needs.
How is the new SageMaker different from what I am using today for my ML workflows?
We expanded the widely adopted SageMaker service with the comprehensive set of AWS data, analytics, and AI capabilities to deliver a unified experience of data, analytics, and AI. Going forward, the existing set of AI/ML capabilities in SageMaker for data wrangling, building, training, and deploying AI models will be referred to as Amazon SageMaker AI. SageMaker AI is integrated within the next generation of SageMaker and is also available as a standalone service for those who wish to focus specifically on building, training, and deploying AI and ML models at scale.
The next generation SageMaker includes:
- Amazon SageMaker Unified Studio: Build in a single development environment to access and use familiar tools and functionality from purpose-built AWS analytics and AI/ML services like Amazon EMR, AWS Glue, Amazon Athena, Amazon Redshift, Amazon Bedrock, and SageMaker AI.
- Amazon SageMaker Data and AI Governance: Securely discover, govern, and collaborate on data and AI with Amazon SageMaker Catalog, built on Amazon DataZone.
Amazon SageMaker is built on an open lakehouse architecture, fully compatible with Apache Iceberg. It unifies all your data across Amazon S3 data lakes, Amazon Redshift data warehouses, and third-party and federated data sources.
What capabilities are included with the next generation of SageMaker?
The next generation of SageMaker includes the following capabilities:
- SageMaker Unified Studio: Build with all your data and tools for analytics and AI in a single environment.
- SageMaker Data and AI Governance: Securely discover, govern, and collaborate on data and AI with SageMaker Catalog, built on Amazon DataZone.
- Model development: Build, train, and deploy ML and foundation models (FMs) with fully managed infrastructure, tools, and workflows with SageMaker AI (formerly SageMaker).
- Generative AI app development: Build and scale generative AI applications with Amazon Bedrock.
- SQL analytics: Gain insights with Amazon Redshift, the most price-performant SQL engine.
- Data processing: Analyze, prepare, and integrate data for analytics and AI using open source frameworks on Athena, Amazon EMR, and AWS Glue.
Amazon SageMaker is built on an open lakehouse architecture, fully compatible with Apache Iceberg. It unifies all your data across Amazon S3 data lakes, Amazon Redshift data warehouses, and third-party and federated data sources.
Why should I use the next generation of SageMaker?
Bringing together widely adopted AWS ML and analytics capabilities, the next generation of SageMaker delivers an integrated experience for analytics and AI with unified access to all your data. This unified approach helps you work more efficiently with your data, increase collaboration across teams, and enhance overall productivity.
SageMaker allows you to:
- Collaborate and build faster with a single data and AI development environment, using familiar AWS services for model development, generative AI, data processing, and SQL analytics.
- Develop and scale your AI use cases with a broad set of tools to train, customize, and deploy ML and FMs, and rapidly create generative AI applications tailored to your business.
- Reduce data silos with an open lakehouse to unify all your data across Amazon S3 data lakes, Amazon Redshift data warehouses, and third-party or federated data sources.
- Meet your enterprise security needs with built-in data and AI governance to control access to the right data, ML models, generative AI development artifacts, and compute resources, by the right user for the right purpose.
Can I use individual AWS services without using SageMaker?
Yes. You can continue to independently use individual AWS services such as SageMaker AI (formerly SageMaker), Amazon EMR for big data processing, AWS Glue, and Amazon Redshift for data warehousing based on your specific business requirements. There is no impact on how you use your individual services today.
SageMaker offers an additional benefit by providing a unified, user-friendly interface that enables access to these services. This approach helps you more effectively innovate with your data, increase collaboration across teams, and enhance overall productivity.
What existing AWS services can I use within SageMaker?
SageMaker brings together a comprehensive set of AWS AI and analytics services across SageMaker Unified Studio, SageMaker Data and AI Governance, and an open lakehouse architecture.
From SageMaker Unified Studio, you can access capabilities for data processing, SQL analytics, ML, and generative AI application development using existing AWS services. For data processing, services like Athena, AWS Glue, Amazon EMR, and Amazon Managed Workflows for Apache Airflow (Amazon MWAA) analyze, prepare, integrate and orchestrate data for analytics and AI at any scale. For SQL Analytics, Amazon Redshift and Athena provide powerful SQL analytic capabilities on your unified data across the lakehouse. ML capabilities are delivered by SageMaker AI (previously known as SageMaker) for building, training, and deploying ML and FMs. Additionally, you can develop generative AI applications using Amazon Bedrock.
SageMaker Data and AI Governance provides end-to-end, built-in governance through a unified data management experience in SageMaker Catalog, built on Amazon DataZone, to securely discover, govern, and collaborate on data and AI.
The SageMaker lakehouse architecture is built on multiple catalog services across AWS Glue Data Catalog, AWS Lake Formation, and Amazon Redshift to provide unified data access across Amazon S3 data lakes, Amazon Redshift data warehouses, and third-party and federated data sources.
In addition, these services remain available as standalone capabilities through the AWS Management Console, giving you flexibility based on your use cases. We will enhance SageMaker with more services in 2025 to unify experiences across analytics and AI. These include search analytics with Amazon OpenSearch Service, business intelligence with Amazon QuickSight, and streaming with the AWS streaming portfolio of services.
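Because these services remain available on their own, a SQL query can also be submitted directly through the Athena API rather than through the SageMaker Unified Studio query editor. The following is a minimal sketch, not an official SageMaker workflow. It assumes a client object that behaves like `boto3.client("athena")`. The function name `run_athena_query`, the database name, and the S3 output location are placeholders chosen for illustration.

```python
import time


def run_athena_query(client, sql, database, output_location, poll_seconds=1.0):
    """Submit a SQL query to Athena and wait until it reaches a terminal state.

    `client` is assumed to behave like boto3.client("athena").
    Returns the final state string: SUCCEEDED, FAILED, or CANCELLED.
    """
    # Start the query; Athena writes result files to the given S3 location.
    started = client.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )
    query_id = started["QueryExecutionId"]

    # Poll the execution status until the query finishes one way or another.
    while True:
        execution = client.get_query_execution(QueryExecutionId=query_id)
        state = execution["QueryExecution"]["Status"]["State"]
        if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
            return state
        time.sleep(poll_seconds)


# Example call (requires boto3 and AWS credentials; all names are placeholders):
#   import boto3
#   athena = boto3.client("athena", region_name="us-east-1")
#   run_athena_query(athena, "SELECT 1", "my_database",
#                    "s3://my-bucket/athena-results/")
```

Passing the client in as an argument keeps the function free of credentials and Region settings, so the same code works with whatever session configuration you already use.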
How do I get started with SageMaker?
Getting started with SageMaker is easy. The first step is to navigate to the SageMaker Unified Studio management console to create a domain, the organizing entity for connecting your assets, users, and their projects for your business unit. In the console, choose Create domain, and you will be presented with two options: Quick setup and Manual setup. Choose Quick setup to get started with a set of default configurations that can be customized later. Alternatively, you can choose Manual setup, which gives you full control over your settings as you create your domain. Once your domain is created, you can navigate to the SageMaker Unified Studio (a browser-based web application) where you can use all your data and configured tools for analytics and AI. To learn more about how to get started, visit the SageMaker documentation.
I currently use existing AWS services that are now included in SageMaker. How do I upgrade to the unified experience in SageMaker?
Your existing data development experiences in AWS services like Amazon EMR, AWS Glue, and Athena remain available. This means all existing code and resources you've created can continue to be used without disruption. We will provide easy-to-use upgrade scripts and comprehensive guidelines to bring your existing code base to the unified SageMaker experience in Q1 2025.
For which compliance programs is the next generation of Amazon SageMaker in scope?
Amazon SageMaker Unified Studio and SageMaker Catalog are built on Amazon DataZone (using the same back-end entity store/database, identity and access mechanisms, and APIs) and are therefore included in the scope of all of the same compliance programs as Amazon DataZone. Please refer to the list of Services in Scope by Compliance Program to view the programs for which Amazon DataZone is in scope. This includes SOC, certain ISO certifications, PCI DSS, and HITRUST CSF. Amazon DataZone is also included in the list of HIPAA eligible services.
Product experience
What is a project in SageMaker?
A project entity in SageMaker helps users organize their work and provide business context for the jobs they are performing. It provides a shared workspace where users can collaborate on data and artifacts such as ML models, notebooks, queries, dashboards, and generative AI applications. Projects are secured so that only users who are explicitly added to the project are able to access the data and tools within it. The project creates AWS Identity and Access Management (IAM) roles based on the project-selected capabilities (for example, a data lake) that provide users with the access required to do their job. Projects also provide work isolation within the same account, as well as a security boundary (security group and IAM roles).
How does Amazon Q Developer enhance productivity in SageMaker?
Amazon Q Developer is a generative AI conversational assistant integrated into the SageMaker experience that enhances your productivity throughout the development lifecycle. Through a chat interface, you can use natural language to ask questions about SageMaker, get help with code, and explore resources such as datasets. When you chat with Amazon Q Developer, it uses the context of your current conversation to provide personalized guidance and automated assistance throughout the SageMaker development experience. Amazon Q Developer can help you with code discussions, provide inline code completions, generate SQL queries, find and integrate datasets, and offer intelligent support tailored to your specific development needs.
By understanding the nuances of your work, Amazon Q Developer delivers targeted, context-aware assistance that streamlines your development process and enhances overall productivity in the SageMaker environment.
What tools are available in SageMaker for analytics and AI jobs?
SageMaker provides a unified, web-based environment that brings together powerful tools for complete data and AI workflows. Built-in IDEs enable AI/ML development, allowing you to process large data volumes from various sources using frameworks and services like PySpark, AWS Glue, and Amazon EMR.
For version control and workflow management, you can commit to Git and define workflows using Amazon MWAA. The integrated SQL query editor allows you to explore, analyze, and visualize data, with the ability to more easily save and share queries and create new datasets.
Model development is streamlined through familiar SageMaker AI tools, including Amazon SageMaker notebooks, JumpStart, HyperPod, MLflow, Pipelines, and Model Registry. Throughout these processes, Amazon Q Developer is seamlessly integrated across SageMaker tools, providing intelligent assistance in data discovery, preparation, pipeline creation, model building and training, and code deployment.
How do I build generative AI applications in SageMaker?
The Amazon Bedrock IDE, integrated within SageMaker Unified Studio, provides a comprehensive environment for developing generative AI applications. This intuitive interface helps you accelerate application development in a trusted and secure setting, offering access to the high-performing FMs and advanced customization capabilities of Amazon Bedrock.
You can use powerful features such as Amazon Bedrock Knowledge Bases, Guardrails, Agents, and Prompt Flows, allowing your team to rapidly tailor generative AI applications to your specific business needs while adhering to your responsible AI guidelines. SageMaker supports governed access and enables secure cross-functional collaboration through access-controlled sharing and Git-backed auditability.
What types of data sources does SageMaker support?
The lakehouse architecture of SageMaker unifies data across AWS data lakes, data warehouses, third-party applications, and operational databases. It gives you fast, streamlined access to your data in one place through zero-ETL integrations, federated query sources, and 240+ connectors.
How do I ensure that the data in SageMaker is properly governed and secured?
SageMaker provides end-to-end, built-in governance through a unified data management experience in SageMaker Catalog, built on Amazon DataZone. This approach enables you to catalog, discover, access, analyze, and govern both structured and unstructured data assets, ML models, and applications across your organization. SageMaker ensures that the right people have the appropriate access to the right assets, maintaining robust security and compliance standards.
How do I create and manage data pipelines in SageMaker?
You can create and manage data pipelines in SageMaker in multiple ways. Amazon SageMaker Data Processing brings together Amazon EMR, Athena, AWS Glue, and Amazon MWAA to help you integrate, prepare, and explore your data in a unified experience. You can build pipelines for ML-specific model orchestration with SageMaker AI and data pipelines and workflows with Amazon MWAA. You can also use zero-ETL integrations, which simplify data movement by removing complex extract, transform, and load (ETL) processes and enabling direct data replication across services. Visit What is zero-ETL? to learn more.
Pricing
How does SageMaker pricing work?
When using SageMaker, you will be charged as per the pricing model for the various AWS services accessible through SageMaker. There is no separate cost for using the SageMaker Unified Studio, the data and AI development environment that provides the integrated experience within SageMaker. Visit SageMaker pricing for more information.
Can I try SageMaker for free?
The SageMaker Free Tier helps you quickly get started innovating with data and AI at no cost. Refer to SageMaker pricing for details.
Availability
In which AWS Regions is SageMaker available?
The next generation of SageMaker is available in the US East (N. Virginia), US East (Ohio), US West (Oregon), Asia Pacific (Seoul), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), Canada (Central), Europe (Frankfurt), Europe (Ireland), Europe (London), Europe (Paris), and South America (Sao Paulo) AWS Regions. For future updates, see the AWS Regional Services List.
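When a script needs to confirm that a target Region appears in the list above, a simple lookup is enough. The following is a minimal sketch. The Region codes are the standard AWS codes for the Region names listed in this answer, and the names `NEXT_GEN_SAGEMAKER_REGIONS` and `is_region_listed` are illustrative only. The list reflects this FAQ at the time of writing, so check the AWS Regional Services List for current availability.

```python
# Standard AWS Region codes mapped to the Region names listed in this FAQ answer.
# This is a snapshot of the list above, not a live availability check.
NEXT_GEN_SAGEMAKER_REGIONS = {
    "us-east-1": "US East (N. Virginia)",
    "us-east-2": "US East (Ohio)",
    "us-west-2": "US West (Oregon)",
    "ap-northeast-2": "Asia Pacific (Seoul)",
    "ap-southeast-1": "Asia Pacific (Singapore)",
    "ap-southeast-2": "Asia Pacific (Sydney)",
    "ap-northeast-1": "Asia Pacific (Tokyo)",
    "ca-central-1": "Canada (Central)",
    "eu-central-1": "Europe (Frankfurt)",
    "eu-west-1": "Europe (Ireland)",
    "eu-west-2": "Europe (London)",
    "eu-west-3": "Europe (Paris)",
    "sa-east-1": "South America (Sao Paulo)",
}


def is_region_listed(region_code):
    """Return True if the Region code appears in the snapshot above."""
    return region_code.strip().lower() in NEXT_GEN_SAGEMAKER_REGIONS


if __name__ == "__main__":
    for code in ("eu-west-1", "us-west-1"):
        status = "listed" if is_region_listed(code) else "not listed"
        print(f"{code}: {status}")
```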
Does SageMaker offer an SLA?
Yes. SageMaker is engineered to provide the consistent performance and uptime that mission-critical analytics and AI workloads demand. As a unified platform composed of multiple service components, its availability is tied to the service component used.
For detailed information on the service level agreements (SLAs) for each individual service, refer to its respective SLA documentation. SLAs will provide you with the specific uptime guarantees and reliability commitments for the various services that make up the SageMaker experience.
Available SLA documentation includes: