    IBM Cloud Pak for Data

    Deployed on AWS
    IBM Cloud Pak® for Data is a unified data and AI platform that runs on any cloud. Utilize a data fabric to automatically break down data silos, improve data quality and enhance data privacy and security. Build and infuse trustworthy AI across your business to drive digital transformation.
    4.2

    Overview

    For more information or customized pricing, please email us: cpd_on_aws@wwpdl.vnet.ibm.com 

    IBM Cloud Pak for Data is a unified data and AI platform that connects the right data, at the right time, to the right people anywhere. Available on AWS and running on Red Hat OpenShift, the platform simplifies data access, automates data discovery and curation, and safeguards sensitive information by automating policy enforcement for all users in your organization. Make better data-driven decisions and lay the foundation for AI with a data fabric that connects siloed data on premises or across multiple clouds without data movement. Discover actionable insights and apply trusted data to build, run, automate and manage AI models.

    Outcomes:

    • Data access and availability: Eliminate data silos and simplify your data landscape to enable faster, cost-effective extraction of value from your data.
    • Data quality and governance: Apply governance solutions and methodologies to deliver trusted business data.
    • Data privacy and security: Fully understand and manage sensitive data with a pervasive privacy framework.
    • Batch data integration: Design, develop and run jobs that move and transform data with powerful automated integration capabilities.
    • 360 entity data: Enable agility and accelerated ROI for consolidated and governed views of critical enterprise data.

    Product Version 4.7.x

    Standard Min: 48 VPCs
    Enterprise Min: 72 VPCs

    Already have a CP4D License? Deploy from the BYOL Listing today!

    Highlights

    • Deliver data responsibly with a data fabric. Unify and access disparate data with AutoSQL, a universal query engine. Discover and classify data in real time with Watson Knowledge Catalog. Protect sensitive data with automated policy enforcement.
    • Scale trustworthy AI: Synchronize application and model pipelines while reducing drift, bias, and risk with ModelOps on Watson Studio. Monitor and govern AI models to meet regulations, manage risk and enhance transparency.
    • Recognized by analysts as a Leader in core data and AI segments: The Forrester Wave™: Machine Learning Data Catalogs, Q4 2020; 2021 Gartner Magic Quadrant for Data Science and Machine Learning; The Forrester Wave™: Multimodal Predictive Analytics and Machine Learning, Q3 2020.

    Details

    Delivery method

    Deployed on AWS

    Features and programs

    Buyer guide

    Gain valuable insights from real users who purchased this product, powered by PeerSpot.

    Financing for AWS Marketplace purchases

    AWS Marketplace now accepts line of credit payments through the PNC Vendor Finance program. This program is available to select AWS customers in the US, excluding NV, NC, ND, TN, & VT.

    Pricing

    IBM Cloud Pak for Data

    Pricing is based on the duration and terms of your contract with the vendor. This entitles you to a specified quantity of use for the contract duration. If you choose not to renew or replace your contract before it ends, access to these entitlements will expire.
    Additional AWS infrastructure costs may apply. Use the AWS Pricing Calculator to estimate your infrastructure costs.

    1-month contract (2)

    Dimension          Description                                      Cost/month
    Standard Option    Cloud Pak for Data Standard Option: 48 VPCs      $19,824.00
    Enterprise Option  Cloud Pak for Data Enterprise Option: 72 VPCs    $59,400.00

    Vendor refund policy

    Please contact your rep for any questions.

    Legal

    Vendor terms and conditions

    Upon subscribing to this product, you must acknowledge and agree to the terms and conditions outlined in the vendor's End User License Agreement (EULA).

    Content disclaimer

    Vendors are responsible for their product descriptions and other product content. AWS does not warrant that vendors' product descriptions or other product content are accurate, complete, reliable, current, or error-free.

    Usage information

    Delivery details

    Software as a Service (SaaS)

    SaaS delivers cloud-based software applications directly to customers over the internet. You can access these applications through a subscription model. You will pay recurring monthly usage fees through your AWS bill, while AWS handles deployment and infrastructure management, ensuring scalability, reliability, and seamless integration with other AWS services.

    Support

    Vendor support

    AWS infrastructure support

    AWS Support is a one-on-one, fast-response support channel that is staffed 24x7x365 with experienced and technical support engineers. The service helps customers of all sizes and technical abilities to successfully utilize the products and features provided by Amazon Web Services.

    Product comparison

    Updated weekly

    Accolades

    Top 50 in Data Preparation
    Top 10 in Data Catalogs, Data Governance, Master Data Management
    Top 10 in Data Catalogs, Data Governance

    Overview

    AI generated from product descriptions
    Data Fabric Architecture
    Unified platform that connects data across multiple clouds and on-premises environments without data movement
    Universal Query Engine
    AutoSQL technology enabling comprehensive data querying across disparate data sources
    Data Governance Framework
    Automated policy enforcement and real-time data discovery and classification capabilities
    AI Model Management
    ModelOps system for synchronizing application and model pipelines with drift and bias reduction
    Data Integration Capabilities
    Powerful automated integration tools for designing, developing, and running data transformation and movement jobs
    Metadata Management
    Centralized metadata aggregation from multiple disparate data sources with unified platform capabilities
    Behavioral Analysis
    Advanced engine for analyzing data usage patterns, interactions, and metadata insights
    AI Governance Framework
    Comprehensive system for tracking data lineage, ensuring data quality, transparency, and compliance for AI models
    Search and Discovery
    Automated platform enabling comprehensive search, description, and understanding of data assets including reports and models
    Architectural Extensibility
    Open and flexible architecture supporting integration across different data environments and platforms
    AI Governance
    Provides active metadata-driven governance framework for AI strategies with rules, processes, and responsibilities to mitigate risks and ensure ethical AI practices
    Data Catalog Management
    Enables comprehensive discovery and understanding of data assets across hybrid and multi-cloud environments with full business context and metadata insights
    Automated Data Lineage
    Offers end-to-end lineage tracking with complete transparency into data transformation and flow across systems, including summary-level and technical lineage details
    Privacy Workflow Automation
    Centralizes and automates privacy workflows to address global regulatory requirements and encourage collaborative data protection
    Data Quality Management
    Replaces manual processes with automated data monitoring and rule management to scale data quality across enterprise environments

    Contract

    Standard contract
    No
    No
    No

    Customer reviews

    Ratings and reviews

    4.2
    93 ratings
    5 star: 25%
    4 star: 54%
    3 star: 18%
    2 star: 3%
    1 star: 0%
    1 AWS review | 92 external reviews
    External reviews are from G2 and PeerSpot.
    ArchanaSingh

    Collaborative data platform has transformed analytics and now drives faster decisions

    Reviewed on Jan 26, 2026
    Review from a verified AWS customer

    What is our primary use case?

    IBM Cloud Pak for Data is a powerful cloud-native, all-in-one, easy-to-use solution that enables us to put data to work quickly and effectively. This tool enables us to approach analytics our way with code, low-code, and no-code options that allow us to collaborate on one platform easily.

    It is easy to transform structured and unstructured data into analytics insights, where we use those insights to make data-driven decisions easily. We build and test models with best-in-class AI analytics.

    As a team, we connect IBM Cloud Pak for Data with our product for integration with cloud, and it enables all of our data users to collaborate from a single, unified interface that supports many services that are designed to work together. This helps to give our users recommendations as they want to store the data in the cloud.

    Regarding my main use case, it is a very great tool that enables our users to collaborate by using single data that is stored in a centralized, unified place where they can access it at any time. It also helps to drive business productivity by reducing the time spent reading and analyzing data. We use this tool in all departments that need to gather relevant information in the cloud from a single centralized platform for better reporting of data. It is possible to analyze data from many sources in a short period of time.

    What is most valuable?

    The best features IBM Cloud Pak for Data offers include robust data visualization, centralized data analytics, data reliability, and compatibility with hybrid and multi-cloud environments.

    The compatibility with hybrid and multi-cloud environments has helped our organization as data visualization is very simple. Migrations, reading, analysis, and data management from other sources are performed without problems of requirements. We have a team of experts in IBM Cloud Pak for Data to maintain security and correct data management easily.

    I find this platform excellent for visualizing and managing data across networks and for fast data storage, which makes our work less complex and improves productivity in my organization. Everything is managed in multiple environments without any problem.

    IBM Cloud Pak for Data has positively impacted my organization, and I have noticed some improvement since we started using this tool. Configuration with hybrid and multi-cloud environments has been very seamless and easy. It is a robust platform capable of working with multiple data sources where we gain insights to make data-driven decisions easily. It automates data analysis for quick and better performance. We have seen improvements in analysis and data correction from multiple sources. Our productivity in the company is growing, thanks to the data analysis team. We have also seen a robust hybrid and multi-cloud access system working seamlessly.

    I can share specific outcomes that show how productivity has grown and how performance has improved since the data is automated, and the analysis is done much faster, saving us a lot of time. We have been able to save approximately 80 percent of our time. We are not doing data analysis manually, so this relieves our data department of dealing with data. We have been relieved of a lot of duties, and now we are able to focus on other strategic tasks. Our productivity has greatly increased since we are able to make concrete and data-driven decisions easily.

    What needs improvement?

    Setting up the hybrid and multi-cloud environments is a long job and it takes time.

    Additionally, customer support should be more responsive and reply on time.

    The two main challenges that I face are setup complexity and customer support responsiveness. Customer support needs some improvement: they are not unresponsive, but they are sometimes slow to respond to our queries. They should improve on that.

    For how long have I used the solution?

    I have been using IBM Cloud Pak for Data for the past five years.

    What do I think about the stability of the solution?

    IBM Cloud Pak for Data is stable.

    What do I think about the scalability of the solution?

    IBM Cloud Pak for Data's scalability has been great since we started using this platform. I have not noticed any downtime or lagging, especially when dealing with large data, so it is relatively very scalable.

    How are customer service and support?

    Customer support should be more responsive and reply on time.

    Customer support needs some improvement: they are not unresponsive, but they are sometimes slow to respond to our queries. They should improve on that.

    How would you rate customer service and support?

    Positive

    Which solution did I use previously and why did I switch?

    We switched from Azure Databricks because IBM Cloud Pak for Data has been very helpful and innovative: it improved our workflow and collaboration through an integrated multi-cloud platform. It also enables us to deploy flexibly, on-premises or in the cloud, which saves time and hard disk space.

    How was the initial setup?

    Setting up the hybrid and multi-cloud environments is a long job and it takes time.

    The two main challenges that I face are setup complexity and customer support responsiveness.

    Regarding my experience with pricing, setup cost, and licensing, for a small organization, the price might be relatively high, but for huge enterprises such as ours, the price is relatively affordable. The setup is easy, with no complexity, and no time wasting.

    What was our ROI?

    I have seen a return on investment, and I can share that we save a lot of time by predicting outcomes faster using the platform built with data fabric architecture. We save at least 70 percent of our time. It has been easy to collect, organize, and analyze data no matter where it is from multiple sources. The manual catalog is eliminated to save costs by 50 to 60 percent. We have been able to drive responsible, transparent, and explainable AI workflow to operationalize AI and mitigate risk and regulatory compliance easily.

    What's my experience with pricing, setup cost, and licensing?

    Regarding my experience with pricing, setup cost, and licensing, for a small organization, the price might be relatively high, but for huge enterprises such as ours, the price is relatively affordable.

    Which other solutions did I evaluate?

    Before choosing IBM Cloud Pak for Data, I evaluated other options, including Cloudera Data Platform.

    What other advice do I have?

    I would advise others looking into using IBM Cloud Pak for Data that it is a very great tool that is an all-in-one, real-time data analytics solution that provides a phenomenal user experience. It increases data transparency, saves a lot of time, and it saves cost as well. It is a great tool that transforms and analyzes data regardless of the area, making it a sure-bet tool. I would rate this product a nine out of ten.

    reviewer2648136

    Longstanding reporting platform has supported reliable dashboards and regulatory compliance

    Reviewed on Dec 19, 2025
    Review provided by PeerSpot

    What is our primary use case?

    The main use case for IBM Cognos is for business intelligence and reporting.

    What is most valuable?

    IBM Cognos has been available for many years, and we use it for regular dashboarding, for producing scheduled reports, and for some mandatory regulatory reporting. All our departments use Cognos, and the actual Cognos reports are developed by those different teams.

    IBM Cognos is very stable and has been around for many years, with many users familiar with it, making it a reliable solution for our institution. Because of our long association with Cognos, we have good pricing.

    The benefits of choosing IBM Cognos, in addition to saving on cost, include having institutional knowledge about maintaining this infrastructure and enough people who have developed on Cognos in the past, which creates comfort in its use. Cognos is a reliable solution, and developer productivity is high because of the long history of development on it.

    What needs improvement?

    I do not know if Cognos has all the features that users are looking for since we provide it as our standard and do not maintain infrastructure for other tools.

    For how long have I used the solution?

    I am actually very new to the organization and have been here for less than a year.

    How are customer service and support?

    I would rate IBM's support at about a seven or eight out of ten because we have good support coverage owing to our long association with IBM. We are good on the support front. IBM support is very supportive, and I would rate them an eight out of ten based on our long relationship with them.

    How would you rate customer service and support?

    Positive

    How was the initial setup?

    DataStage is not difficult to set up, but we had a lot of challenges in setting up the IBM Cloud Pak for Data cluster on-premises. Our infrastructure team faced many challenges because we first had to stand up an OpenShift cluster on-premises before deploying the IBM Cloud Pak for Data solution.

    The setup for IBM Cloud Pak for Data is very complex, and our teams responsible for standing up the environment struggled a lot. This might also be due to the learning curve since we had not used containerized solutions in the past.

    What's my experience with pricing, setup cost, and licensing?

    The pricing and setup cost are handled by a different procurement team. Our IT procurement team is centralized, so licensing and the actual cost of the software are taken care of by a different team altogether.

    Which other solutions did I evaluate?

    I am not sure about the main differences between IBM Cognos and some other business intelligence tools such as Tableau or Microsoft because many members of the user community have previously experienced those reporting tools before joining our college. However, due to the variety of cloud offerings, users are often able to subscribe directly without having to approach IT for reporting tools, given they have the budget.

    What other advice do I have?

    I do not utilize Dell PowerStore or Dremio because I work in a university setting with a very simple infrastructure, where we just use Cognos and IBM DataStage.

    I do not know if my organization uses AWS as a main cloud provider. We are not on the cloud in a major way and are still on-premises for most of our solutions. In fact, even for IBM DataStage, we are using the IBM Cloud Pak for Data version, but it is installed on-premises, and we haven't progressed much on how to migrate to the cloud yet.

    I am not sure if we use AWS as a cloud provider since we do have some SaaS applications that we subscribe to, but I do not know where they are hosted. I just know we have access to the application for the user interface, and the data is pulled out using an API, but we do not know where it is hosted.

    I do not utilize Cognos ad hoc reporting because I do not develop reports. We only host the Cognos infrastructure for our different user groups, and the report development is completed by them. Our infrastructure team provides the hardware, and our system engineering team provides the installation and application maintenance for Cognos.

    I think some users are using the interactive dashboards feature, and there are also other tools such as Power BI and Tableau that some users automatically use. However, our IT organization only provides Cognos as an enterprise business intelligence and reporting tool. Other tools are subscribed to separately by different people.

    I am not the right person to speak on the machine learning capabilities, as my responsibility is to work with different IT teams who maintain systems across the university. I connect to them using IBM DataStage to fetch their data, perform ETL activities, and load the data into an Oracle database. My team maintains the infrastructure for DataStage and Cognos, but actual development is done by other people.

    I use IBM DataStage, which we call IBM Cloud Pak for Data, as we migrated from InfoSphere DataStage to IBM Cloud Pak for Data, and it is installed on-premises in our data center. The IBM Cloud Pak for Data version is more or less a modern OpenShift cluster-based platform.

    The best features of IBM Cloud Pak for Data include a very modern approach to providing data capabilities under one umbrella, with various services such as artificial intelligence and machine learning capabilities, real-time integration, and data virtualization, though each has separate licenses associated with them. We are currently only using the DataStage license.

    We have not evaluated data virtualization, but I recognize it as a good capability for exploring and experimenting with data, especially for those unfamiliar with data modeling. However, we are not using it due to cost considerations.

    The developer productivity for DataStage on IBM Cloud Pak for Data is the same as on the old tool, InfoSphere. It does not change anything because the core capabilities remain consistent.

    Overall, I would rate Cognos a nine out of ten from a pure infrastructure stability and support perspective because we are comfortable and know what to do, considering the long-term use of Cognos.

    Overall, I would rate IBM Cloud Pak for Data a nine out of ten in terms of capabilities. It mirrors the traditional InfoSphere version of DataStage with a good ETL tool that covers all features expected from such tools.

    We did not purchase through a marketplace such as AWS. This is all from a long association with IBM directly through negotiations with our procurement team, as we have been a large IBM customer for many years. I would rate this review a nine out of ten overall.

    Michelle Leslie

    Starts strong with data management capabilities but needs a demo database

    Reviewed on Jan 15, 2025
    Review provided by PeerSpot

    What is our primary use case?

    My primary use case for Cloud Pak is that I am the reference data steward for the Africa regions in the banks where I work. My main objective is to capture the reference data in Cloud Pak for Data and ensure that people profile or QA their data.

    This is due to the fact that a large percentage of data is actually reference data, not by volume, but by the number of tables. The group-approved reference data is used to assure quality and ensure people know what they have; that's my primary use case for Cloud Pak.

    What is most valuable?

    There's a whole bunch of stuff I really like. I love the way that I can start at a very basic level with my data management journey by capturing my policies, justifying my data, and putting them into different categories to say this is data relating to individuals, for example, or data relating to geography. Those base-level data management components, together with the reference data, can then be reused whether I want to figure out where the data is coming from—using Nantucket, for example—or checking the quality of my data.

    Often, when I check the quality of my data, I might find an issue, but that data did not originate in the system where I found the issue. So, I need to use Nantucket to track back to where that data originally came from so I can fix it at the source. I love that component of Cloud Pak.

    I do not do much with the machine learning or AI pieces. It is probably because I can start at a basic level with data management: policies, rules, categories, reference data, and business terms. From there, I can work my way into a more granular level, applying all of that information on top of my actual data to understand what my data looks like, where it came from, and where it went wrong, managing it throughout the cycle.

    What needs improvement?

    What I would love to see is an end-to-end, almost a training demo database of some sort, where one of the biggest problems with data management is demonstrated.

    There are so many components to data management, and more often than not, people understand one thing really well. They may understand DataStage and how to move data around, but they do not see the impact of moving data incorrectly.

    They also do not see the impact of everyone understanding a piece of data in the same way. I would love Cloud Pak to come with a demo database that illustrates the different components of data management in a logical way, so I can see the whole picture instead of just the area I'm specializing in.

    It would be great if Cloud Pak, from a data modeling point of view, allowed us to import our PDMs, for example. It would be ideal to import and create business terms in Cloud Pak. The PEA would be great to create the technical data. The association between the business and the technical metadata could then be automated by pulling it through from your ACE models. The data modeling component is available in Cloud Pak.

    Additionally, when it comes to Cloud Pak, even though it has the NextGen DataStage built into it, there is Cloud Pak for Integration as well. Currently, I do not think we have a full enough understanding of how CP4D and CP4I can enhance each other.

    For how long have I used the solution?

    I have used the solution since the end of 2021.

    What do I think about the scalability of the solution?

    Scalability is endless if I can pay for it. Obviously, it is just more containers; however, I have to pay more.

    How are customer service and support?

    The response time is quick; however, solving the problem is not always as fast. Cloud Pak is a complicated system, and it's often difficult to find the right resource in IBM to help with specific issues.

    How would you rate customer service and support?

    Neutral

    How was the initial setup?

    The setup was very complete and very complex.

    What about the implementation team?

    We did the implementation with IBM.

    What's my experience with pricing, setup cost, and licensing?

    The setup cost is very expensive. The cost depends on the pieces of the solution I'm using, how much data I have, and whether it's on the cloud or on-prem.

    Which other solutions did I evaluate?

    I've looked at Talend, Collibra, Denodo, Purview, and AWS Glue. It depends on the client's maturity in data management. If the client is only looking to do data quality as a small piece of data management, Denodo would be an excellent choice. If they are looking for end-to-end data management and have the technical resources to get Cloud Pak running and enabled with all functionalities, then definitely Cloud Pak. The choice depends on the maturity of the company.

    What other advice do I have?

    Cloud Pak is a very, very, very good system. I'm super impressed with it. The learning curve is high, but I gain so much when I finally figure it out.

    Overall product rating: seven out of ten.

    Yasser A.

    From Data Silos to Actionable Insights: IBM Cloud Pak for Data Delivers

    Reviewed on Jul 28, 2024
    Review provided by G2
    What do you like best about the product?
    IBM Cloud Pak for Data has become an essential tool in driving our organization's digital transformation. The platform's comprehensive integrated approach to data management, governance, analytics, and AI has significantly streamlined our data operations and empowered us to make data-driven decisions with confidence.
    What do you dislike about the product?
    While the value proposition of IBM Cloud Pak for Data is undeniable, the initial setup and configuration can be complex and time-consuming, requiring dedicated resources and expertise.
    What problems is the product solving and how is that benefiting you?
    IBM Cloud Pak for Data has been instrumental in streamlining our data operations. Its unified platform for integration, governance, and AI-powered analytics has broken down silos, mitigated risk, and enhanced our ability to derive actionable insights for improved decision-making and competitive advantage.
    Murali B

    Provides IBM Watson Catalog and data pipelines, but catalog searching needs to be improved

    Reviewed on Jul 12, 2024
    Review provided by PeerSpot

    What is most valuable?

    IBM Watson Catalog and data pipelines are the most valuable features of the solution.

    What needs improvement?

    Previously, we used to extract the information in the DSX and the XML formats. IBM Cloud Pak for Data exports information mostly on the ISX, which is an encrypted format. The only challenge with the tool is the metadata queries we try to understand.

    We have to go with the lineage and other packages that come with IBM. Previously, we created our own reports depending on the existing command line export of the mappings. The solution's catalog searching or map search needs to be improved.

    For how long have I used the solution?

    I have been using IBM Cloud Pak for Data for two years.

    What do I think about the scalability of the solution?

    We usually recommend the solution for medium and large-scale organizations.

    How are customer service and support?

    My current organization is a Gold Partner with IBM. Whenever we reach out to the support team, the turnaround time is about 24 to 48 hours, which is pretty decent.

    I rate the solution’s technical support an eight to nine out of ten.

    How would you rate customer service and support?

    Positive

    How was the initial setup?

    The solution’s initial setup is easy.

    What's my experience with pricing, setup cost, and licensing?

    The solution's pricing is competitive with that of other vendors. The pricing also depends on the number of users.

    What other advice do I have?

    If people are with the existing stuff, I would definitely suggest they go with IBM Cloud Pak for Data. I usually recommend the solution for the financial sector, where I worked for about ten years. I worked with IBM for almost eight years. Unless they want to migrate to a new product completely, I recommend IBM Cloud Pak for Data to explore current business. It is easy to integrate the tool with other solutions.

    Except for metadata queries, metadata validations, and metadata integrations, I don't see any issues with the solution. I would recommend the solution to other users if it supports their existing infrastructure.

    Some people don't want to put their data in the cloud because they are concerned about how the data is secured with encryption and decryption. For such cases, we have listed out all the pros and cons of the solution to suggest them to users.

    Overall, I rate the solution a seven out of ten.
