
Overview
For more information or customized pricing, please email us: cpd_on_aws@wwpdl.vnet.ibm.com
IBM Cloud Pak for Data is a unified data and AI platform that connects the right data, at the right time, to the right people anywhere. Available on AWS and running on Red Hat OpenShift, the platform simplifies data access, automates data discovery and curation, and safeguards sensitive information by automating policy enforcement for all users in your organization. Make better data-driven decisions and lay the foundation for AI with a data fabric that connects siloed data on premises or across multiple clouds without data movement. Discover actionable insights and apply trusted data to build, run, automate and manage AI models.
Outcomes:
- Data access and availability: Eliminate data silos and simplify your data landscape to enable faster, cost-effective extraction of value from your data.
- Data quality and governance: Apply governance solutions and methodologies to deliver trusted business data.
- Data privacy and security: Fully understand and manage sensitive data with a pervasive privacy framework.
- Batch data integration: Design, develop and run jobs that move and transform data with powerful automated integration capabilities.
- 360 entity data: Enable agility and accelerated ROI for consolidated and governed views of critical enterprise data.
Product Version 4.7.x
Standard Min: 48 VPCs
Enterprise Min: 72 VPCs
Already have a CP4D License? Deploy from the BYOL Listing today!
Highlights
- Deliver data responsibly with a data fabric: Unify and access disparate data with AutoSQL, a universal query engine. Discover and classify data in real time with Watson Knowledge Catalog. Protect sensitive data with automated policy enforcement.
- Scale trustworthy AI: Synchronize application and model pipelines while reducing drift, bias, and risk with ModelOps on Watson Studio. Monitor and govern AI models to meet regulations, manage risk and enhance transparency.
- Recognized by analysts as a Leader in core data and AI segments: The Forrester Wave™: Machine Learning Data Catalogs, Q4 2020; 2021 Gartner Magic Quadrant for Data Science and Machine Learning; The Forrester Wave™: Multimodal Predictive Analytics and Machine Learning, Q3 2020.
Pricing
Dimension | Description | Cost/month |
---|---|---|
Standard Option | Cloud Pak for Data Standard Option: 48 VPCs | $19,824.00 |
Enterprise Option | Cloud Pak for Data Enterprise Option: 72 VPCs | $59,400.00 |
Vendor refund policy
Please contact your rep for any questions.
Delivery details
Software as a Service (SaaS)
SaaS delivers cloud-based software applications directly to customers over the internet. You can access these applications through a subscription model. You will pay recurring monthly usage fees through your AWS bill, while AWS handles deployment and infrastructure management, ensuring scalability, reliability, and seamless integration with other AWS services.
Resources
Support
Vendor support
AWS infrastructure support
AWS Support is a one-on-one, fast-response support channel that is staffed 24x7x365 with experienced and technical support engineers. The service helps customers of all sizes and technical abilities to successfully utilize the products and features provided by Amazon Web Services.


Standard contract
Customer reviews
Starts strong with data management capabilities but needs a demo database
What is our primary use case?
My primary use case for Cloud Pak is that I am the reference data steward for the Africa regions in the banks where I work. My main objective is to capture the reference data in Cloud Pak for Data and ensure that people profile or QA their data.
This is due to the fact that a large percentage of data is actually reference data, not by volume, but by the number of tables. The group-approved reference data is used to assure quality and ensure people know what they have; that's my primary use case for Cloud Pak.
What is most valuable?
There's a whole bunch of stuff I really like. I love the way that I can start at a very basic level with my data management journey by capturing my policies, classifying my data, and putting it into different categories to say this is data relating to individuals, for example, or data relating to geography. Those base-level data management components, together with the reference data, can then be reused whether I want to figure out where the data is coming from (using MANTA, for example) or check the quality of my data.
Often, when I check the quality of my data, I might find an issue, but that data did not originate in the system where I found the issue. So, I need to use MANTA to track back to where that data originally came from so I can fix it at the source. I love that component of Cloud Pak.
I do not do much with the machine learning or AI pieces. It is probably because I can start at a basic level with data management: policies, rules, categories, reference data, and business terms. From there, I can work my way into a more granular level, applying all of that information on top of my actual data to understand what my data looks like, where it came from, and where it went wrong, managing it throughout the cycle.
What needs improvement?
What I would love to see is an end-to-end, almost a training demo database of some sort, where one of the biggest problems with data management is demonstrated.
There are so many components to data management, and more often than not, people understand one thing really well. They may understand DataStage and how to move data around, but they do not see the impact of moving data incorrectly.
They also do not see the impact of everyone understanding a piece of data in the same way. I would love Cloud Pak to come with a demo database that illustrates the different components of data management in a logical way, so I can see the whole picture instead of just the area I'm specializing in.
It would be great if Cloud Pak, from a data modeling point of view, allowed us to import our PDMs, for example. It would be ideal to import and create business terms in Cloud Pak. The PEA would be great to create the technical data. The association between the business and the technical metadata could then be automated by pulling it through from your ACE models. The data modeling component is available in Cloud Pak.
Additionally, even though Cloud Pak has the NextGen DataStage built into it, there is Cloud Pak for Integration as well. Currently, I do not think we have a full enough understanding of how CP4D and CP4I can enhance each other.
For how long have I used the solution?
I have used the solution since the end of 2021.
What do I think about the scalability of the solution?
Scalability is endless if I can pay for it. Obviously, it is just for containers; however, I have to pay more.
How are customer service and support?
The response time is quick; however, solving the problem is not always as fast. Cloud Pak is a complicated system, and it's often difficult to find the right resource in IBM to help with specific issues.
How would you rate customer service and support?
Neutral
How was the initial setup?
The setup was very complete and very complex.
What about the implementation team?
We did the implementation with IBM.
What's my experience with pricing, setup cost, and licensing?
The setup cost is very expensive. The cost depends on the pieces of the solution I'm using, how much data I have, and whether it's on the cloud or on-prem.
Which other solutions did I evaluate?
I've looked at Talend, Collibra, Denodo, Purview, and AWS Glue. It depends on the client's maturity in data management. If the client is only looking to do data quality as a small piece of data management, Denodo would be an excellent choice. If they are looking for end-to-end data management and have the technical resources to get Cloud Pak running and enabled with all functionalities, then definitely Cloud Pak. The choice depends on the maturity of the company.
What other advice do I have?
Cloud Pak is a very, very, very good system. I'm super impressed with it. The learning curve is high, but I gain so much when I finally figure it out.
Overall product rating: seven out of ten.
Good tool for end to end data science
From Data Silos to Actionable Insights: IBM Cloud Pak for Data Delivers
Provides IBM Watson Catalog and data pipelines, but catalog searching needs to be improved
What is most valuable?
IBM Watson Catalog and data pipelines are the most valuable features of the solution.
What needs improvement?
Previously, we used to extract the information in DSX and XML formats. IBM Cloud Pak for Data exports information mostly in ISX, which is an encrypted format. The only challenge with the tool is understanding the metadata queries.
We have to go with the lineage and other packages that come with IBM. Previously, we created our own reports based on the existing command-line export of the mappings. The solution's catalog searching or map search needs to be improved.
For how long have I used the solution?
I have been using IBM Cloud Pak for Data for two years.
What do I think about the scalability of the solution?
We usually recommend the solution for medium and large-scale organizations.
How are customer service and support?
My current organization is a Gold Partner with IBM. Whenever we reach out to the support team, the turnaround time is about 24 to 48 hours, which is pretty decent.
I rate the solution’s technical support an eight to nine out of ten.
How would you rate customer service and support?
Positive
How was the initial setup?
The solution’s initial setup is easy.
What's my experience with pricing, setup cost, and licensing?
The solution's pricing is competitive with that of other vendors. The pricing also depends on the number of users.
What other advice do I have?
If people are already working with existing IBM tools, I would definitely suggest they go with IBM Cloud Pak for Data. I usually recommend the solution for the financial sector, where I worked for about ten years. I worked with IBM for almost eight years. Unless they want to migrate to a new product completely, I recommend IBM Cloud Pak for Data for their current business. It is easy to integrate the tool with other solutions.
Except for metadata queries, metadata validations, and metadata integrations, I don't see any issues with the solution. I would recommend the solution to other users if it supports their existing infrastructure.
Some people don't want to put their data in the cloud because they are concerned about how the data is secured with encryption and decryption. For such cases, we have listed out all the pros and cons of the solution so that we can advise users.
Overall, I rate the solution a seven out of ten.