
    Dataiku for Enterprise AI (Non U.S. Markets)

    Sold by: Dataiku 
    Deployed on AWS
    Accelerate Enterprise AI with Dataiku on AWS

    Overview


    Dataiku is The Universal AI Platform™, empowering teams to deliver AI and analytics projects faster - all within a secure, collaborative, and governed environment.

    • Data Scientists use familiar tools to focus on high-impact work, with automation and streamlined collaboration.
    • Business Analysts get faster insights with intuitive data prep and accessible machine learning.
    • Data Teams scale projects with built-in governance and transparency.

    Built for AWS
    • Connect securely to all data sources, including Amazon S3, Amazon Redshift, and Amazon RDS.
    • Scale data and ML processing with Dataiku elastic compute powered by Amazon EKS for Python, R, Spark, and more.
    • Accelerate AI development with pre-built workflows integrating AWS AI services, such as Amazon SageMaker and Amazon Comprehend.
    • Distribute the creation of advanced analytics through Dataiku's visual platform, fostering greater collaboration between technical and non-technical teams.
    • Leverage the Dataiku LLM Mesh to connect to Amazon Bedrock for Chat, RAG, and Agentic workflows.

    AI at Scale, Supported Every Step
    With expert services and a robust learning platform, Dataiku helps organizations of any size adopt AI at scale - quickly and confidently.

    Highlights

    • Take full advantage of your investment in the AWS platform with Dataiku's unique push down to Amazon's storage and compute.
    • Empower more users to clean and enrich data, build advanced data pipelines and machine learning models in a visual interface.
    • Accelerate deployment on AWS, leveraging SageMaker and Bedrock, with a fully managed service (SaaS) hosted and managed by Dataiku.

    Details

    Delivery method

    Deployed on AWS


    Features and programs

    Buyer guide

    Gain valuable insights from real users who purchased this product, powered by PeerSpot.

    Financing for AWS Marketplace purchases

    AWS Marketplace now accepts line of credit payments through the PNC Vendor Finance program. This program is available to select AWS customers in the US, excluding NV, NC, ND, TN, & VT.

    Pricing

    Dataiku for Enterprise AI (Non U.S. Markets)

    Pricing is based on the duration and terms of your contract with the vendor. This entitles you to a specified quantity of use for the contract duration. If you choose not to renew or replace your contract before it ends, access to these entitlements will expire.
    Additional AWS infrastructure costs may apply. Use the AWS Pricing Calculator to estimate your infrastructure costs.

    12-month contract (1)

    Dimension: Dataiku
    Description: Contact us for pricing
    Cost/12 months: $1.00

    Vendor refund policy

    All fees are non-cancellable and non-refundable except as required by law.

    Custom pricing options

    Request a private offer to receive a custom quote.


    Legal

    Vendor terms and conditions

    Upon subscribing to this product, you must acknowledge and agree to the terms and conditions outlined in the vendor's End User License Agreement (EULA).

    Content disclaimer

    Vendors are responsible for their product descriptions and other product content. AWS does not warrant that vendors' product descriptions or other product content are accurate, complete, reliable, current, or error-free.

    Usage information


    Delivery details

    Software as a Service (SaaS)

    SaaS delivers cloud-based software applications directly to customers over the internet. You can access these applications through a subscription model. You will pay recurring monthly usage fees through your AWS bill, while AWS handles deployment and infrastructure management, ensuring scalability, reliability, and seamless integration with other AWS services.

    Support

    Vendor support

    AWS infrastructure support

    AWS Support is a one-on-one, fast-response support channel that is staffed 24x7x365 with experienced and technical support engineers. The service helps customers of all sizes and technical abilities to successfully utilize the products and features provided by Amazon Web Services.

    Product comparison

    Updated weekly

    Accolades

    Top 10 in ML Solutions
    Top 10 in Databases & Analytics Platforms, ML Solutions, Data Analytics
    Top 10 in Data Preparation, ML Solutions, Business Intelligence & Advanced Analytics

    Customer reviews

    Sentiment is AI generated from actual customer reviews on AWS and G2

    Overview

    AI generated from product descriptions
    Data Source Connectivity
    Secure connection to multiple AWS data sources including Amazon S3, Amazon Redshift, and Amazon RDS
    Elastic Compute Processing
    Scalable data and machine learning processing powered by Amazon EKS supporting Python, R, Spark, and multiple programming environments
    AI Service Integration
    Pre-built workflows integrating with AWS AI services like Amazon SageMaker and Amazon Comprehend
    Large Language Model Connectivity
    LLM Mesh capability for connecting to Amazon Bedrock to support Chat, Retrieval-Augmented Generation (RAG), and Agentic workflows
    Collaborative Analytics Platform
    Visual platform enabling distributed creation of advanced analytics with collaboration between technical and non-technical teams
    Data Platform Architecture
    Unified platform integrating data engineering, analytics, business intelligence, data science, and machine learning on a single architecture
    Open Source Foundation
    Built on open source data projects with support for open standards and data formats
    Lakehouse Infrastructure
    Provides a common data management approach using a lakehouse architecture running on Amazon S3
    Data Intelligence Engine
    Advanced engine capable of interpreting organizational data context and enabling broad data access across teams
    Collaborative Workflow
    Native collaboration capabilities enabling cross-functional data and AI workflow integration
    Data Workflow Automation
    Drag-and-drop interface with 300+ analytic building blocks for creating and automating data workflows
    Machine Learning Capabilities
    Automated machine learning (AutoML) and feature engineering for data science use cases across skill levels
    Data Preparation Tools
    Comprehensive data access, preparation, blending, enrichment, and statistical analytics platform
    Geospatial Analytics
    Integrated geospatial analytics capabilities for spatial data processing and analysis
    Cloud-Native Analytics
    Browser-based, cloud-native experience for building and automating data pipelines with reduced technical complexity

    Contract

    Standard contract
    No
    No
    No

    Customer reviews

    Ratings and reviews

    4 out of 5 stars (1 rating)
    5 star: 0%
    4 star: 100%
    3 star: 0%
    2 star: 0%
    1 star: 0%
    1 AWS review | 8 external reviews
    Star ratings include only reviews from verified AWS customers. External reviews can also include a star rating, but star ratings from external reviews are not averaged in with the AWS customer star ratings.
    Ravi-Srivastava

    Has enabled reliable data pipeline creation and supports rule-based alerts for quality monitoring

    Reviewed on Oct 14, 2025
    Review from a verified AWS customer

    What is our primary use case?

    My main use case for Dataiku is ensuring robust data pipeline ingestion. We serve people in data management, so we need to take care of the pipelines, their data quality, data drift, and so on. We handle this with the rule-based alert systems we have created in Dataiku.
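
    As a rough illustration of the kind of rule-based quality alert the reviewer describes, here is a minimal plain-Python sketch. It does not use Dataiku's own API; the `check_batch` helper, the two rules, and the thresholds are all hypothetical and chosen only for the example.

```python
from statistics import mean


def check_batch(values, baseline_mean, max_null_rate=0.05, max_drift=0.20):
    """Return a list of alert messages for one column of a pipeline batch.

    All thresholds are illustrative assumptions, not recommended defaults.
    """
    if not values:
        return ["empty batch"]

    alerts = []

    # Rule 1: too many missing values in the batch.
    null_rate = sum(v is None for v in values) / len(values)
    if null_rate > max_null_rate:
        alerts.append(f"null rate {null_rate:.0%} exceeds {max_null_rate:.0%}")

    # Rule 2: the batch mean moved too far from the baseline (a crude drift check).
    present = [v for v in values if v is not None]
    if present and baseline_mean != 0:
        drift = abs(mean(present) - baseline_mean) / abs(baseline_mean)
        if drift > max_drift:
            alerts.append(f"mean drifted {drift:.0%} from baseline")

    return alerts
```

    A clean batch returns an empty list, so a scheduler would only need to notify someone when the list is non-empty.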

    What is most valuable?

    The best feature in Dataiku is that once the data is connected in the underlying layer, it flows exceptionally smoothly if you know how to tweak it. If you know how to tweak it and shape the data to your requirements, it works well. If you don't, and you are trying to learn in production, it is a disaster.

    I have used Dataiku's AutoML tools. They have helped me apply machine learning models on the fly. They continuously read your data and generate features. Once the features have been generated, you can register the model on the fly, and registered models can then be triggered on new data. After the train and test data have passed, the new operational data that arrives every day goes through the same feature generation, and the model produces reasonable predictions as column values that you can consume in the application layer.

    Dataiku's data source integration flexibility depends entirely on the requirement. We are not using it for ourselves. We are using it for business teams: they send us the requirement and we ingest the data accordingly. For example, the raw data arrives as A, but they need A plus B plus C multiplied by D. We do all of this kind of enablement with Dataiku.

    Our source system, the core system, continuously drops raw data into the landing layer. From the landing layer, we convert that raw data into consumable data for the consumption layer. We do this with Dataiku.
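
    The landing-to-consumption step and the "A plus B plus C times D" example above can be sketched in a few lines of plain Python. This is only an illustration: the `to_consumption` function and the field names are hypothetical, and the grouping (A + B + C) * D is an assumption, since the review does not say how the terms combine.

```python
def to_consumption(raw_rows):
    """Turn landing-layer rows into consumption-layer rows.

    Assumes the requested rule means (A + B + C) * D; the review does not
    say how the terms group, so this reading is a guess.
    """
    consumable = []
    for row in raw_rows:
        # Skip rows with any missing input instead of emitting a bad value.
        if any(row.get(key) is None for key in ("a", "b", "c", "d")):
            continue
        derived = (row["a"] + row["b"] + row["c"]) * row["d"]
        consumable.append({"id": row["id"], "derived": derived})
    return consumable


raw = [
    {"id": 1, "a": 1, "b": 2, "c": 3, "d": 10},
    {"id": 2, "a": 4, "b": None, "c": 1, "d": 2},  # incomplete, dropped
]
print(to_consumption(raw))  # [{'id': 1, 'derived': 60}]
```

    Dropping incomplete rows is one possible policy; a real pipeline might instead route them to a quarantine table for review.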

    What needs improvement?

    In terms of enhancing collaboration within my team, I would not say Dataiku is the best option because it is so expensive. We are not able to provide it to everyone. Very few people have a developer license and use it. Once a data pipeline is created, we hand it over directly to our users on the ingestion layer. It is not a very cost-effective solution, I must say, though it is good for development purposes.

    Pricing can be improved.

    For how long have I used the solution?

    I have been using this product for four years.

    What do I think about the stability of the solution?

    In my opinion, Dataiku is stable because we know how to use it. Many other things around it are unstable and cause us challenges, so stability is not down to the application alone, and I cannot blame just one thing.

    In terms of stability, if the raw data contains no outliers, it is quite stable. I would rate it a seven.

    How are customer service and support?

    For support, I haven't created any support tickets, so I really don't know about it, but it is quite good.

    How would you rate customer service and support?

    Positive

    How was the initial setup?

    The initial setup started with HANA. Then they introduced Databricks. When Databricks went live, they started issuing licenses for Dataiku. We got the Dataiku license and the learning, and everything went smoothly. Now Databricks has been replaced by Snowflake. Even on Snowflake, we can do many things.

    What was our ROI?

    It is hard to say whether I've seen a return on investment from Dataiku because we are far removed from the monetization of the data. Other teams take care of the monetization. We are not from resource management, so it is very hard for us to calculate the ROI for each application. We are not using only Dataiku; we are using many other products.

    Which other solutions did I evaluate?

    In my opinion, it is good, not bad. I say this because I use many other tools as part of our data operating model. It is much better than other tools because it offers a point-and-click solution. Most of our data citizens who don't know how to code can easily do things with the mouse. Most things work fine, so there is nothing to complain about.

    What other advice do I have?

    Overall, Dataiku is really good. I would rate it an 8 out of 10.

    Which deployment model are you using for this solution?

    Hybrid Cloud

    If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

    reviewer1525251

    Integration with multiple platforms enhances capabilities for diverse data applications

    Reviewed on Mar 06, 2025
    Review provided by PeerSpot

    What is our primary use case?

    My primary use case for Dataiku is data science and Gen AI applications. Our AGN team also uses it for various purposes.

    What is most valuable?

    Dataiku is highly regarded as it is a leader in the Gartner ranking. It offers most of the capabilities required for data science, MLOps, and LLMOps. Integration with public cloud and multiple other platforms is excellent. The product is easy to install and can be maintained by a single expert. It supports good functionalities that are essential in data visualization and responsible AI.

    What needs improvement?

    Dataiku's pricing is very high, and commercial transparency is a challenge. Support is also an area needing improvement. More features like LLM security, homomorphic encryption, and enhanced GPU integration would be beneficial.

    For how long have I used the solution?

    I have been familiar with Dataiku for the past four to five years.

    What was my experience with deployment of the solution?

    I have not encountered any deployment issues. It is very easy to install.

    What do I think about the stability of the solution?

    I have not used Dataiku at the level that would allow me to comment on performance latency for a Big Bang environment. However, the product is good, and the output meets our expectations.

    What do I think about the scalability of the solution?

    Dataiku is fully scalable, and I have not identified any limitations regarding scalability so far.

    How are customer service and support?

    The technical support from Dataiku is not good. The support team does not provide adequate assistance, and there are concerns about billing requests.

    How would you rate customer service and support?

    Negative

    Which solution did I use previously and why did I switch?

    There are many products available in the market like cnvrg.io, Domino Data Lab, and ClearML. Dataiku's pricing is not competitive with these solutions.

    How was the initial setup?

    The initial setup of Dataiku is very easy. A single person, if experienced, can handle the installation and maintenance.

    What was our ROI?

    Without a reduction in price, I doubt users will see a return on investment. The market is competitive, and Dataiku must adopt a consumption-based model instead of the current monthly model.

    What's my experience with pricing, setup cost, and licensing?

    The pricing for Dataiku is very high, which is its biggest downside. The model they follow is not consumption-based, making it expensive.

    Which other solutions did I evaluate?

    There are many products in the market like cnvrg.io, Domino Data Lab, and ClearML.

    What other advice do I have?

    Overall, Dataiku is a very good product except for the commercial aspect and the support. More features like LLM security and homomorphic encryption would be appreciated. I would rate the technical support three out of ten due to its current inefficacy. For pricing, on a scale of one to ten, where ten is expensive, I rate it around eight to nine. I rate the overall solution a ten.

    Which deployment model are you using for this solution?

    Public Cloud

    If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

    Other
    reviewer2667285

    Drag-and-drop platform accelerates model development with distributed compute engine

    Reviewed on Feb 27, 2025
    Review provided by PeerSpot

    What is our primary use case?

    My company sells licenses for both Dataiku and Alteryx, and we have clients who use them. I engage with several companies in telecommunications, retail, and energy to assess how our clients are utilizing these platforms.

    What is most valuable?

    The most valuable feature of Dataiku, in my opinion, is the possibility to use Spark, which is a distributed compute engine. This is a feature that is usually appreciated by our customers. 

    Additionally, the automation features have been impactful, particularly in the deployment phase, as we use what Dataiku calls deployer nodes. Dataiku primarily enhances the speed at which our customers can develop or train their machine learning models since it is a drag-and-drop platform. Our clients can easily drag and drop components and use them on the spot.

    What needs improvement?

    There is room for improvement in terms of allowing for more code-based features. I would love for Dataiku to allow more flexibility with code-based components and provide the possibility to extend it by developing and integrating custom components easily with existing ones.

    For how long have I used the solution?

    I have been working with Dataiku for about three years.

    What's my experience with pricing, setup cost, and licensing?

    I find the pricing of Dataiku quite affordable for our customers, as they are usually large companies. However, it is a pricey solution and I primarily recommend it to bigger companies.

    Which other solutions did I evaluate?

    I researched products like Dataiku, Cloudera, and Databricks.

    What other advice do I have?

    I would give Dataiku an eight out of ten. Although I generally recommend Dataiku, it is mainly suited for companies that can afford it as it is a pricey solution.

    If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

    Other
    Ramon Roman Viñas

    Collaboration and traceability boost team's efficiency

    Reviewed on Jan 13, 2025
    Review provided by PeerSpot

    What is our primary use case?

    I use Dataiku because I am preparing cohorts for health investment research.

    What is most valuable?

    Traceability and collaboration are essential for me. I have eight or nine engineers working together. Integration with machine learning is also crucial for us. 

    Additionally, traceability is vital since I manage many cohorts, and collaboration is key as I have multiple engineers substituting for one another.

    What needs improvement?

    I need more experience in the sector, which is health. The license is very expensive. It would be great to have an intermediate license for basic treatments that do not require extensive experience.

    For how long have I used the solution?

    I have used the solution for six or seven years.

    What do I think about the scalability of the solution?

    The solution is scalable. I rate it nine out of ten.

    How are customer service and support?

    The customer service team is helpful and responsive, more or less on time. I rate them seven out of ten.

    How would you rate customer service and support?

    Neutral

    How was the initial setup?

    Deployment should take four or five hours, yet customization takes more time.

    What about the implementation team?

    Two or three engineers took part in the installation process.

    What was our ROI?

    I do not care about financial benefits; however, I am sure they exist. It has supported our compliance with industry regulations one hundred percent.

    What's my experience with pricing, setup cost, and licensing?

    There are no extra expenses beyond the existing licensing cost.

    Which other solutions did I evaluate?

    I work with other tools but mainly with Dataiku, and I also use Python and Azure Synapse.

    What other advice do I have?

    The user interface is useful for collaborative tools that allow multiple professionals to work together. 

    I rate the overall product as eight out of ten.

    Which deployment model are you using for this solution?

    Private Cloud

    If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

    RichardXu

    The platform organizes workflows visually and efficiently

    Reviewed on Jan 06, 2025
    Review provided by PeerSpot

    What is our primary use case?

    Dataiku is an AI platform that we use for oil and gas exploration. Even though I can't provide specific details, this is the primary use case for us.

    What is most valuable?

    One of the valuable features of Dataiku is the workflow capability. It allows us to organize a workflow efficiently. The platform has a visual interface, making it much easier for educated professionals to organize their work. This feature is useful because it simplifies tasks and eliminates the need for a data scientist. If you are knowledgeable about AI, you can write directly with primitive tools like TensorFlow, PyTorch, and Scikit-learn. However, Dataiku makes this process much easier.

    What needs improvement?

    One area for improvement is the need for more capabilities similar to those provided by NVIDIA for parallel machine learning training. We still encounter some integration issues.

    For how long have I used the solution?

    We have been using Dataiku for three years.

    How are customer service and support?

    Customer service is somewhat different because Dataiku partners with local industry experts who understand the business better and provide support. It can be hard to tell which party provides the better support; overall, however, the support is good.

    How would you rate customer service and support?

    Positive

    Which solution did I use previously and why did I switch?

    We are only using Dataiku.

    What was our ROI?

    I believe the return on investment looks positive.

    Which other solutions did I evaluate?

    I considered another option that excels in parallel processing. However, it falls short in other areas. No product is perfect. If these two solutions worked together, it would be advantageous. Unfortunately, one has strengths in certain areas while the other excels in another.

    What other advice do I have?

    Why not? BHP sold the energy part to a company called Woodside. It has changed because they are now part of Woodside. 

    Overall, I rate the product eight out of ten.
