
External reviews
628 reviews
External reviews are not included in the AWS star rating for the product.
Transformation Journey with Databricks Data Intelligence Platform
What do you like best about the product?
As a data engineer who has been working with Databricks for the past two years, I can honestly say the platform has completely transformed the way we approach data engineering projects. Before Databricks, my team and I often faced challenges with managing large datasets and ensuring smooth collaboration between data engineers and data scientists. There were times when workflows felt disjointed, and troubleshooting issues across different tools consumed a lot of our time.
Databricks has changed all of that. The collaborative notebooks feature, in particular, has been a game-changer. I can now work seamlessly with data scientists in real-time, troubleshooting issues and iterating on solutions much faster. For example, during a recent project, we were able to refine a machine learning model within days, thanks to the ability to easily share notebooks and quickly run experiments together. This level of collaboration used to take weeks with previous tools.
The auto-scaling feature has been a lifesaver. I vividly remember struggling with performance issues when processing large datasets on our old infrastructure. Now, Databricks automatically adjusts resources based on workload, so we never have to worry about managing compute power. This has drastically cut down on processing times. For instance, a data transformation job that used to take hours now finishes in a fraction of the time, allowing us to deliver projects faster.
Delta Lake has also been invaluable. Before we started using it, data consistency and quality were constant concerns, especially when dealing with large and varied data sources. Now, with Delta Lake, we can trust that our data is not only high quality but also easily accessible and queryable. One particular example was when we had to rebuild a complex dataset pipeline. Delta Lake allowed us to work with incremental data updates, making the process much more efficient and reliable.
In short, Databricks has greatly reduced development time and improved the overall quality of our deliveries. It’s helped me streamline complex workflows, improve collaboration across teams, and most importantly, deliver data-driven solutions faster and with greater confidence.
What do you dislike about the product?
Cost Optimisation - While I appreciate the granular billing information provided, predicting costs for large projects or shared environments can still feel opaque. Many teams struggle to control runaway costs from idle clusters or suboptimal configurations. Introducing smarter autoscaling and recommendations tailored to our workloads would be invaluable. For instance, alerts for "idle clusters" or "cost hotspots" in our environment could proactively save budgets and improve efficiency.
Simplified Governance and Security - Managing access at fine-grained levels can be cumbersome. For example, controlling who can view versus who can execute a notebook or job often requires workarounds. Audit logs are excellent, but making sense of them for actionable insights sometimes feels like solving a puzzle. Enhanced attribute-based access control (ABAC) and more intuitive UI-based controls for permission management would greatly streamline operations.
User Experience - The collaborative notebook interface is one of Databricks' standout features, yet there are areas where it could be smoother. Collaboration is sometimes hindered when two users edit the same notebook. Version control feels basic compared to Git-based systems. Debugging within notebooks, especially for non-Python workloads, could use significant improvement. Adding inline commenting, conflict resolution tools, and robust debugging features would take the platform to the next level. A workspace-level activity feed to show what’s happening in shared projects would also be immensely helpful.
Workflow Automation - Include AI-driven insights for optimizing workflows (e.g., spotting bottlenecks or inefficiencies). Enable easier integration with external workflow automation tools.
What problems is the product solving and how is that benefiting you?
The Databricks Data Intelligence Platform has revolutionized how I handle data challenges by providing a unified, scalable, and collaborative environment. It simplifies processing large datasets, unifies teams across workflows, and ensures robust security and governance, enabling seamless data integration and real-time insights. With tools like Delta Lake and MLflow, it has streamlined pipeline development and machine learning, significantly improving productivity and reducing time to value. By democratizing analytics for technical and non-technical users alike, Databricks fosters a truly data-driven culture. Its flexibility, performance, and end-to-end capabilities have been instrumental in driving impactful results for my organization.
Databricks Data Intelligence Platform: ETL, Scalability, and Job Scheduling
What do you like best about the product?
ETL pipelines automate batch and real-time data integration and high-quality data integration. Parallel data processing using multithreading. Clusters scale up and down to optimise cost.
What do you dislike about the product?
Some SQL features are not supported, such as DECLARE, stored procedures, and transaction rollback.
What problems is the product solving and how is that benefiting you?
Fast ETL processing, Genie support, and handling of growing datasets.
Performance of Databricks in ML - Review!
What do you like best about the product?
I find that Databricks fits our requirements and budget, even for a mid-sized company like ours. It uses Python, which is easy to work with, and Databricks provides live data streams into input channels. I find the lakehouse features the best, and Apache Spark provides distributed processing for massive amounts of data.
What do you dislike about the product?
It suits our company's requirements, but it takes a bit of patience at the beginning to get used to the processes, since it integrates ML, AI, and data processing.
What problems is the product solving and how is that benefiting you?
The most important thing Databricks provides in our industry is Apache Spark's distributed processing engine, which makes the platform simpler for us to work with. It handles a large pool of data for our Facebook advertisement leads. It unifies different processes, which makes our tasks much easier and has made real-time data processing simpler.
Databricks - Scalability and Performance
What do you like best about the product?
I really like Databricks Genie. It helps me identify errors and gives suggestions to resolve them.
Also, if I ask it to improve the current code for faster performance, Genie's suggestions are helpful. It helps to implement the ETL logic in an efficient way.
What do you dislike about the product?
Most of the features I use are helpful, but some SQL functionality is not supported, such as updating a table using a join.
What problems is the product solving and how is that benefiting you?
Switching from an on-prem server to the cloud with Databricks has been beneficial for the following reasons:
1. On-prem, the major challenge was that it was hard to maintain code versions and deployments. With Databricks, it's simpler to maintain code versions and deploy to different environments (as it supports Git).
2. Easy to scale. We can easily scale the cluster configuration up and down, which improves cost efficiency and execution performance.
Exceptional performance for end to end data management
What do you like best about the product?
I used Databricks to optimise the customer segmentation strategy for a retail campaign. It helped me analyse millions of records, clean the data, and create an ML model based on purchasing behavior. The Delta Lake technology ensured data consistency during the process. Its ability to integrate with our Azure data lake made it easy to access datasets.
What do you dislike about the product?
Tableau integration with Databricks was challenging and I encountered issues while setting up real-time data visualisation. Despite the challenges, the platform enabled me to automate data pipelines, which saved me hours.
What problems is the product solving and how is that benefiting you?
Our operations team used Databricks to monitor and optimise supply chain performance. It has become an essential tool for us to enhance both individual productivity and team collaboration. Its impact can be felt across multiple projects.
The gold standard for scalable ML and Analytics
What do you like best about the product?
My team recently used Databricks to implement a machine learning model for fraud detection. We used Delta Lake for data preprocessing and ensured real-time updates from our database. One of the most helpful features in Databricks is the Delta Lake functionality, which ensures data consistency. The platform supports both Python and SQL, which bridges the gap between data engineers and analysts. This makes it easy for teams to collaborate. Customer support is another highlight, as they respond quickly and provide clear guidance.
What do you dislike about the product?
While integrating Databricks with our existing Azure Data Lake, we faced issues syncing access permissions for multiple datasets. Additionally, its pricing model makes it better suited to large organisations; for smaller teams, scaling up can be expensive.
What problems is the product solving and how is that benefiting you?
In recent projects, our sales and operations teams needed a unified view of supply chain metrics. Using Databricks, we collected data from multiple sources, created a centralised dashboard, and enabled real-time reporting. This improved our decision-making speed and helped us prevent bottlenecks.
Superb data analytics and AI platform!
What do you like best about the product?
It has been amazing for creating data pipelines for data transformation and data analysis, and for running queries easily in dashboards. It is best for the data engineers in our company, who use it daily for implementing ML and setting up workflows using Databricks.
What do you dislike about the product?
I think the trial period could be extended a bit for testing such a vast platform. In terms of functionality, I see no issues.
What problems is the product solving and how is that benefiting you?
Databricks has played a big role in warehousing and in ML features with AI capabilities for managing workflows in team projects. It is also very helpful for data transformation and analysis, which we very much need.
Unparalleled Speed, awesome Integration and fabulous compute
What do you like best about the product?
I have been using Databricks for more than a year now. It integrates very well with our cloud providers and divides the work across different workspaces for the Dev, Test, Pre, and Production environments, handling TBs worth of data seamlessly.
What do you dislike about the product?
I think the cluster activation time could be improved. It is also slow when fetching data from legacy systems like SQL Server.
That takes up a lot of time.
What problems is the product solving and how is that benefiting you?
We use Databricks as our data warehouse and also as the source used by data analysts in the organisation. The intelligence platform helps us write code seamlessly and deliver much faster. We have reduced the resolution time from 2 weeks to 3-4 days.
1 person found this helpful
It is an excellent Platform for data intelligence
What do you like best about the product?
Everything was excellent. The most important thing was how user-friendly it is.
What do you dislike about the product?
Nothing, everything was excellent. No other dislikes.
What problems is the product solving and how is that benefiting you?
1. Unified Data Management
Problem: Managing diverse data types (structured, unstructured, and semi-structured) across different storage systems (data lakes, data warehouses) often leads to silos, complexity, and inefficiency.
Solution: Databricks provides a unified platform for all types of data through Delta Lake, which combines the scalability of data lakes with the performance and governance of data warehouses.
Benefit: You get a single platform to manage both batch and streaming data efficiently, reducing complexity and improving scalability. This simplifies your pipeline and reduces costs by eliminating the need for multiple tools.
2. Collaboration Between Teams
Problem: Data engineers, data scientists, and business analysts often work in silos with different tools, which slows down collaboration and innovation.
Solution: Databricks enables collaborative development with tools like Databricks Notebooks for coding, visualization, and sharing insights in real-time across teams.
Benefit: This improves communication and accelerates the development of data-driven applications, like the music recommendation system you're building, by allowing different teams to work together seamlessly.
3. Scalability and Performance
Problem: Processing large datasets can be slow and resource-intensive with traditional data platforms, leading to performance bottlenecks.
Solution: Databricks leverages Apache Spark to provide high-performance distributed data processing, enabling you to process massive datasets quickly.
Benefit: Faster data processing means quicker insights, helping you manage large data flows more effectively in real-time pipelines like the one you are working on with Databricks.
4. Data Governance and Security
Problem: As data volumes grow, ensuring data quality, compliance, and security becomes challenging, especially in industries with strict regulations.
Solution: Databricks includes comprehensive data governance features, including data lineage tracking, access controls, and auditing capabilities, all integrated within the platform.
Benefit: This makes it easier for you to manage data governance for compliance and audit needs, ensuring secure access to data and making sure your data workflows are compliant with regulations.
5. AI and ML Enablement
Problem: Building and deploying machine learning models often requires specialized tools, which can be hard to integrate with data platforms.
Solution: Databricks integrates directly with tools like MLflow for managing the full ML lifecycle, from model training to deployment.
Benefit: This allows you to integrate machine learning models into your application easily, enabling more advanced analytics and AI-driven features such as emotion-based music recommendations.
6. Real-Time Data Processing
Problem: Many organizations struggle to process and analyze real-time data effectively.
Solution: Databricks supports real-time data streaming, enabling companies to process and analyze data as it arrives.
Benefit: For real-time applications, like the music recommendation system you’re working on, this allows instant processing of data inputs (such as user emotions or age), ensuring timely and relevant recommendations.
It was great!
What do you like best about the product?
The Databricks Data Intelligence Platform is highly regarded for several reasons:
Unified Data Management: It combines the best features of data lakes and data warehouses into a single platform, known as the Lakehouse. This allows for seamless management of both structured and unstructured data.
Scalability and Performance: The platform is designed to handle large-scale data processing and analytics, making it suitable for enterprises of all sizes. It offers robust scalability and high performance.
Open Source Integration: Databricks embraces open-source technologies like Apache Spark, Delta Lake, and
What do you dislike about the product?
Cost: Some users find the pricing to be on the higher side, especially for smaller organizations or individual users.
Complexity: Despite its powerful features, the platform can be complex to set up and manage, particularly for those who are new to data engineering and analytics.
What problems is the product solving and how is that benefiting you?
Data Silos: By unifying data lakes and data warehouses into a single Lakehouse architecture, Databricks eliminates data silos. This ensures that all data, whether structured or unstructured, is accessible from one platform.
Scalability Issues: The platform is designed to handle large-scale data processing, making it suitable for enterprises of all sizes.