Exceptional performance for end-to-end data management
What do you like best about the product?
I used Databricks to optimise the customer segmentation strategy for a retail campaign. It helped me analyse millions of records, clean the data and build an ML model based on purchasing behaviour. The Delta Lake technology ensured data consistency during the process. Its ability to integrate with our Azure data lake made it easy to access datasets.
What do you dislike about the product?
Tableau integration with Databricks was challenging and I encountered issues while setting up real-time data visualisation. Despite the challenges, the platform enabled me to automate data pipelines, which saved me hours.
What problems is the product solving and how is that benefiting you?
Our operations team used Databricks to monitor and optimise supply chain performance. It has become an essential tool for us to enhance both individual productivity and team collaboration. Its impact can be felt across multiple projects.
Capability to integrate diverse coding languages in a single notebook greatly enhances workflow
What is our primary use case?
I work as a data engineer at Fractal. On a daily basis, I work on Azure Cloud, and I use Databricks frequently. We have ADF pipelines and use Synapse for our daily tasks.
What is most valuable?
Databricks offers various languages that I can use, whether it's PySpark, Scala, or R. I can use all of these languages in a single notebook, which is beneficial for clients as they can access various tools in one place whenever needed. This is quite significant.
I usually work with PySpark based on client requirements. After coding, I feed the Databricks notebooks into the ADF pipeline for updates. Databricks' capability to process data in parallel enhances data processing speed. Furthermore, I can connect our Databricks notebook directly with Power BI and other visualization tools like Qlik. Once we develop code, it allows us to transform raw data into visualizations for clients using analysis diagrams, which is very helpful.
What needs improvement?
As a data engineer, I see cluster failure in our Databricks user databases as a major issue. I am unsure why, but our flow, which typically involves three to four notebooks, sometimes leads to cluster failure. Despite attempts to identify the problem, there are times when the reason remains unclear. Adjusting settings such as worker nodes and node utilization during cluster creation could mitigate these failures.
For how long have I used the solution?
I have been using the solution for three years now.
What do I think about the stability of the solution?
Cluster failure is one of the biggest weaknesses I notice in our Databricks setup.
Which solution did I use previously and why did I switch?
Databricks is beneficial for cost-saving since clients I work for transitioned from AWS Cloud to Azure Cloud for this reason.
How was the initial setup?
The initial setup is very straightforward for us.
What's my experience with pricing, setup cost, and licensing?
I am not very aware of the pricing. We use three to four clusters in our project. Increasing the number or size of clusters, such as adding more workers, would result in higher costs. That's why we limit ourselves to four clusters for our business.
Which other solutions did I evaluate?
In terms of cost efficiency, it's very useful because our clients switched from AWS Cloud to Azure Databricks to save costs.
What other advice do I have?
I would rate the overall product eight out of ten.
Everything is probably good as far as I have used it, but there's room for improvement in cluster integration. Enhancing cluster capabilities while keeping costs lower would be beneficial.
Which deployment model are you using for this solution?
Public Cloud
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Microsoft Azure
The Best Data Engineering Tool uses Delta Lake
What do you like best about the product?
This tool is very efficient because it uses Delta Lake. It supports ETL pipelines and machine learning workflows, which help to extract and transform data into various forms. I also like the interactive notebooks, which support the Python language.
AutoML and Delta Lake are the best features.
What do you dislike about the product?
In the beginning the tool was complex to use, but it has since become simple.
What problems is the product solving and how is that benefiting you?
The problems this tool solves are hectic data analysis and the processing of many types of data.
The gold standard for scalable ML and Analytics
What do you like best about the product?
My team recently used Databricks to implement a machine learning model for fraud detection. We used Delta Lake for data preprocessing and ensured real-time updates from our database. One of the most helpful features in Databricks is the Delta Lake functionality, which ensures data consistency. The platform supports both Python and SQL, which bridges the gap between data engineers and analysts. This makes it easy for teams to collaborate. Customer support is another highlight, as they respond quickly and provide clear guidance.
What do you dislike about the product?
While integrating Databricks with our existing Azure Data Lake, we faced issues syncing access permissions for multiple datasets. Additionally, its pricing model makes it better suited to large organisations; for smaller teams, scaling up can be expensive.
What problems is the product solving and how is that benefiting you?
In recent projects, our sales and operations teams needed a unified view of supply chain metrics. Using Databricks, we collected data from multiple sources, created a centralised dashboard and enabled real-time reporting. This improved our decision-making speed and helped us prevent bottlenecks.
The best Big Data Processing Tool
What do you like best about the product?
I have used this tool for the past two years. The attractive features were faster data processing and data warehousing. I can easily integrate it with Power BI, which makes it easy to implement.
What do you dislike about the product?
I don't like the interface of this tool, and there are also latency issues.
What problems is the product solving and how is that benefiting you?
My main problem was processing data from clients and uploading the processed data to the cloud. With this tool, the task became very easy.
The go-to platform for scalable data analysis and AI.
What do you like best about the product?
Databricks provides excellent tools for data engineering, machine learning and business analytics. The interactive notebooks make exploring datasets straightforward, with support for multiple languages such as Python, SQL and Scala. We used Databricks to create a centralised data pipeline for customer sentiment analysis. With its ability to handle streaming data, we integrated Twitter feeds, customer reviews and support tickets into a single dataset.
What do you dislike about the product?
We faced difficulties while integrating Databricks with an on-premises database due to limited support for hybrid environments. This required building a custom connector, which took additional time.
What problems is the product solving and how is that benefiting you?
Our data science team used Databricks to build a recommendation engine for our e-commerce client. This was only possible because of Databricks' ability to process large datasets efficiently and provide insights faster. Also, collaborative notebooks enabled team members to debug issues and refine the algorithm together.
Revolutionizing Data analytics and AI integration
What do you like best about the product?
MLflow and collaborative notebooks are the main features of this tool. Other features I like are the Data Lake storage layer and AutoML model training, which help with efficient processing.
What do you dislike about the product?
I don't like the SQL Analytics feature. It gives errors most of the time and needs improvement.
What problems is the product solving and how is that benefiting you?
We use this tool for data warehousing and bulk data processing. It helps us use our time more efficiently.
Robust features for enterprise level workflows
What do you like best about the product?
Databricks is one of the easiest-to-use products for data scientists and engineers. The collaborative workspace is a standout feature, allowing our team to share notebooks, visualise data together and manage products. My team used Databricks to build a predictive model for customer churn analysis. Features like AutoML accelerated model development, and integration with Azure Data Lake made it easy to pull in live data streams.
What do you dislike about the product?
During an implementation project, our team struggled to configure cluster settings to optimise cost and performance. We wasted time troubleshooting issues because the documentation didn't provide clear guidance on this specific case.
What problems is the product solving and how is that benefiting you?
Its scalability has been a game changer for our organisation. By leveraging its cluster computing capabilities, we can handle large datasets efficiently, which enables quick analysis of projects and provides valuable insights.
Big Data processing using Databricks
What do you like best about the product?
The best features of this tool are the end-to-end machine learning life cycle and the API for data processing. The tool also has Delta Sharing and cross-functional collaboration, which allow big data to be processed efficiently.
What do you dislike about the product?
We can't handle some specific use cases in data processing, and the tool is also not affordable.
What problems is the product solving and how is that benefiting you?
We used this product because of its data reliability and security. The tool offers strong security and high-speed data processing.
The tool that made data management easier for us
What do you like best about the product?
One thing that strikes me about Databricks is that the platform offers robust methods for working with big data while maintaining very high efficiency. I also like how it can be set up to run multiple jobs at once, which is highly beneficial for working on different kinds of datasets in parallel. The integrated collaboration tools enable multiple authors to edit a given document simultaneously, which enhances the flow of information in the team.
What do you dislike about the product?
The issue that Databricks could address is the lack of tools for identifying and fixing problems. While the platform is useful, it's sometimes not clear where errors lie, so faults are not always easy to identify. Because error messages are vague, this can mean extra time spent on error analysis in large process arrangements. At times it takes considerable effort to search for workable solutions, which slows the process.
What problems is the product solving and how is that benefiting you?
It used to be difficult to perform data tasks end to end, but Databricks has solved that problem. The software has made it much easier to automate our repetitive work, such as data cleaning and formatting, which frees up time for the crucial parts of our projects. This automation has not only saved us time but has also made our process highly standardized. The gains in speed and accuracy have helped the business make faster decisions with data.