Complete platform but a bit confusing
What do you like best about the product?
The ease of use and the number of tools available. It is very easy to start generating value even with little understanding of the tool.
What do you dislike about the product?
The usability is a bit confusing. There are a lot of amazing tools, but the best practices can be confusing to deploy. However, there is a lot of knowledge available elsewhere (provided by Databricks themselves or found through Google).
What problems is the product solving and how is that benefiting you?
The Data Intelligence Platform is solving the usability of new features and their implementation.
Processes large-scale data sets and integrates with Apache Spark in a notebook environment
What is our primary use case?
I primarily use Databricks to process large-scale data sets, such as 600 GB or 800 GB, with Apache Spark.
What is most valuable?
Databricks integrates natively with Apache Spark, which I use as a processing engine for large-scale datasets. This native integration is one of its strengths. Another strength is that the platform makes it very easy to manage resources. For example, setting up a cluster of five or fifteen nodes is straightforward with Databricks. The notebook environment is also excellent, making it easy to perform various tasks.
What needs improvement?
While Databricks allows you to upload your packages, we encountered some limitations with its capabilities, particularly with Apache Spark, which also affected Databricks. We had issues working with spatial data. You had to go through many steps to find libraries that could process spatial data in a distributed fashion.
For how long have I used the solution?
I have been using Databricks since 2018.
What do I think about the scalability of the solution?
I might have a project that runs for one or two months, and perhaps I won't use it for six months. Self-service is one of its strengths. I can shut down everything and easily spin up resources when I need to use them again. We have a dedicated group of fifty people who consistently use Databricks for analytics.
How was the initial setup?
The initial setup was very easy and took around 10-15 people. We have a data science infrastructure team helping with this.
What was our ROI?
Databricks stands out among most data platforms mainly because of its ease of use. The learning curve is not as steep, making it accessible for anyone to handle large-scale data processing on Databricks. This ease of use contributes positively to our return on investment. However, in our line of work, converting this efficiency into direct monetary gains can be challenging, given our nonprofit nature.
What's my experience with pricing, setup cost, and licensing?
We purchased high-performance laptops to reduce our reliance on the cloud. The main issue was the cost. Internally, if I used Databricks, that cost would return to my team. There was a time when my monthly cost was around ten thousand dollars, which was quite high. Due to these costs, several teams, including ours, moved away from using Databricks and other cloud providers. It became prohibitive, so we invested in our own high-performance computers internally instead.
What other advice do I have?
Databricks provides ease of use for me, particularly due to its seamless integration with Apache Spark. This integration simplifies the process of conducting machine learning on large-scale datasets.
I recommend this solution 100%. Overall, I rate the solution an eight out of ten.
Excellent platform
What do you like best about the product?
The best part about Databricks is how easy it is to start working. There is no need for setup, and I can use SQL, Python, or PySpark in the same notebook. It makes the work easier and faster.
What do you dislike about the product?
For data engineering study purposes, it is rather expensive.
What problems is the product solving and how is that benefiting you?
Analysis for my team.
Helps users with data processing and analytics
What is our primary use case?
I use Databricks to manage the setting up of data lakes for SaaS.
What needs improvement?
The biggest problem associated with the product is that it is quite pricey. We cannot find a better solution than Databricks in the market currently.
For how long have I used the solution?
I have been using Databricks for a year.
What's my experience with pricing, setup cost, and licensing?
It is an expensive tool. The licensing model is a pay-as-you-go one.
What other advice do I have?
The tool helps with data processing and analytics with large-scale data or big data since it is associated with managing data at a large scale.
For my general use cases, I would say that I am not a technical person, so I cannot explain to you how the tool helps with the area of data engineering tasks.
There is another team in my company that is involved in the use of machine learning and AI features in Databricks. My team is mostly into operations. The tool is used in a multi-country project.
For example, in my company, purchasing decisions for solutions are based on which product the whole company has chosen.
I rate the tool an eight out of ten.
Databricks Review
What do you like best about the product?
The greatest upside to the Databricks Platform is that it's constantly being developed. Databricks as well as other companies are developing code and utilities to run on this platform. Notably, Mosaic AI has a tool called Mosaic Composer, a low-code accelerator for training AI models, which has been very beneficial to use.
What do you dislike about the product?
I dislike that Databricks is beginning to abstract away some of the configurability options available, for example with Databricks serverless. I want to keep the ability to tailor a cluster and libraries to my specific use case rather than have it handled by Databricks.
What problems is the product solving and how is that benefiting you?
Decreasing time from data to model.
Databricks - Powerful Product for all Data and AI Needs
What do you like best about the product?
The Databricks Data platform is a single, unified, democratized solution for all data needs. I really like the recently launched serverless SQL warehouse feature, where you don't need to worry about provisioning EC2 machines in your cloud account. Apart from that, I really like Unity Catalog, SQL alerts, Databricks dashboards, and creating automated workflows (job flows) either via the Databricks API or by integrating the Databricks Airflow operator. The visibility that Databricks provides via audit tables, such as infrastructure cost and user activity, is something that sets it apart. Also, the pandas API on Spark makes it feasible to use distributed computing in existing Python pandas code without many changes.
What do you dislike about the product?
Sometimes there is unplanned downtime for the platform, which irritates us. Some documentation pages lack examples. The AI assistant is currently only able to give SQL responses for simple SQL asks. In cases of complex SQL and SQL failures, the `Diagnose error` output is not relevant.
What problems is the product solving and how is that benefiting you?
The benefit we get from the Databricks platform is that we don't need to manage in-house Spark plus a cataloging service, so we as a data team can focus more on generating useful insights from raw data. Using Databricks All Purpose clusters, we are able to provide our customers (Business Analytics, Product, Backend) with separate clusters for ad hoc analysis, so that one system doesn't impact another. We were able to generate quick ROI in terms of performance and fewer query errors while migrating from Amazon Redshift to Databricks.
My Databricks experience at work
What do you like best about the product?
Databricks is useful for its scalability: it handles large volumes of data as well as small ones.
What do you dislike about the product?
The main point needing attention is the dashboard part, which could be improved.
What problems is the product solving and how is that benefiting you?
Transferring large volumes of data quickly.
Databricks Review
What do you like best about the product?
I like to use it for creating my ETL pipelines. Its features help make data transformation easy.
What do you dislike about the product?
There is nothing to dislike as of now. It is best as it is.
What problems is the product solving and how is that benefiting you?
It helps us create and maintain ETL pipelines. The data transformation features are very useful to have in the pipelines. Its data security and integrity are very useful for maintaining a healthy pipeline.
Databricks Review.
What do you like best about the product?
The best thing is that I can choose any cloud provider for my infrastructure. The interactive environment called Notebooks, which allows you to write and execute code, is very easy to use. The magic commands are very helpful. The UI is good and interactive.
What do you dislike about the product?
Nothing as of now, but more features could be added.
What problems is the product solving and how is that benefiting you?
I can choose between a single-node and a multi-node cluster according to our requirements, which solves our problem. I can create workflows easily by connecting them to notebooks, and I can also pass external parameters. Spot instances save a lot of cost. The Photon feature also helps with fast query performance at low cost.
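As a rough illustration of the points above, the sketch below builds a request body of the kind accepted by the Databricks Jobs API 2.1 `jobs/create` endpoint: one notebook task with external parameters, a single-node or multi-node cluster, spot instances, and Photon. It only constructs and prints JSON and does not call Databricks. The function name `build_job_payload`, the job name, the notebook path, the runtime version, and the node type are hypothetical placeholders, and the single-node settings are an assumption that should be checked against the current cluster documentation.

```python
import json


def build_job_payload(name, notebook_path, parameters, num_workers,
                      use_spot=True, use_photon=True):
    """Build a dict shaped like a Jobs API 2.1 `jobs/create` request body."""
    cluster = {
        "spark_version": "13.3.x-scala2.12",  # assumed runtime version
        "node_type_id": "i3.xlarge",          # assumed AWS node type
        "num_workers": num_workers,           # 0 means single node
    }
    if num_workers == 0:
        # Assumed single-node settings: Spark runs locally on the driver.
        cluster["spark_conf"] = {
            "spark.databricks.cluster.profile": "singleNode",
            "spark.master": "local[*]",
        }
        cluster["custom_tags"] = {"ResourceClass": "SingleNode"}
    if use_spot:
        # Prefer spot instances, fall back to on-demand if none are available.
        cluster["aws_attributes"] = {"availability": "SPOT_WITH_FALLBACK"}
    if use_photon:
        cluster["runtime_engine"] = "PHOTON"

    task = {
        "task_key": "run_notebook",
        "notebook_task": {
            "notebook_path": notebook_path,
            # External parameters, readable in the notebook through widgets.
            "base_parameters": dict(parameters),
        },
        "new_cluster": cluster,
    }
    return {"name": name, "tasks": [task]}


if __name__ == "__main__":
    payload = build_job_payload(
        "nightly-etl", "/Shared/etl_notebook",
        {"run_date": "2024-01-01"}, num_workers=4,
    )
    print(json.dumps(payload, indent=2))
```

Keeping the payload in a plain function makes it easy to switch between the single-node and multi-node variants by changing only `num_workers`.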
Great product for data engineering - playing with data gets easy.
What do you like best about the product?
Creating data products with Databricks is easy. We can easily deliver our product to clients on time.
Data ingestion and data curation with great performance make the work easy.
Cost effective: clusters are launched when needed and stop after the work is done, which saves a lot of cost compared with other data processing tools.
An understandable, sequential architecture helps to manage the tool.
Customer support is awesome. They helped solve all the issues I faced.
Integrating and setting up Databricks using Terraform helps a lot in managing the workspace.
What do you dislike about the product?
It takes a long time to restart a cluster. This should be improved.
What problems is the product solving and how is that benefiting you?
z