I do not use the tool myself. Our developers and data analysts use it.
Reviews from AWS customer
5 star: 0
4 star: 0
3 star: 0
2 star: 0
1 star: 0
External reviews
External reviews are not included in the AWS star rating for the product.
A flexible solution with good documentation and integration
What is our primary use case?
What is most valuable?
The tool's most valuable feature is a database. It supports portal APIs and offers good flexibility. While it may not be the best on the market, it is the best open-source solution we have tried. It has a development community and good documentation, though not all of it is published.
The tool's integration with other tools is not complex. We use it alongside Kafka and Tableau.
For how long have I used the solution?
I have been using the product for four years.
What do I think about the scalability of the solution?
Every customer I've worked with over the past few years uses ClickHouse, including many Russian companies and those related to Russia.
How are customer service and support?
I have some experience talking with the tech support team. It was an open-source project at one point, so I used community resources for help. The best way to communicate with them was through their Telegram channel, which had support available in both English and Russian.
How was the initial setup?
Regarding the initial installation, setup, and deployment, I can say it's easy for someone with my engineering skills. I prefer managing the installation myself rather than relying on out-of-the-box solutions.
What other advice do I have?
ClickHouse is good for analytics. Using ClickHouse is beneficial if you understand its specific purpose and advantages. Many engineers and developers mistakenly think it is an alternative to relational databases like Postgres or MySQL, but it's not. ClickHouse has a different architecture and purpose, primarily excelling at analytical queries rather than traditional CRUD operations.
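To make the difference between CRUD-style and analytical access concrete, here is a toy Python sketch. It is not ClickHouse code: plain lists stand in for storage layouts, and the table contents and function names are hypothetical.

```python
# Toy illustration only: Python lists stand in for on-disk layouts.
rows = [
    {"user_id": 1, "country": "DE", "amount": 10.0},
    {"user_id": 2, "country": "US", "amount": 25.5},
    {"user_id": 3, "country": "DE", "amount": 4.5},
]


def get_user(user_id):
    """Row-oriented access (typical CRUD): fetch one whole record by key."""
    return next(r for r in rows if r["user_id"] == user_id)


# Column-oriented layout: the values of each column are stored together.
columns = {key: [r[key] for r in rows] for key in rows[0]}


def total_amount_for(country):
    """Analytical query: the aggregate reads only the two columns it needs."""
    return sum(
        amount
        for c, amount in zip(columns["country"], columns["amount"])
        if c == country
    )


print(get_user(2))
print(total_amount_for("DE"))
```

The point of the sketch is only the access pattern: a point lookup wants the whole row, while an aggregate scans a few columns across many rows, which is the workload a columnar engine is built for.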
If you join our team, it should be easy for you to use ClickHouse, especially if you are a developer. However, you need to read the documentation and understand the problems you are trying to solve. As an infrastructure engineer, it shouldn't be hard either.
I rate the overall solution an eight out of ten.
Which deployment model are you using for this solution?
Query engine is super fast, but integration with third-party applications or the cloud needs improvement
What is our primary use case?
Our use cases are data analytics, both real-time and batch, and logging clickstream data.
We use it in our organization. We have it in our production environment.
What is most valuable?
The query engine is super fast. We deploy ClickHouse on our Kubernetes cluster, not as a cloud subscription, so it's easy to scale with the deployment.
What needs improvement?
Some features, like connecting to third-party applications or the cloud, could be better.
For how long have I used the solution?
I have been using it for one year.
What do I think about the stability of the solution?
One issue is that you need persistent volumes. Otherwise, if one system goes down, you lose data in that cluster.
Another issue is performance. You have to make sure you have the right configurations; otherwise, all your jobs will end up queued.
What do I think about the scalability of the solution?
It is a scalable product.
How are customer service and support?
You only get technical support when you take the cloud subscription. If you have it in-house, you won't get any support. If you have a cloud subscription, then the support is pretty good. You can raise a ticket from the UI, and they will respond within 24 hours.
So, the support team is pretty good but there is a little room for improvement.
How would you rate customer service and support?
Neutral
How was the initial setup?
The initial setup is pretty difficult since we deployed it in-house. We didn't use the cloud subscription, so we have to handle the deployment very carefully.
The challenge was deploying it and having the replication concept working. Another challenging feature is persistent volumes. You have to make sure the data is available on all clusters; otherwise, if one cluster goes down, you'll lose all your data. It's better to have it replicated.
We first used the cloud subscription, but we saw a possibility to reduce costs, so we tried deploying the open-source ClickHouse on-premises. That saved us money, but we didn't get all the features that come with the subscription.
What about the implementation team?
We did it in-house.
What's my experience with pricing, setup cost, and licensing?
Pricing for the cloud version is alright, not very costly or cheap.
But if you have an in-house deployment on Kubernetes or something, it's going to be very cheap since you'll be managing everything.
What other advice do I have?
I would tell other users to do a POC because it depends upon the business use case and the data. They can explore first. There's another open-source option called Apache Druid, which is a little better than ClickHouse. If that doesn't fit the use case, then they could go for ClickHouse.
Overall, I would rate the solution a seven out of ten.
If you have a real-time use case, you should take a look at ClickHouse because it uses a column-oriented engine with vectorized query execution, and the querying is super fast compared to traditional databases. So, if your use case is real-time analytics, logging, or real-time dashboarding, then ClickHouse is a tool to consider. Otherwise, if it's batch processing and you can tolerate some latency, then you should go for other databases.
Which deployment model are you using for this solution?
If ClickHouse was a car it would be the Lightning McQueen of data.
Strongest and most powerful DB for large-scale data analysis
Great data store solution for large data analytics
The count operation in particular is super quick.
The read time and processing are very optimal compared to other solutions.
It compresses and stores data and that saves a lot of disk space.
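A rough way to see why column-wise storage compresses well is to compress a single repetitive column. The sketch below uses Python's zlib purely as a stand-in; ClickHouse uses its own codecs (such as LZ4 and ZSTD), and the status-code column here is made-up data.

```python
import zlib

# Hypothetical column of HTTP status codes, sorted so equal values sit together.
status_column = ["200"] * 9000 + ["404"] * 900 + ["500"] * 100

# Serialize the column as newline-separated text, then compress it.
raw = "\n".join(status_column).encode("utf-8")
compressed = zlib.compress(raw)

ratio = len(raw) / len(compressed)
print(f"raw: {len(raw)} bytes, compressed: {len(compressed)} bytes, ratio: {ratio:.0f}x")
```

Real tables will compress less dramatically than this artificial column, but the effect is the same in kind: values of one type stored next to each other repeat far more than mixed row data does.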
ClickHouse is a very good OLAP-based system that works more efficiently than any RDBMS like MySQL.
Over 100 billion records in my hands
ClickHouse does a lot of things right, but it is still not stable for production use.
For example, they have an engine for almost every use case.
The fastest and most powerful DB ever used
Fast, Very Very Very Fast, Rich SQL, Good compression algorithms
Detail:
1) Rich SQL syntax. There are a lot of built-in functions (including GeoDistances, Uber Hexagons support, time functions, comprehensive math, and many others). Functions can be combined (e.g. with If: sumIf, avgIf, etc.), which is very convenient.
2) Fast. Arrays and MapReduce make CH work LIGHTNING FAST. I was extremely surprised when several GBs of data were processed in under 1 second.
3) Fast. Materialized Views work differently than in other databases, but the correct usage allows instant processing of TBs of data.
4) Efficient. A vast number of data types and compression algorithms helps you store data extremely efficiently. Make sure to check the docs and choose the best compression types for your tables. You will be very surprised.
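For readers who have not seen the If combinator from point 1, here is a small Python analogue of what sumIf and avgIf compute. It is a sketch of the semantics, not ClickHouse code; the orders data and the helper names sum_if and avg_if are hypothetical.

```python
import math


def sum_if(values, conditions):
    """Sum only the values whose matching condition is true."""
    return sum(v for v, c in zip(values, conditions) if c)


def avg_if(values, conditions):
    """Average only the matching values; this sketch returns NaN if none match."""
    selected = [v for v, c in zip(values, conditions) if c]
    return sum(selected) / len(selected) if selected else math.nan


# Roughly what a query like
#   SELECT sumIf(amount, status = 'paid'), avgIf(amount, status = 'paid') FROM orders
# would compute over a hypothetical orders table.
amounts = [10, 20, 30, 50]
statuses = ["paid", "refunded", "paid", "paid"]
is_paid = [s == "paid" for s in statuses]

print(sum_if(amounts, is_paid), avg_if(amounts, is_paid))
```

The convenience is that several conditional aggregates can be computed in one pass over the data, without writing a separate filtered query for each.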
Overall, you can see that very smart engineers worked on this database. It was made by engineers for engineers.
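On point 3: in ClickHouse a materialized view behaves like an insert trigger on its source table, so aggregates are updated as data arrives rather than recomputed at query time. The Python sketch below imitates that idea under simplifying assumptions; the Table and DailyTotalsView classes are hypothetical and leave out everything about storage and merging.

```python
from collections import defaultdict


class Table:
    """Toy source table that notifies attached views on every insert."""

    def __init__(self):
        self.rows = []
        self.views = []

    def insert(self, batch):
        self.rows.extend(batch)
        for view in self.views:
            view.on_insert(batch)  # the view sees only the new batch


class DailyTotalsView:
    """Toy materialized view: keeps a running total per day."""

    def __init__(self, source):
        self.totals = defaultdict(float)
        source.views.append(self)

    def on_insert(self, batch):
        for row in batch:
            self.totals[row["day"]] += row["amount"]


events = Table()
view = DailyTotalsView(events)
events.insert([{"day": "2024-01-01", "amount": 5.0},
               {"day": "2024-01-01", "amount": 2.5}])
events.insert([{"day": "2024-01-02", "amount": 1.0}])

# Reading the totals does not rescan events.rows.
print(dict(view.totals))
```

As in this sketch, a view attached after data has already been inserted only sees later inserts, which is one reason the reviewer stresses correct usage.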
You gotta read docs first. Requires ZooKeeper.
Detail:
1) I faced a lot of problems with ZooKeeper, partitioning, sharding, and replication.
2) Learning is not that easy, but it is worth it.
CH allowed us to store data efficiently and process it in real time. It's very fast.