External reviews
External reviews are not included in the AWS star rating for the product.
Enhancement of message distribution and security through diverse connection support
What is our primary use case?
My main use case for Apache Kafka is a large project in which Kafka is one of the main components for sending and receiving messages.
What is most valuable?
I appreciate that Apache Kafka is fast and secure. Running it on AWS lets me secure it at a high level. It performs well over a good connection, and its support for different connection types is helpful to our team.
The impact of Apache Kafka's scalability features depends on how many messages a company needs to handle. For high throughput, adding brokers and partitions helps. A company with a low message volume will not see much benefit, but for a high-volume company the difference can be significant.
What needs improvement?
I can't think of anything that needs improving in Apache Kafka. Our use cases are fairly basic, so I haven't considered improvements.
As a personal preference, since we use managed Kafka on AWS, I would appreciate an integrated UI for connecting to Apache Kafka. Connecting through code is simple enough, but a UI would be a convenient alternative.
For how long have I used the solution?
I have been using Apache Kafka for about six months.
What do I think about the stability of the solution?
I use Apache Kafka's topic partitioning feature for system stability.
This feature has improved our stability when handling high-volume data. We receive thousands of messages in a short period, and partitioning distributes those messages across all partitions, which keeps the system stable.
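To illustrate how keyed messages spread across partitions, here is a minimal Python sketch. It is not Kafka client code: the function names and the "device-N" keys are hypothetical, and CRC32 stands in for the murmur2 hash that Kafka's default partitioner applies to keyed records.

```python
import zlib
from collections import Counter


def pick_partition(key: bytes, num_partitions: int) -> int:
    """Map a message key to a partition index.

    Stand-in hash: Kafka's default partitioner uses murmur2. CRC32 is used
    here only because it ships with the Python standard library.
    """
    return zlib.crc32(key) % num_partitions


def distribute(keys, num_partitions):
    """Return the number of messages that land on each partition."""
    counts = Counter(pick_partition(k, num_partitions) for k in keys)
    return [counts.get(p, 0) for p in range(num_partitions)]


if __name__ == "__main__":
    # Hypothetical workload: 10,000 messages keyed by device id.
    keys = [f"device-{i}".encode() for i in range(10_000)]
    print(distribute(keys, 6))
```

Because the partition is derived from the key, messages with the same key always land on the same partition, which preserves per-key ordering while the overall load is shared.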
What was our ROI?
I have seen some ROI from Apache Kafka, although I can't recall specifics.
Which other solutions did I evaluate?
I believe Apache Kafka has better security than other options, although I can't compare directly because we didn't explore any. We understood Apache Kafka to be the standard choice, so we went with it.
What other advice do I have?
At this point, I don't have any specific examples to share. I don't recall whether I have used Kafka Connect to integrate data sources and sinks within my organization, so I can't say how it has benefited us.
On a scale of one to ten, I would rate Apache Kafka an eight.
Data streaming transforms real-time data movement with impressive scalability
What is our primary use case?
I worked with Apache Kafka for customers in the financial industry and on OTT platforms. They use Kafka primarily for data streaming. Companies that offer movies and entertainment as a service, similar to Netflix, use Kafka.
What is most valuable?
Apache Kafka offers unique data streaming. It lets you work with data in motion, propagating it from one source to another as it moves. This is valuable when data is not simply residing in a database.
What needs improvement?
In the data sharing space, the performance of Apache Kafka could be improved. The performance angle is critical, and while it works in milliseconds, the goal is to move towards microseconds.
For how long have I used the solution?
I started working with Kafka about five years ago while at a financial company.
What do I think about the stability of the solution?
Apache Kafka is stable. Even though enterprises often use the open-source version, there are minimal issues after configuration.
What do I think about the scalability of the solution?
Apache Kafka is very scalable. I would rate its scalability as nine out of ten. Customers have not faced issues with user growth or data streaming needs.
How are customer service and support?
The Apache community provides support for the open-source version. Despite being open-source, extensive documentation is available to resolve issues.
How would you rate customer service and support?
Positive
How was the initial setup?
The initial setup of Apache Kafka is straightforward, around an eight on a scale from one to ten. The deployment process involves configuring the publisher, subscriber, and other parameters. SaaS can be deployed from the cloud in a couple of hours.
What about the implementation team?
Since I work with the open-source version of Kafka, solutions are managed internally with the Apache documentation.
What's my experience with pricing, setup cost, and licensing?
The open-source version of Apache Kafka results in minimal costs, mainly linked to accessing documentation and limited support. Enterprises usually opt for the more cost-effective open-source edition.
What other advice do I have?
For critical business components, it is advisable to use Confluent-managed services for Kafka. However, for non-critical functions, the open-source version is sufficient.
Overall, I rate Apache Kafka as nine out of ten for its scalability and stability.
Achieves real-time data management with fast and fault-tolerant solutions
What is our primary use case?
We are always using Apache Kafka for our real-time scenarios. It helps us detect anomalies and attacks on our website through machine learning models.
What is most valuable?
We are managing our data by topics. Splitting topics is more effective for us. Apache Kafka is very fast and stable. It offers scalability with ease and also integrates well with our tools. Fault tolerance is a good feature, and it also has high throughput rates.
What needs improvement?
Config management can be better. We are always trying to find the best configs, which is a challenge.
For how long have I used the solution?
I have been working with Apache Kafka for more than four years. It has been used since the beginning of our department, maybe six years.
What do I think about the stability of the solution?
It is very stable and meets our needs consistently.
What do I think about the scalability of the solution?
If there is latency, our Kubernetes admin adds Kafka nodes to increase capacity. Kafka provides flexibility and integrates easily with Kubernetes.
Which solution did I use previously and why did I switch?
Before Apache Airflow, I used crontab. However, Apache Airflow makes it easy to follow and manage tasks, and data science departments can easily build their models or pipelines using it.
What other advice do I have?
I would rate Apache Kafka nine out of ten.
Transforms data with efficient real-time analytics and has robust streaming capabilities
What is our primary use case?
Currently, I work for an observability company. We stream customer data into our cloud, digest the information, enrich it, transform it, save it, and use on-the-fly aggregation with Kafka. Previously, I worked for a security company doing anomaly detection using streaming with Kafka.
I also worked for a company with a data platform based on Kafka, where we ingested clickstream data and enriched it before streaming.
What is most valuable?
The most valuable feature of Kafka is the Kafka Streams client. Unlike other systems like Flink or Spark Streaming, you don't need a separate engine to do real-time transformations and analytics. The amount of data that can be streamed into the platform and the scalability are also significant benefits.
What needs improvement?
Kafka requires fine-tuning to find the best architecture, number of nodes, and partitions for your use case. It’s a trial-and-error process with no one-size-fits-all solution. Issues may arise until it’s appropriately tuned.
While it can scale out efficiently, scaling down is more challenging, making deleting data or reducing activity harder.
For how long have I used the solution?
I have been working with the Kafka product for more than ten years.
What do I think about the stability of the solution?
Because Kafka is written in Java and runs on the JVM, it is not as stable as it should be. Stability depends on fine-tuning the system to find the best architecture for your use case. However, the replication factor helps avoid data loss despite the stability issues.
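As a rough illustration of why the replication factor protects against data loss, the following Python sketch computes how many broker failures a partition can absorb. The function name and result keys are hypothetical. The arithmetic assumes producers use acks=all together with the topic's min.insync.replicas setting.

```python
def broker_failure_tolerance(replication_factor: int, min_insync_replicas: int) -> dict:
    """Summarize failure tolerance for one partition.

    Assumes producers send with acks=all, so a write is acknowledged only
    after at least `min_insync_replicas` replicas have it.
    """
    if not 1 <= min_insync_replicas <= replication_factor:
        raise ValueError("min_insync_replicas must be between 1 and replication_factor")
    return {
        # Best case: every replica is in sync, so one surviving copy is enough.
        "failures_survivable_if_all_replicas_in_sync": replication_factor - 1,
        # Worst case: the in-sync set had already shrunk to the minimum.
        "failures_guaranteed_survivable_for_acked_writes": min_insync_replicas - 1,
        # Beyond this many failures, acks=all writes are rejected.
        "failures_before_writes_are_rejected": replication_factor - min_insync_replicas,
    }


if __name__ == "__main__":
    print(broker_failure_tolerance(replication_factor=3, min_insync_replicas=2))
```

With a replication factor of 3 and min.insync.replicas of 2, a common pairing, one broker can fail without blocking writes or losing acknowledged data.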
What do I think about the scalability of the solution?
Kafka's architecture allows for scalability by adding nodes and partitions to topics. However, it's not as effective in scaling in, making reducing activity and deleting data harder.
Scalability can be managed both manually and automatically to meet demands.
Which solution did I use previously and why did I switch?
I used to work with Spark Streaming and Flink, though not in the past year.
How was the initial setup?
If you are unfamiliar with Kafka, setting up the cluster can be quite difficult. You need to understand the architecture and components and compute the data volume upfront. For experienced individuals, the setup is less difficult yet still requires preparation.
What was our ROI?
From a time-saving perspective, onboarding new customers is straightforward, requiring them merely to stream their data into our platform.
What's my experience with pricing, setup cost, and licensing?
We use Apache Kafka, which is open-source, so we don't have fees. I can't comment on ownership costs as I am not responsible for that domain.
Which other solutions did I evaluate?
Apart from Kafka, I have experience working with Spark Streaming and Flink.
What other advice do I have?
When implementing Kafka, it's important to plan the cluster size upfront to ensure easy scalability. Adding or removing nodes can disrupt the clusters, so proper sizing and planning are key.
I would rate Kafka as a solution as a nine.
Asynchronous messaging excellence with enhanced streaming capabilities and an easy setup
What is our primary use case?
Kafka is used as a streaming platform where multiple producers and consumers exchange a high volume of messages asynchronously under heavy load, without affecting each other's performance.
It serves as an industry-standard platform for such operations. Kafka is also integrated into data system architecture for applications like monitoring events on platforms like LinkedIn to enable further analytical insights.
What is most valuable?
Kafka makes data streaming asynchronous and decouples the reliance of events on consumers.
It was the first of its kind to provide a streaming pipeline, setting a new component in the tech architecture and ecosystem. It allows continuous messaging without impacting performance.
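The decoupling described above can be sketched with a small in-memory model. This is not Kafka client code: `TopicLog` and the consumer names are hypothetical. It only shows the idea that producers append to a log while each consumer tracks its own read offset.

```python
class TopicLog:
    """Toy append-only log with an independent read offset per consumer."""

    def __init__(self):
        self._records = []   # appended messages, in arrival order
        self._offsets = {}   # consumer name -> next offset to read

    def append(self, value):
        """Producer side: add a record and return its offset."""
        self._records.append(value)
        return len(self._records) - 1

    def poll(self, consumer, max_records=10):
        """Consumer side: read up to max_records from this consumer's offset."""
        start = self._offsets.get(consumer, 0)
        batch = self._records[start:start + max_records]
        self._offsets[consumer] = start + len(batch)
        return batch


if __name__ == "__main__":
    log = TopicLog()
    for i in range(5):
        log.append(f"event-{i}")

    print(log.poll("analytics"))      # fast consumer reads everything so far
    print(log.poll("billing", 2))     # slow consumer reads only two records
    log.append("event-5")             # producer continues without waiting
    print(log.poll("analytics"))
    print(log.poll("billing", 2))
```

A slow consumer falls behind in its own offset but never blocks the producer or the other consumers.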
What needs improvement?
Confluent has improved aspects like documentation and cloud support, yet Kafka's reliance on older architectures like ZooKeeper in previous versions is a limitation.
Its language and architecture could be further improved to solve issues in consensus algorithms, as Redpanda does.
For how long have I used the solution?
I have been working with Kafka for about a year and feel comfortable using it.
What do I think about the stability of the solution?
I have not had any issues in terms of performance; however, there may be performance issues due to Java's garbage collector, which can cause memory issues if bloated.
What do I think about the scalability of the solution?
While I have not tried setting Kafka up on Docker containers, it is possible. I have only run a single-node broker for Kafka.
Which solution did I use previously and why did I switch?
I also use Redpanda, which is similar to Kafka in features. However, the two differ in their internal workings, which affects performance and resource usage.
How was the initial setup?
The setup process is straightforward if you follow the documentation. It involves unpacking the archive with the necessary packages and ensuring that Java (a JVM) is installed.
Previously, Kafka relied on ZooKeeper, requiring two configuration files. However, with the newer KRaft mode, the setup does not need ZooKeeper, which simplifies the process.
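For illustration, a single-node setup without ZooKeeper needs only one properties file. The Python sketch below renders a minimal example of such a file. The helper name, host, ports, and log directory are assumptions, and the exact set of required properties varies by Kafka version, so treat the sample configuration shipped with your distribution as authoritative.

```python
def kraft_single_node_properties(node_id=1, host="localhost",
                                 log_dir="/tmp/kraft-combined-logs"):
    """Render a minimal single-node KRaft configuration as properties text.

    Ports 9092/9093 and the log directory are illustrative defaults.
    """
    settings = {
        # One process acts as both broker and controller, so no ZooKeeper.
        "process.roles": "broker,controller",
        "node.id": str(node_id),
        "controller.quorum.voters": f"{node_id}@{host}:9093",
        "listeners": f"PLAINTEXT://{host}:9092,CONTROLLER://{host}:9093",
        "advertised.listeners": f"PLAINTEXT://{host}:9092",
        "controller.listener.names": "CONTROLLER",
        "inter.broker.listener.name": "PLAINTEXT",
        "listener.security.protocol.map": "CONTROLLER:PLAINTEXT,PLAINTEXT:PLAINTEXT",
        "log.dirs": log_dir,
    }
    return "\n".join(f"{key}={value}" for key, value in settings.items()) + "\n"


if __name__ == "__main__":
    print(kraft_single_node_properties(), end="")
```

The rendered text contains no ZooKeeper connection setting, which is the simplification the newer mode brings.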
What about the implementation team?
Apache Kafka was part of a college curriculum, and I set it up myself. I found setting it up manageable.
What other advice do I have?
I definitely recommend Kafka, as it is the industry standard for streaming platforms. While Redpanda is similar, Kafka remains the stronger choice in the market for its established support and usage in big companies.
I'd rate the solution nine out of ten.
Streamlined integration with diverse connectors enhances productivity and an open-source offering
What is our primary use case?
We primarily use Kafka for EventStream. We publish messages into the Kafka queues, and then at the other end, we consume those messages to perform tasks.
How has it helped my organization?
We have seen a return on investment with respect to improvements in productivity and ease of use, although these benefits are not easily quantified in dollar terms.
What is most valuable?
Kafka is scalable to any degree we want, and it has several connectors available for integration in multiple languages, making it easier for integration. Additionally, it is 100% stable for our use cases.
What needs improvement?
The UI used to access Kafka topics can be further improved. It's not very appealing, and there's potential for enhancement in that area.
For how long have I used the solution?
I have been using Kafka for at least five years in some form or another.
What do I think about the stability of the solution?
Kafka is 100% stable for my use cases.
What do I think about the scalability of the solution?
Kafka is scalable to a degree that suits our needs.
How are customer service and support?
We never had the opportunity to talk to Apache support as we generally obtained whatever we needed from Kafka.
How would you rate customer service and support?
Positive
Which solution did I use previously and why did I switch?
We primarily use Kafka and have not worked with other similar solutions.
What about the implementation team?
Our DevOps team takes care of maintaining Kafka, and generally, only one person is needed.
What was our ROI?
We have experienced ROI in terms of productivity and ease of use, even though it's not quantifiable in dollar terms.
What's my experience with pricing, setup cost, and licensing?
We use managed Kafka, so there is a cost involved. Although Kafka is free as open source, the managed service cost is on a pay-as-you-go basis, which is decent.
What other advice do I have?
I'd rate the solution nine out of ten.
Significant cost savings with real-time processing and fast recovery
What is our primary use case?
We use Kafka for a staged, event-driven process. Our platform is an ID platform, so registration data received from various registration locations has to be stored. The process includes stages such as quality checking, consistency, format, and biometric data checking. We have been using both Kafka and ActiveMQ for almost three years now.
How has it helped my organization?
Using Kafka has saved us costs compared to proprietary solutions. It has allowed for significant customizations internally, which would have been much costlier with other solutions. We have saved a lot by employing Kafka and only needing basic salary agreements for our experts.
What is most valuable?
The real-time data processing capability of Apache Kafka is a significant pro. We also find Kafka to be relatively stable under large data volumes and its performance to be consistent. The convenience in setting up after major problems like data center blackouts is a notable feature.
What needs improvement?
Kafka has some limitations in terms of queue management. Specifically, it lacks the capability to handle larger queues for external system interactions. It would be beneficial if Kafka included more robust, high-capacity queue management features for integration with external systems.
For how long have I used the solution?
We have been using Kafka for almost three years now.
What do I think about the stability of the solution?
Kafka is stable even with large data sizes. Even when the data size is at the highest possible level, Kafka runs relatively stably compared to other solutions we have used. I would rate its stability as a nine out of ten.
What do I think about the scalability of the solution?
Scalability is one of the strong suits of Kafka. Initially, we handled up to five thousand registrations of data items per day. Now we manage up to forty thousand. We have scaled Kafka on the same Kubernetes environment, and it has been very convenient without critical problems.
How are customer service and support?
We have not directly communicated with technical support for Kafka. Instead, we have relied on the open community for advice and employed senior experts to work on the platform.
How would you rate customer service and support?
Positive
Which solution did I use previously and why did I switch?
We did not use any other solutions before Apache Kafka for these use cases.
How was the initial setup?
The initial setup for Kafka was a bit difficult and required additional configurations. It took us about one to two hours to deploy the whole platform, which includes other components like PostgreSQL database, antivirus, and directory management solutions. Kafka itself would take only a fraction of this time.
What about the implementation team?
Our DevOps team currently consists of four people working directly on this platform. For Kafka, we also involve one person from our infrastructure team, making it a total of five people.
What was our ROI?
We have definitely seen a return on investment in terms of cost savings. Using Kafka as an open-source solution has saved us a lot compared to proprietary solutions. We have been able to make significant customizations internally at a lower cost.
What's my experience with pricing, setup cost, and licensing?
I would rate the overall cost of using Kafka as a three out of ten, indicating that it is rather affordable, considering the benefits and savings it provides.
Which other solutions did I evaluate?
We did not evaluate other options or vendors before choosing Apache Kafka. We followed recommendations from other experts.
What other advice do I have?
Kafka is one of the most convenient tools with a fair level of performance. Its stability is impressive, allowing us to recover from data center blackouts quickly. I highly recommend it for similar needs.
I'd rate the solution nine out of ten.
Enables us to send or push messages through a specified port
What is our primary use case?
Apache Kafka is a messaging solution where you have topics to pass on your information. You can send messages to multiple topics.
How has it helped my organization?
We need to manage limited resources. Additionally, we can send or push messages through a specified port. This is a significant feature because, unlike traditional queues, Kafka uses a cluster of nodes, making it easy to integrate with various algorithms. This clustering is an advantage and a key feature of Kafka, providing good interaction and scalability.
What is most valuable?
For example, when you want to send a message to inform all your clients about a new feature, you can publish that message to a single topic in Apache Kafka. This allows all clients subscribed to that topic to receive the message. On the other hand, if you need to send billing information to a specific customer, you can publish that message on a topic dedicated to that customer. This message can then be sent as an SMS to the customer, allowing them to view it on their mobile device.
What needs improvement?
Apache Kafka is different in its design. If you have topics around the front end of clusters in the facility, it is scalable. The software is scalable to handle and process data. However, it might not be suitable for handling specific types of images or media files. Other than that, it should handle the rest of the data processing needs.
There are no multiple versions, which simplifies the process of granting access with Kaspersky. Every message is accurately delivered. However, Kafka does not support sending messages directly. You need to publish messages finalization. If you want to resend a message, you must resend it manually. Kafka does not automatically handle this. Another thing is the need for a redo option if an issue occurs. If a message is not sent properly, it can be retransmitted within the core system. You should enable the gateway in your program for it to function correctly. Messages will not be delivered or refreshed unless you enable the direct replay option in the product settings.
For how long have I used the solution?
I have been using Apache Kafka since 2020-21.
How was the initial setup?
The initial setup of Apache Kafka is challenging and requires experience. Each message should always receive a response, so prioritizing traffic is essential. Furthermore, the client or consumer must always be in sync, or the message will not be processed.
What other advice do I have?
One pair of nodes is sufficient for the system. If our other system requires more than five nodes, it might not be feasible. Currently, other components are functioning as expected. The Kafka setup won't take much time.
When using Apache Kafka, it’s important to manage different environments carefully to avoid confusion. For instance, you can configure different client applications for producing and consuming messages. Ensure that the configurations for each environment (development, testing, production, etc.) are separated. This includes managing source code and data appropriately to maintain security and efficiency. Proper management of Kafka assets and operations phases is crucial for a smooth workflow.
I recommend Apache Kafka since it is extremely fast, stable and has been used for a very long time. We haven't encountered any major issues or concerns regarding its performance and customer service.
Overall, I rate the solution a nine out of ten.