IBM StreamSets

IBM Software

Reviews from AWS customer

3 AWS reviews

External reviews

116 reviews
from G2 and PeerSpot

External reviews are not included in the AWS star rating for the product.


4-star reviews

    Mohammad I.

Capable streaming data processing tool

  • September 06, 2023
  • Review provided by G2

What do you like best about the product?
These are the things I liked most about StreamSets:

a. Built-in connectors (on-premises version), which make it usable with almost every source/target system.
b. The GUI is user friendly, and it has certainly helped my platform team create streaming data pipelines faster (previously we were using PySpark).
c. Along with the tool, the StreamSets support team is also excellent.
d. The availability of StreamSets Academy, through which we can get our resources trained easily.
What do you dislike about the product?
Fewer connectors are available in the cloud version of StreamSets.
The inability to support "exactly once" delivery of data creates limitations in a few use cases. We have managed this through a workaround, but having this ability in StreamSets would certainly help.
What problems is the product solving and how is that benefiting you?
1. It has allowed us to perform CDC on mainframe data and put the data into Kafka topics, which can be used by multiple platforms as per their requirements.
2. It has helped us create event-based real-time pipelines, which are being used to generate marketing prompts for customers.
3. The development time (as compared to PySpark) has been reduced, as it is a low-code GUI tool.
4. It has also helped in reducing our dependency on other ELT tools, e.g. Informatica DEI.


    Marketing and Advertising

StreamSets makes the day-to-day easier

  • August 11, 2023
  • Review provided by G2

What do you like best about the product?
How quickly a pipeline can be deployed and made to work. On the other hand, the large number of connectors that can be used allows you to connect with almost any data source.
What do you dislike about the product?
In my particular case, I would love for StreamSets to have more direct connections to Google Cloud Platform services, such as stages that can directly execute workflows or cloud functions.
What problems is the product solving and how is that benefiting you?
Migrating to Streamsets platform from Streamsets Control Hub really helped us mitigate certain connectivity issues with Control Hub that our DataCollectors were having.


    Banking

A great tool to work with Streaming Data

  • August 04, 2023
  • Review provided by G2

What do you like best about the product?
1. It has multiple built-in components to connect with most sources/targets.
2. Its ability to handle and perform transformations on streaming data easily and effectively.
3. Topologies are quite good and provide visibility into how systems are connected and how data flows across the enterprise.
4. Orchestrating and scheduling jobs is quite easy.
What do you dislike about the product?
1. Debugging is a bit difficult; the error messages need slight improvement.
2. Latency should be reduced, as working with large datasets takes a bit of time.
What problems is the product solving and how is that benefiting you?
It helped us collect and transform real-time data so that it can be used to generate customized messages for customers.
It has also helped reduce our dependency on costly Informatica and Ab Initio tools.


    qi t.

Graphical interface for easier use

  • August 04, 2023
  • Review provided by G2

What do you like best about the product?
The graphical interface makes complex ETL processes easier.
What do you dislike about the product?
It would help if there were a service that could convert a query to code automatically.
What problems is the product solving and how is that benefiting you?
It is easy to use, and having more connections helps me connect to different types of sources.


    reviewer2238417

Ease of configuring and managing pipelines centrally

  • July 21, 2023
  • Review provided by PeerSpot

What is our primary use case?

We are using StreamSets to migrate our on-premise data to the cloud.

What is most valuable?

I really appreciate the numerous ready connectors available on both the source and target sides, the support for various media file formats, and the ease of configuring and managing pipelines centrally. It's like a plug-and-play setup.

What needs improvement?

StreamSets should provide a mechanism to perform data quality assessment while data is being moved from the source to the target, that is, the ability to validate the data against various data rules. Then, when a data quality assessment fails, it should be able to send alerts or information to help people understand the data validation issues.

For how long have I used the solution?

I have been using StreamSets for a year and a half.

What do I think about the stability of the solution?

It's reasonably stable.

What do I think about the scalability of the solution?

It's reasonably easy to scale. Around 25 to 30 end users are using this solution in our organization.

How are customer service and support?

Customer service and support are good.

How would you rate customer service and support?

Positive

How was the initial setup?

It's reasonably easy to deploy. However, since it is used at an enterprise level, it requires maintenance. So we had a maintenance contract.

In the financial industry, we have very strict regulations around deploying something in the cloud. So, it requires a lot of permission and other processes.

Just one person is enough for the maintenance.

What's my experience with pricing, setup cost, and licensing?

The pricing was reasonably economical and easy for us to afford when we engaged with StreamSets. It was not part of Software AG at that time.

What other advice do I have?

It's a very good tool. Overall, I would rate the solution an eight out of ten.


    Saket Pandey

Provides a good bifurcation rate and accuracy, and saves time and money

  • May 17, 2023
  • Review provided by PeerSpot

What is our primary use case?

We were receiving data from hospitals or any kind of healthcare service providers in the country. We were dominantly operating in the US. When we received that data, we had to classify it into different repositories or different datasets. This data was sent to different vendors, and for that, the data needed to get processed in different ways. We needed to bifurcate data at many steps with different kinds of filters. For that, we used StreamSets.

How has it helped my organization?

We could bifurcate the datasets that we received from different hospitals. We could bifurcate it on the basis of the medical requirements of the hospitals, and sometimes, on the basis of the schedule or purpose. We were obtaining data that we could then supply to some consulting firms or other sources.

StreamSets saved us time. The accuracy was pretty good, and it was definitely better than what we were using previously. Earlier, we had hired two people who were doing the job manually, and we were also using some other platform. We had to pay for them. Overall, we have saved a lot of time, and the accuracy has improved as well. We didn't calculate the time savings, but I believe we saved about three days in a week, so there were about 30% to 40% time savings.

StreamSets reduced the workload. There was a 10% to 15% reduction in the workload.

StreamSets helped us to scale our data operations. The limit at which we purchased this solution was incredible. We were never able to reach the limit that we purchased, but it helped us to increase or scale our operation. Especially in months when we received a higher number of entries, we were able to perform our work on time.

What is most valuable?

The ability to have a good bifurcation rate and fewer mistakes is valuable. In the scenario we had, when we had to bifurcate the data, we did not completely cut the data. We made a different route for one set of data, which went into a different operating system. There was also a complete set of data along with the original data that got cut, which once again went through the filtration process, and in this way, it kept on happening. The other solutions that were in place did not offer this capability. With the other solutions that we were using earlier, we had to reuse the data again and again from the start. It was a time-consuming process.

Their support system was pretty good. When we were setting up the bifurcation protocols that we wanted to set up, we had a few support calls with them, and those were really helpful.

What needs improvement?

The design or the way they have set up the protocol is pretty good. One thing that I would like to add is the ability to manually enter data. The way the solution currently works is we don't have the option to manually change the data at any point in time. Being able to do that will allow us to do everything that we want to do with our data. Sometimes, we need to manually manipulate the data to make it more accurate in case our prior bifurcation filters are not good. If we have the option to manually enter the data or make the exact iterations on the data set, that would be a good thing. It does not have that feature. None of the solutions provides this feature, but this is the feature that we are looking for. If we could bifurcate the data or do manual manipulation of data at any point in time, it would be a game changer.

Its initial setup could also be a bit easier.

For how long have I used the solution?

I used this solution for about a year.

What do I think about the stability of the solution?

It's a stable product. We used it for about a year, and we hardly had to shut it down.

What do I think about the scalability of the solution?

We are a medium enterprise. We only have three departments in our company, and only one of the departments is using it. Salespeople don't use it. The development people don't use it. We are the ones using it, and our job is to process the information, so only one department is using the solution. We have about 18 people in the department.

Up to medium enterprises, it's a good choice. You can scale between one million to ten million data files. I don't believe they offer the service for a hundred million or one billion datasets. It isn't too scalable for large enterprises, but for small and medium enterprises, it's good.

How are customer service and support?

I'd rate them an eight out of ten. The only reason for not giving them a ten out of ten is that if you're doing very important work and you need to get the solution the same day, it's a bit tough to have the team support you in a very short period of time. They usually give you appointments about a day or two days later. Other than that, everything is good.

How would you rate customer service and support?

Positive

Which solution did I use previously and why did I switch?

We were using another solution previously. The major reason for switching to StreamSets was that we needed to scale our operations. Our prior solution could have been scaled, but the cost of scaling was a bit higher. We would have had to hire one more person to be able to scale, but we did not want to hire more people, so we decided to use a completely automated solution for this part so that it could be handled by only one of our team members. That was the primary requirement. The cost-benefit analysis was done by one of our peers. His proposal was pretty good, and everyone agreed to it.

How was the initial setup?

Its initial setup is a bit tough. You need to have the technical expertise to do that. The support team is good. They help you around, but if they could make it a bit easier, it would be better.

I believe it operates only from the cloud. We also received the data from our associations on the cloud. We processed it on the cloud, and everything happened on the cloud.

The initial setup was complex because we were not able to directly link the data we were receiving with the StreamSets solution. Linking it required us to fill in or enter some information in StreamSets, but we were not able to figure out what to enter. For that part, we needed their help.

We spent about a week. For the first three days, our team members were trying their best to do it, but then we had to schedule a meeting with them. In terms of the number of people, only one person was working with our team, and there were three people working with the product. I was also involved in the product as a product manager, but I was not directly operating that system.

It didn't require any maintenance as such. Any maintenance activities were related to our side of things. There were mistakes on our end. When we were entering different data, we had to do different configurations in the system.

What was our ROI?

We did the cost-benefit analysis before buying the solution, and it performed even better than that. We were able to replace two of our staff members who were doing this work. The cost that we paid for this solution was quite low compared to their salaries, so on the cost-benefit side of things, it was a good deal. We saved about two people's wages for manual work, which is about $6,000 a month, and we also saved 15% of a week's time. These two were the biggest returns on the investment. The accuracy was also a bit higher.

What's my experience with pricing, setup cost, and licensing?

Its pricing is pretty much up to the mark. For smaller enterprises, it could be a big price to pay at the initial stage of operations, but the moment you have the Seed B or Seed C funding and you want to scale up your operations and aren't much worried about the funds, at that point in time, you would need a solution that could be scaled. Simultaneously, you need a solution that you don't want to use on a very long-term basis. This solution could not be applied if we were operating with all the hospital chains in the US. We were operating just with one hospital. That's why it worked pretty well, so for medium enterprises, I believe it's very good.

What other advice do I have?

To those evaluating StreamSets, I'd advise doing a cost-benefit analysis because the way of using StreamSets differs from person to person. Someone else might have a very different use case, and they may not run into profit using the solution. For us, it was a good solution because we were hiring people for this work. People were doing the job manually. We saved both time and money, so doing a cost-benefit analysis would be the best thing.

If you are looking to expand your domain or range of operations, StreamSets is very helpful. If you are just looking for a better data analytics tool that can do bifurcation on data, I believe there are other tools or services available in the market that do not focus on the expansion of operations. They focus on doing better and more complex bifurcations.

StreamSets enables you to build data pipelines without knowing how to code. After generating a few responses, you have to enter some basic syntax or code, but generally, one can do a lot of no-code stuff, which was not an important aspect for us because we were operating in the IT space, and our entire team was capable of entering all the syntaxes that were required. It was not an issue for us at any point in time. In fact, in the operations that we were performing, we only used code. When we were testing out our initial datasets, we used some no-code features that were there, but at the later stage, we used only syntaxes.

We did not connect to the messaging systems, but we connected some enterprise databases. We were operating with a set of hospitals in the US, and we had to connect with them only the first time. Afterward, it was the data that was passing through the pipeline. Initially, for a completely new user, it's a bit tricky. Some technical expertise is required. It's a bit tough, but because the support team is there, one would be able to do it.

Overall, I would rate StreamSets an eight out of ten.


    Telecommunications

StreamSets Data Collector & Transformer Review

  • September 08, 2022
  • Review provided by G2

What do you like best about the product?
Easy to learn and use for complex ETL processes.
What do you dislike about the product?
There are few support resources online other than the official documentation.
What problems is the product solving and how is that benefiting you?
Below are the problems I have solved.
1. Data Collector: Collected data from on-premise sources to the cloud.
2. Applied transformations to prepare data for analytics.


    Information Technology and Services

Pleasantly surprised by its capabilities.

  • August 19, 2022
  • Review provided by G2

What do you like best about the product?
The UI canvas and the ability to choose between different stages, such as processors, origins, and destinations.
What do you dislike about the product?
The lineage/provenance feature needs work. I hate to compare it with Apache NiFi, but this is one feature where NiFi trumps StreamSets.
What problems is the product solving and how is that benefiting you?
We have a lot of different data formats, and transforming them using hand-coded ETL tools or other systems is cumbersome and frustrating. StreamSets does things elegantly and in the least time-consuming manner.


    Tribhuban G.

Review of Working in StreamSets platform

  • August 16, 2022
  • Review provided by G2

What do you like best about the product?
The intuitive canvas for designing all the StreamSets pipelines, coupled with the ease of configuring environment values in StreamSets, is very useful for a Data Architect.
What do you dislike about the product?
The Data Collector has to be designed properly. A few of the components require external JARs. For example, a simple database configuration such as MySQL requires the MySQL connector JAR to be installed in the Data Collector before that Data Collector can be used to read data.
What problems is the product solving and how is that benefiting you?
Problems related to Data Analytics and real-time predictions for various real-life business use cases. This has helped in generating new business ideas and predictions of solutions.


    Paula S.

StreamSets is easy to use and maintain, and has a transparent appearance.

  • August 11, 2022
  • Review provided by G2

What do you like best about the product?
Very easy to follow where data goes, catch up on nodes and prepare a preview.
What do you dislike about the product?
Sometimes it is not clear at first glance how a new user should set up nodes. A site explaining how each node works would be very helpful.
What problems is the product solving and how is that benefiting you?
Changing data formats without using a programming language.