Standardized data pipelines have streamlined ETL workflows but still need clearer logs and UI
What is our primary use case?
My main use case for Apache NiFi is basic ETL pipelines before ingesting data into the data lake. For example, if we want to ingest anything into Snowflake, we first do basic data massaging and transformations through Apache NiFi. This may include consuming a file, converting it from one format to another, breaking it into chunks, and then pushing it to Snowflake.
A specific example is that we consume all caller data through Apache NiFi by connecting to Kafka topics and consuming those messages. We then convert those messages into JSON format and push them to Snowflake by running Snowflake procedures, which ingest the data into Snowflake by reading it from an S3 bucket. We also push the actual JSON messages to the S3 buckets through Apache NiFi itself.
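The transformation steps in that pipeline (consume a message, convert it to JSON, break it into chunks, and stage the batches for S3 and Snowflake) can be sketched outside NiFi roughly as follows. This is a minimal illustration, not our actual flow: the pipe-delimited field layout, field names, and chunk size are all hypothetical assumptions.

```python
import json


def caller_record_to_json(raw: str) -> str:
    """Convert one pipe-delimited caller record into a JSON document.

    The field layout here (caller_id|timestamp|duration) is a
    hypothetical example, not the real caller-data schema.
    """
    caller_id, timestamp, duration = raw.split("|")
    return json.dumps({
        "caller_id": caller_id,
        "timestamp": timestamp,
        "duration_seconds": int(duration),
    })


def chunk(records, size):
    """Break records into fixed-size chunks, mirroring NiFi's
    split-before-staging step (the size is an illustrative choice)."""
    for i in range(0, len(records), size):
        yield records[i:i + size]


# Simulated Kafka messages; in NiFi these arrive via a ConsumeKafka processor.
raw_messages = [
    "123|2024-01-01T10:00:00Z|45",
    "456|2024-01-01T10:01:00Z|90",
]
docs = [caller_record_to_json(m) for m in raw_messages]
# Each batch would be written to the S3 bucket that Snowflake reads from.
batches = list(chunk(docs, 1))
```

In the real flow the same steps are configured as NiFi processors rather than written as code, which is what makes the pipeline easy to standardize and copy.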
What is most valuable?
The best features Apache NiFi offers are its integration capabilities, because we have use cases where we use Apache NiFi to integrate with and consume from multiple sources. It can connect to any API, NAS drive, or database, and it can consume streaming data. We can also connect to Kafka topics and queues through Apache NiFi. First, there is flexibility in consuming from multiple sources. Second, we can easily transfer information from one block to another. Finally, controller services can be maintained separately, so we do not have to repeat connections or create duplicate connections in different processor groups.
Apache NiFi has positively impacted our organization by significantly helping us streamline our processes. Earlier, each team was creating its own ETL pipelines with no standard being followed. Apache NiFi gave us the opportunity to streamline that process: each team can request access to Apache NiFi and be onboarded separately based on its needs, and we have standardized the pipeline design so that no one can deviate from it.

For example, everyone is supposed to use specific variables when enabling the alerting system. We have a separate tool called Moogsoft that every team onboards to using the InvokeHTTP processor of Apache NiFi, sending alerts to a centralized system that can generate incidents. No one is working in silos anymore, and a consistent pattern is followed by everyone.

Furthermore, everyone must configure Apache NiFi so that they can consume from a specific source but eventually push the data to our strategic cloud partner, AWS, first into a shared AWS bucket, from which subsequent processing is done in Snowflake. The standard followed since Apache NiFi was introduced has allowed teams to copy successfully created pipelines, saving a lot of time in our overall software development life cycle by reducing redundancy and effort.
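The standardized alerting pattern above, where every pipeline posts failures to Moogsoft through InvokeHTTP, can be approximated as a small helper. This is a sketch under assumptions: the endpoint URL and payload field names are hypothetical placeholders, since the real values come from internal shared variables.

```python
import json

# Hypothetical endpoint; in NiFi this would be the InvokeHTTP "Remote URL".
MOOGSOFT_ENDPOINT = "https://moogsoft.example.com/events"


def build_alert(pipeline: str, severity: str, message: str) -> dict:
    """Assemble the standardized alert body that InvokeHTTP would POST.

    The field names are assumptions; in NiFi they would be populated
    from the specific variables every team is required to set.
    """
    return {
        "source": "nifi",
        "pipeline": pipeline,
        "severity": severity,
        "description": message,
    }


alert = build_alert("caller-data-ingest", "critical", "PutS3Object failed")
payload = json.dumps(alert)  # InvokeHTTP would send this to MOOGSOFT_ENDPOINT
```

Centralizing the payload format this way is what lets a single incident system receive consistent alerts from every team's pipeline.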
What needs improvement?
Two improvements for Apache NiFi would be helpful. First, better logging: the logs and events do not always make sense when you look at them, so I strongly recommend improved logging. Second, when looking at file states, the history of processed files should be more readable, so that not only the centralized teams managing Apache NiFi but also application folks who are new to the platform can see how a specific document traverses Apache NiFi.
For how long have I used the solution?
I have been using Apache NiFi for almost four and a half years now.
What do I think about the scalability of the solution?
Apache NiFi's scalability is great; we can easily scale it without encountering any challenges.
How are customer service and support?
Customer support for Apache NiFi has been excellent, with minimal response times whenever we raise cases that cannot be directly addressed by logs. The support team has consistently provided great assistance with processor failures and helped us create ad-hoc processors as needed.
How would you rate customer service and support?
Which solution did I use previously and why did I switch?
We did not use any previous solution before Apache NiFi. Apache NiFi has significantly helped us improve our processes and streamline them, while we are also using SnapLogic for integration purposes in parallel.
I am not aware of any other solutions that were explored before onboarding Apache NiFi. Being from the analytics team, we were earlier using Tableau Prep for ETL and transformations and Informatica as well, but I am unsure if anything else was explored in the organization prior to using Apache NiFi.
What was our ROI?
Apache NiFi provides huge relief for all teams with similar use cases for ETL purposes, and it supports not just ETL but also ELT, allowing us to save significant time.
What's my experience with pricing, setup cost, and licensing?
I cannot comment on the pricing, setup cost, and licensing for Apache NiFi, but I can say there was significant time saved across the development life cycle due to reusable pipelines offered by Apache NiFi.
What other advice do I have?
The reason I rate Apache NiFi a seven is that most development folks are afraid to start using Apache NiFi because it is not always intuitive to begin with. For example, in other integration tools such as SnapLogic, you can simply search for a specific processor, but that is not the case with Apache NiFi. You need a basic understanding of which processors exist before you can fetch what you need. A better UI design would allow newcomers to search with relevant keywords, such as API, and retrieve the appropriate processors, which currently does not happen without some prior understanding of Apache NiFi. The other challenge I mentioned is logging, especially processor-related logs, which should be improved to help newcomers navigate effectively.
I think Apache NiFi is a great tool that one should definitely explore. It is essential to perform basic checks regarding requirements, but if someone is looking for ETL and ELT functionality and needs to connect to CSV, JSON, and Excel files as well as databases, they can onboard to Apache NiFi. It offers great connectivity for consuming or pushing data through queues and cloud workloads. Overall, it is an excellent product, especially for basic data massaging and processing before pushing data to Snowflake or creating reports. I rate Apache NiFi a seven out of ten.
Which deployment model are you using for this solution?
On-premises
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Amazon Web Services (AWS)