
Reviews from AWS customer

16 AWS reviews

External reviews

268 reviews

External reviews are not included in the AWS star rating for the product.


4-star reviews

    Benjamin Martin

Custom dashboards and alerts have made server issue detection faster

  • October 20, 2025
  • Review from a verified AWS customer

What is our primary use case?

My main use case for Datadog is monitoring our servers.

A specific example of how I'm using Datadog to monitor my servers is that we track request volume and latency and look for errors.

What is most valuable?

I really enjoy the user interface of Datadog, and it makes it easy to find what I need. In my opinion, the best features Datadog offers are the customizable dashboards and the Watchdog.

The customizable dashboards and Watchdog help me in my daily work because they're easy to find and easy to look at to get the information I need. Datadog has positively impacted my organization by making finding and resolving issues a lot easier and more efficient.

What needs improvement?

I think Datadog can be improved by continually finding errors and making things easy to see and customize.

For how long have I used the solution?

I have been using Datadog for one month.

What do I think about the stability of the solution?

Datadog is stable.

What do I think about the scalability of the solution?

Datadog has been easy to scale: we simply install it on each server that we want to monitor.

How are customer service and support?

I have not had to contact customer support yet, but I've heard they are great.

How would you rate customer service and support?

Neutral

Which solution did I use previously and why did I switch?

We previously used our own custom solution, but Datadog is a lot easier.

What was our ROI?

I'm not sure if I've seen a return on investment.

What's my experience with pricing, setup cost, and licensing?

My experience with pricing, setup cost, and licensing is that it was easy to find and easy to purchase and easy to estimate.

Which other solutions did I evaluate?

I did not make the decision to evaluate other options before choosing Datadog.

What other advice do I have?

I would rate Datadog a nine out of ten.

I give it this rating because I think just catching some of the data delays and latency live could be a little bit better, but overall, I think it's been great.

I would recommend Datadog and say that it's easy to customize and find what you're looking for.

Which deployment model are you using for this solution?

Private Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?


    Dhroov Patel

Has improved incident response with better root cause visibility and supports flexible on-call scheduling

  • October 17, 2025
  • Review from a verified AWS customer

What is our primary use case?

We use Datadog for all of our observability needs and application performance monitoring. We recently transitioned our logs to Datadog. We also use it for incident management and on-call paging. We use Datadog for almost everything monitoring and observability related.

We use Datadog for figuring out the root cause of incidents. One of the more recent use cases was when we encountered a failure where one of our main microservices kept dying and couldn't give a response. Every request to it was getting a 500. We dug into some of the traces and logs, used the Kubernetes Explorer in Datadog, and found out that the application couldn't reach some metric due to its scaling. We were able to figure out the root cause because of the Kubernetes Event Explorer in Datadog. We pushed out a hotfix which restored the application to working condition.

Our incident response team leverages Datadog to page relevant on-calls for whatever service is down that's owned by that team, so they can get the appropriate SMEs and bring the service back up. That's the most common use case for our incident response. All of our teams appreciate using Datadog on-call for incident response because there are numerous notification settings to configure. The on-call schedules are very flexible with overrides and different paging rules, depending on urgency of the matter at stake.

What is most valuable?

As an administrator of Datadog, I really appreciate Fleet Automation. I also value the overall APM page for each service, including the default dashboards on the service page because they provide exactly what you need to see in terms of request errors and duration latency. These two are probably my favorite features because the service page gives a perfect look at everything you'd want to see for a service immediately, and then you can scroll down and see more infrastructure specific metrics. If it's a Java app, you can see JVM metrics. Fleet Automation really helps me as an administrator because I can see exactly what's going on with each of my agents.

My SRE team is responsible for upgrading and maintaining the agents, and with Fleet Automation, we've been able to leverage remote agent upgrades, which is fantastic because we no longer need to deploy to our servers individually, saving us considerable time. We can see all the integration errors on Fleet Automation, which is super helpful for our product teams to figure out why certain metrics aren't showing up when enabling certain integrations. On Fleet Automation, we can see each variant of the Datadog configuration we have on each host, which is very useful as we can try to synchronize all of them to the same version and configuration.

The Kubernetes Explorer in Datadog is particularly valuable. It gives us a look at each live pod YAML and we can see specific metrics related to each pod. I appreciate the ability to add custom Kubernetes objects to the Orchestration Explorer. It makes it easier for our team to see pods without having to use kubectl, because sometimes you have permission errors related to that. Sometimes it's just quicker than using kubectl.

Our teams use Datadog more than they used their old observability tool. They're more production-aware, conscious of how their changes are impacting customers, how the changes they make to their application speed up or slow down their app, and the overall request flow. It's a much more developer-friendly tool than other observability tools.

What needs improvement?

Datadog needs to introduce more hard limits to cost. If we see a huge log spike, administrators should have more control over what happens to save costs. If a service starts logging extensively, I want the ability to automatically direct that log into the cheapest log bucket. This should be the case with many offerings. If we're seeing too much APM, we need to be aware of it and able to stop it rather than having administrators reach out to specific teams.
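To make the requested behavior concrete, here is a minimal sketch of a per-service log quota that diverts overflow into a cheaper index. It is purely illustrative: the function `route_logs`, the index names `main` and `cheap`, and the quota values are all hypothetical, and nothing here is a Datadog feature or API.

```python
from collections import Counter


def route_logs(events, quota_per_service, default_index="main", overflow_index="cheap"):
    """Assign each log event to an index.

    Events beyond a service's quota go to the overflow index.
    Services without a quota entry are never diverted.
    All names here are hypothetical, not Datadog identifiers.
    """
    seen = Counter()
    routed = []
    for service, message in events:
        seen[service] += 1
        limit = quota_per_service.get(service)
        over_quota = limit is not None and seen[service] > limit
        index = overflow_index if over_quota else default_index
        routed.append((index, service, message))
    return routed


# Hypothetical burst: the "api" service logs 5 events against a quota of 3.
events = [("api", f"req {i}") for i in range(5)] + [("worker", "job done")]
routed = route_logs(events, {"api": 3})
per_index = Counter(index for index, _, _ in routed)
print(per_index)  # 4 events stay in "main", 2 overflow to "cheap"
```

In practice a quota like this would be evaluated per time window rather than over a whole list, but the routing decision is the same.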

Datadog has become significantly slower over the last year. They could improve performance at the risk of slowing down feature work. More resources need to go into Fleet Automation because we face many problems with things such as the Ansible role to install Datadog in non-containerized hosts.

We mainly want to see performance improvements, less time spent looking at costs, the ability to trust that costs will stay reasonable, and an easier way to manage our agents. It is such a powerful tool with much potential on the horizon, but cost control, performance, and agent management need improvement. The main issues are with the administrative side rather than the actual application.

For how long have I used the solution?

I have been using Datadog for about a year and nine months.

What do I think about the stability of the solution?

We face a high number of issues with niche, narrowly scoped outages that appear to be quite common. AWS metrics being delayed is something that Datadog posts on its status page. We face a relatively high number of Datadog issues, but they tend to be small and limited in scope.

What do I think about the scalability of the solution?

We have not experienced any scalability issues.

How are customer service and support?

I have interacted with support. Support quality varies significantly. Some support agents are fantastic, but some tickets take months to resolve.

How would you rate customer service and support?

Neutral

Which solution did I use previously and why did I switch?

We used Dynatrace previously, and I believe the switch was due to cost, but that decision was outside my scope as I'm not a decision-maker in that situation.

How was the initial setup?

The initial setup in Kubernetes is not particularly difficult.

What other advice do I have?

I cannot definitively say MTTR has improved as I don't have access to those numbers and don't want to make misleading statements. Developers use it significantly more than our old observability tool. We've seen some cost savings, but we have to be significantly more cost-aware with Datadog than with our previous observability tool because there's more fluctuation and variation in the cost.

One pain point is that it has caused us to spend too much time thinking about the bill. Understand that while it is an administrative hassle, it is very rewarding to developers.

On a scale of 1-10, I rate Datadog an 8 out of 10.

Which deployment model are you using for this solution?

On-premises

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?


    reviewer2767362

Have improved incident response and centralized observability while optimizing resource usage

  • October 16, 2025
  • Review provided by PeerSpot

What is our primary use case?

Our main use case for Datadog includes monitoring and logs, custom metrics, as well as utilizing the APM feature and synthetic tests in our day-to-day operations.

A quick specific example of how Datadog helps with our monitoring and logs comes from all our applications sending logs into Datadog for troubleshooting purposes, with alerts built on top of the logs, and for custom metrics, we send our metrics from the applications via Prometheus to Datadog, building alerts on top of those as well, sometimes sending critical alerts directly to PagerDuty.

We generally have monitors and alerts set up for our applications and specifically rely on them to check our critical business units, such as databases. In GCP, we use Cloud SQL; in AWS, we use RDS; and we also monitor Scylla databases and EC2 instances running Kafka services, which we heavily depend upon. Recently, we migrated from the US1 site to the US5 site, which was a significant shift, requiring us to migrate all alerts and monitors to US5 and validate their functionality in the new site.

What is most valuable?

The best feature Datadog offers is its user-intuitive interface, making it very easy to track logs and custom metrics. We also appreciate the APM feature, which has helped reduce our log volumes and custom metric volumes, allowing us to turn off some custom metrics.

We recently learned how tags contribute to custom metrics volume, which led us to exclude certain tags to further reduce that volume, and we implement log indexing and exclusion filters, leaving us with much to explore and optimize in our use of Datadog as our major observability platform.
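The effect of tags on custom metric volume can be illustrated with a short sketch. It assumes the usual counting model, in which each distinct combination of metric name and tag values is a separate time series; the metric name, tag keys, and sample data below are hypothetical.

```python
def count_series(points, ignore_tags=()):
    """Count distinct (metric name, tag set) combinations.

    Tags listed in ignore_tags are dropped before counting,
    which mimics excluding them from the metric.
    """
    seen = set()
    for name, tags in points:
        kept = tuple(sorted((k, v) for k, v in tags.items() if k not in ignore_tags))
        seen.add((name, kept))
    return len(seen)


# Hypothetical data: one metric reported by 50 pods in each of two regions.
points = [
    ("checkout.latency", {"env": "prod", "region": region, "pod": f"pod-{i}"})
    for region in ("us-east-1", "eu-west-1")
    for i in range(50)
]

with_pod = count_series(points)
without_pod = count_series(points, ignore_tags={"pod"})
print(with_pod, without_pod)  # 100 2
```

Dropping the single high-cardinality `pod` tag collapses 100 series into 2, which is why excluding a few tags can reduce volume so sharply.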

What needs improvement?

Regarding metrics showing our improvements, the MTTR has been reduced by about 40% after integrating Datadog with PagerDuty, and we've seen our costs drop significantly in the most recent renewal after a three-year contract.

Operationally, we spend about 30-40% less time correlating logs and metrics across services, while potential areas for improvement in Datadog include its integration depth and providing more flexible pricing models for large metric and log volumes.

I would suggest having an external Slack channel for urgent requests, which would enable quicker access to support or a dedicated support team for our needs.

I choose eight because, while we have used Datadog for three years and experienced growth in our business and services, the cost has also increased with the growth in metrics and log volumes, and proactive cost management feedback has not been provided to help manage or budget those rising costs. Thus, I'd like to see more proactive cost management in the future, as the pricing model seems to escalate quickly with increasing metrics ingestion and monitoring across clouds. Datadog is a powerful and reliable observability platform, but there is still room for improvement in cost efficiency and usability at scale.

Regarding pricing, setup costs, and licensing, I find Datadog's pricing model transparent but scaling quickly; the base licensing for host integration is straightforward, but costs can rapidly climb as we add custom metrics and log ingestion, especially in dynamic Kubernetes or multi-cloud environments, with the pricing being moderate to high, and while cost visibility is straightforward, it could become challenging with growing workloads. The upfront setup cost is minimal, mainly involving fine-tuning dashboards, tags, and alerts, making licensing very flexible to enable features as needed.

For how long have I used the solution?

I have been working in my current field for roughly around 10 years, starting my AWS journey about 10 years ago, mainly focused on infrastructure and observability.

What do I think about the stability of the solution?

I believe Datadog is stable.

What do I think about the scalability of the solution?

Datadog's scalability is impressive, as it has the necessary integrations, supports agent-based and cloud-native solutions, and accommodates multi-cloud, multi-region features, making overall performance very good.

How are customer service and support?

Customer support has improved recently with online support available through a portal, allowing for quicker access to help.

How would you rate customer service and support?

Neutral

Which solution did I use previously and why did I switch?

Previously, we used Splunk SignalFx for a couple of years, switching to Datadog because of Datadog's user-intuitive interface, which was lacking in SignalFx at the time.

What was our ROI?

Datadog has had a significant positive impact on our organization overall, particularly in visibility, reliability, and cost efficiency, allowing us to centralize metrics, logs, and traces across our cloud, moving from reactive to proactive monitoring, with improvements including faster incident detection and resolution, enhanced service reliability, better cost and resource optimization, and shared dashboards providing the engineering and product teams a single source of truth for system health and performance, thus enhancing our overall observability and operational efficiency.

I believe Datadog has delivered more than its value through reduced downtime, faster recovery, and infrastructure optimization; although we sometimes miss critical alerts, overall, it has improved our team's efficiency by maybe 30% less time spent troubleshooting logs and custom metrics while providing measurable ROI through enhanced system reliability, reduced incident costs, and infrastructure spending optimization.

Which other solutions did I evaluate?

We only evaluated SignalFx before choosing Datadog, as Datadog offered simpler scaling, better management, broader integrations, and dashboards, allowing for easier monitoring of our multi-cloud setup.

What other advice do I have?

After reducing log and custom metric volumes, we saw a significant reduction in costs without any performance issues on our end.

I strongly recommend using Datadog, but suggest being proactive about resource usage and tracking anomalies monthly.

I find the interview process okay, although it runs longer than I expected, exceeding the anticipated 10 minutes.

My rating for Datadog is 8 out of 10.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Google


    reviewer2767335

Has helped monitor performance across services and enabled faster issue investigation with custom dashboards

  • October 16, 2025
  • Review provided by PeerSpot

What is our primary use case?

My main use case for Datadog is monitoring performance of Grainger.com and all the components that are involved within it.

A specific example of how I use Datadog to monitor performance is finding out an issue with an internal bot that we use. We had some issues with some of the commands and we looked into the logs which showed the events from that Slack bot. This was quite useful.

I use Datadog day-to-day to monitor the performance of key services, endpoints, and resources. Currently, we have a migration project for which I created a dashboard to help visualize the performance of key services and endpoints being migrated. At a high level, it helps to capture the performance and health of the services and endpoints.

How has it helped my organization?

Datadog has impacted my organization positively as this is our main observability tool when it comes to monitoring services, traces, and all resources within key services. This is our go-to tool and it has helped us to pinpoint issues. One aspect that needs improvement about Datadog is the Watchdog. If there are any escalated conditions or errors happening, it does not indicate which service is causing the issue or which line of code is responsible unless we recreate Watchdog monitors and add the dependency of the GitHub repo to that service.

When pinpointing issues, it helps us focus on where the problem is. Sometimes it's finding a needle in a haystack, especially when it comes to network issues. This has been our key concern lately. During network outages, we don't know exactly which device has the issue, but network observability is an area we're working towards improving. For regular issues within services, we can see the errors, but we must configure the GitHub repo associated with that service to see the key issue. Overall, it helps us to pinpoint issues. While I'm not certain about the exact timing of resolution, it does help overall.

What is most valuable?

In my opinion, the best features Datadog offers are their APM traces and ability to create dashboards with many customizable metrics, from CPU to thread count to host errors by host and errors by service. Having customized dashboards is really useful, and exploring traces is one of my favorite parts.

We have a list of dashboards primarily showing the key services and APIs related to orders, generating orders, customer direct, and main customer services. Within that list, we have a RUM dashboard as well, which shows us the customer impact and the performance of key services which can directly impact customers. During code red or major escalations, I refer to these dashboards for quick analysis of any issues for the services or endpoints.

What needs improvement?

To make Datadog better, it should be able to pick up error codes automatically. Currently, you have to programmatically configure every single step. Our previous tool, Dynatrace, could pick up error codes without developers having to explicitly code that into the configuration. Sometimes the APM traces are missing the exact error code and error message, which is frustrating.

Some minor improvements could include adjusting unit display on dashboards. When request counts go from 900,000 to 1.5 million or 2.2 million for endpoints, the graph keeps all units in thousands rather than converting to millions, which would be more useful and visually appealing.
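The kind of unit scaling described here can be sketched in a few lines. This is only an illustration of the requested behavior, not how Datadog renders graph axes; the helper name `humanize_count` and its suffixes are assumptions.

```python
def humanize_count(n):
    """Format a count with a K, M, or B suffix and at most one decimal place."""
    for threshold, suffix in ((1_000_000_000, "B"), (1_000_000, "M"), (1_000, "K")):
        if abs(n) >= threshold:
            value = n / threshold
            # Render one decimal, then drop a trailing ".0" so 900.0 becomes 900.
            return f"{value:.1f}".rstrip("0").rstrip(".") + suffix
    return str(int(n)) if float(n).is_integer() else f"{n:.1f}"


labels = [humanize_count(v) for v in (900_000, 1_500_000, 2_200_000)]
print(labels)  # ['900K', '1.5M', '2.2M']
```

Each value picks its own suffix, so an axis that crosses one million switches from thousands to millions instead of showing 1,500K.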

Datadog Watchdog hasn't been as effective as Dynatrace Davis, which pinpoints key errors or latency within a specific service and drills down to the specific endpoint. This is an area where Datadog could improve.

For how long have I used the solution?

We fully migrated to Datadog last year.

What do I think about the stability of the solution?

In my experience, Datadog is stable, though there's typically at least one or two incidents per week. This amounts to approximately four incidents per month that cause disruption. These incidents are related to log service, indexes, and metric capturing issues, which occur in the Datadog platform more frequently compared to other tools we have.

What do I think about the scalability of the solution?

Datadog's scalability for my organization is pretty straightforward. When it comes to installation, we just have to install it on the respective service hosts and configure it. There's a new way of installing these agents, though I haven't worked on it in a while, but the process is straightforward for installing.

How are customer service and support?

The customer support rates eight out of ten. They require all information upfront and there's still back and forth communication happening. Overall, they provide good service.

How would you rate customer service and support?

Positive

Which solution did I use previously and why did I switch?

We switched from Dynatrace to Datadog after conducting a survey amongst team members from various service teams. We found that developers preferred using Datadog over Dynatrace. The user interface was more intuitive, modern, and more cloud-focused. Since everybody was moving to cloud, we determined that Datadog would be a suitable tool for us.

How was the initial setup?

When comparing the setup between Dynatrace and Datadog, Datadog required more time and effort. Dynatrace was more straightforward - you simply install the agent and it picks up all the traffic with minimal configuration needed for capturing specific things. Overall, the setup for Datadog was more challenging compared to Dynatrace setup.

What other advice do I have?

I would rate Datadog overall as eight out of ten.

My advice for others looking into using Datadog is to be ready to spend a lot of time setting it up and make sure you have a good plan in terms of analyzing the finances because it can easily cost a lot of money to install agents on your service hosts.

Which deployment model are you using for this solution?

Hybrid Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Other


    reviewer2767266

Has improved incident response time through centralized log monitoring and infrastructure automation

  • October 16, 2025
  • Review from a verified AWS customer

What is our primary use case?

My main use case for Datadog is for security SIEM, log management, and log archiving.

In my daily work, we send all our logs from different cloud services and SaaS products, including Okta, GCP, AWS, GitHub, as well as virtual machines, containers, and Kubernetes clusters. We send all this data to Datadog, and we have numerous different monitors configured. This allows us to create different security features, such as security monitoring and escalate items to a security team on call to create incident response. Archiving is significant because we can always restore logs from the archive and go back in time to see what happened on that exact day. It is very helpful for us to investigate security incidents and infrastructure incidents as well.

Regarding our main use case, we use the Terraform provider for Datadog, which is probably one of the biggest benefits of using Datadog over any other similar tool because Datadog has great Terraform support. We can create all our security monitoring infrastructure using Terraform. Even if something goes wrong and the Datadog tenant becomes completely compromised or if all our monitors were to get erased for whatever reason, we can always restore all our monitoring setup through Terraform, which provides peace of mind.
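The recovery property described above comes from keeping monitor definitions declaratively in version control. The sketch below illustrates that idea in plain Python: it compares desired definitions against the live state and reports what would need to be recreated or corrected. It is not Terraform and does not call the Datadog API; the monitor names, query strings, and the `plan_restore` helper are hypothetical.

```python
def plan_restore(desired, live):
    """Compare version-controlled monitor definitions with the live ones.

    Returns two sorted lists: monitors missing from the live state
    (to create) and monitors whose live definition has drifted (to update).
    """
    to_create = sorted(name for name in desired if name not in live)
    to_update = sorted(
        name for name in desired if name in live and desired[name] != live[name]
    )
    return to_create, to_update


# Hypothetical definitions kept in version control.
desired = {
    "okta-failed-logins": {"query": "source:okta status:error", "threshold": 10},
    "github-audit-admin": {"query": "source:github action:admin", "threshold": 1},
}
# Hypothetical live state: one monitor was deleted, one had its threshold changed.
live = {
    "okta-failed-logins": {"query": "source:okta status:error", "threshold": 50},
}

to_create, to_update = plan_restore(desired, live)
print(to_create, to_update)  # ['github-audit-admin'] ['okta-failed-logins']
```

If every monitor were erased, `live` would be empty and the plan would simply list every desired monitor for creation.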

What is most valuable?

The best features Datadog offers are not necessarily about having the best individual features, but rather the sheer quantity of different features they offer. I appreciate how you can reuse a query across different indexes for logs or security monitoring. The syntax remains consistent for everything, so you do not have to learn multiple languages. Similarly, for different types of monitors, you can always reuse the same templating language, which makes things much more efficient.

Datadog positively impacted our organization by making us more cautious about how we manage our logs. Before Datadog, we would ingest substantial amounts of data without considering indexing priorities. We became more strategic about what we index, particularly for security and cloud audit logs. We improved our approach to indexing retention and determining which types of logs are important. Overall, we enhanced our internal log management practices.

After implementing Datadog, we observed specific improvements in outcomes and metrics. We started analyzing our logs more thoroughly than before, identifying different patterns, and determining log importance levels. We began looking for more signals from audit logs and distinguishing between critical and non-critical information. The most significant metric improvement has been reduced incident investigation time.

What needs improvement?

Datadog can be improved by addressing billing and spend calculation methods, as it would be better if these were more straightforward. Currently, these calculations can be complex. Additionally, while we use Terraform extensively, not everything is available in Terraform. It would be beneficial to have more features supported in Terraform, particularly some security features that have been available for a while but still lack Terraform support.

For how long have I used the solution?

I have been using Datadog for about four years.

What do I think about the stability of the solution?

Datadog is very stable.

What do I think about the scalability of the solution?

Datadog's scalability is excellent. We have never encountered any issues.

How are customer service and support?

The customer support is good. I have never had any issues.

I would rate the customer support as nine out of ten.

How would you rate customer service and support?

Positive

Which solution did I use previously and why did I switch?

We previously used New Relic and switched because it was not very effective.

How was the initial setup?

My experience with pricing, setup cost, and licensing indicates that it was somewhat expensive.

What was our ROI?

I have seen a return on investment with Datadog, particularly in time saved responding to incidents. Regarding staffing requirements, that metric isn't applicable for our use case since log management and security monitoring inherently require personnel to respond. However, it has definitely improved our efficiency in terms of response time, though this isn't a hard metric but rather based on experience.

Which other solutions did I evaluate?

I do not remember evaluating other options before choosing Datadog as it was a long time ago.

What other advice do I have?

I would rate Datadog an eight out of ten because while it is expensive, it offers numerous features, though sometimes it attempts to do too much.

My advice to others considering Datadog is to explore other products and calculate potential spending carefully. If Terraform support is important to your organization, then Datadog is an excellent choice. However, keep in mind that costs will increase significantly as you scale, and different features have varying pricing structures.

Overall rating: 8/10

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?


    Thomas Harrison

Has enabled our teams to detect application errors faster and shift company mindset toward proactive monitoring

  • October 16, 2025
  • Review from a verified AWS customer

What is our primary use case?

My main use case for Datadog is application monitoring.

Specifically for application monitoring, we monitor our production Laravel instances using APM spans and tracing.

In addition to application monitoring, I also use Datadog to monitor our log management for our applications, both on-prem and in the cloud, using the AWS integration.

What is most valuable?

In my experience, the best features that Datadog offers us include unprecedented visibility and the ability to dive deep on application debugging.

Datadog's visibility and debugging features help me day-to-day; specifically, we had an application that was throwing a bunch of errors causing an issue in our production database. Using Datadog, we were able to immediately isolate the error and plan around it.

Datadog has positively impacted my organization. I think it has given us not only the specific debug and error codes that we're looking for, but it has changed the entire company's mindset in how to extract value from data that's been lying around in our internal systems for years now and given everybody a new perspective on monitoring and debugging.

Since adopting Datadog, I've noticed specific outcomes. We've begun to handle our log management internally in a more efficient manner, so we've actually reduced our disk space usage as well as simplified our backup procedures and process chains using Datadog. Now that we have extracted the value from the logs and the traces and the debug logs, we no longer have to rely so much on traditional text-based logs or even digging into the code and the error files themselves.

What needs improvement?

The only improvement I would like to see with Datadog is that the graphical user interface sometimes takes a little bit to load, especially when diving deep on a subject, and just a little bit more caching would help.

The largest pain point we've had with Datadog to this point was onboarding. This was partly our fault because our logs weren't really set up to be used in a modern observability platform like Datadog, but I definitely would have liked to have seen more comprehensive onboarding. We had a few appointments, but the more help we get up front, the easier it is for us to get more familiar and do more things with Datadog.

At this time, I do not think there are any other improvements Datadog needs that would make my experience even better.

For how long have I used the solution?

I have been using Datadog for approximately four months now.

What do I think about the stability of the solution?

Datadog is very stable.

What do I think about the scalability of the solution?

We have not yet hit the use case to evaluate Datadog's scalability, but based off of everything else we've used with the infrastructure, I don't think there are going to be any issues with it. We did, as a trial, engage the AWS integration, and immediately it found all of our AWS resources and presented them to us. In fact, it was talking about costing and billing which we had not anticipated, but we were pleasantly surprised with.

How are customer service and support?

Customer support is excellent; I have opened and closed probably five tickets within the past seven days. The support techs are knowledgeable and very responsive.

I would rate customer support an eight out of ten. The only issues that we had were really needing more educational resources to begin with to truly understand the specifics of log management and APM tracing setup, simply because those are very complicated procedures. Walking through that a couple more times with the support engineer probably would have been helpful. It was not a deal breaker or a significant pain point, but the quicker and deeper we get with Datadog, the happier people seem to be at our organization.

Overall, the entire Datadog experience of support, onboarding, getting everything in there, and having a good line of feedback has been exceptional. I've been in the industry over 20 years, and part of my role has always been customer-facing. I find that Datadog's client support is very engaging, comprehensive, and thorough.

How would you rate customer service and support?

Positive

Which solution did I use previously and why did I switch?

For on-prem infrastructure monitoring, we're currently using Nagios, but that's beginning to fade as we rely more on Datadog for our infrastructure monitoring. We had used New Relic for application performance monitoring, but because of the cost associated with that and not seeing the value from it, we stopped using that about two years ago.

How was the initial setup?

We did not purchase Datadog through the AWS Marketplace; we were contacted independently by a Datadog sales agent.

My experience with pricing, setup cost, and licensing has been fairly positive overall. With the on-demand/reserved pricing, we were not as cognizant of how big the on-demand costs could get, especially while we were getting everything set up, but Datadog proactively took a strong hand in guiding us to get our costs under control. I'm proud to say that we are within 1% of our projected cost budget, which has happened in the last month. Working with Datadog to control cost has been very efficient and very effective.

What was our ROI?

In terms of time saved, I've noticed that when we're responding to potential errors or during our software deployments, it's saving us minutes at a time that quickly add up to hours, that quickly add up to days in terms of retrieving debug and application error information.

Which other solutions did I evaluate?

Before choosing Datadog, we evaluated other options including New Relic and SolarWinds.

What other advice do I have?

I would advise others looking into using Datadog to evaluate it against other competing products and applications in the space, and really dig in. You will find that Datadog does what it's supposed to do very quickly and very efficiently, and does it more cost-competitively than some of the other offerings.

Datadog is deployed in my organization in both on-prem and public cloud scenarios.

On a scale of one to ten, I rate Datadog a nine overall.

Which deployment model are you using for this solution?

Hybrid Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Amazon Web Services (AWS)


    Daniel Dolan

User sessions have been monitored effectively and beta user frustration points are now identified through behavioral insights

  • October 16, 2025
  • Review from a verified AWS customer

What is our primary use case?

I think the most important feature for me in Datadog is the RUM features.

I check the efficiency of the applications that I'm supporting in Datadog and also use it to view the sessions of users.

I have some trouble doing troubleshooting in our app currently, but RUM is my main use case in Datadog.

What is most valuable?

The personalized dashboards and alerting in Datadog stand out to me, so that way you can gear your use of the product towards what's important to you.

Datadog has allowed us to look at how our beta testers are using our new UIs and see where their frustration points are, which has been important to us.

We've been using the heat map feature in Datadog to measure those frustration points.

What needs improvement?

Some templates for certain roles and things that users care about could be auto-suggested for a dashboard or alerting in Datadog.

We had limitations around RUM and our feature flag provider in Datadog because our Next.js application resolves feature flags primarily on the back end. We had trouble hooking up our feature flags because RUM is client-side only. Next.js spans both the front end and the back end, so it would be beneficial to be able to send the feature flag resolution from the back end when needed. Our feature flag provider is GrowthBook, and the way we would have had to get those feature flags into Datadog was time-consuming and required a lot of boilerplate. We would have had to mimic feature flag resolution on the client side, so we decided to forgo that.

For how long have I used the solution?

We have been using Datadog for about two or three months.

What do I think about the stability of the solution?

Datadog seems stable in my experience without any downtime or reliability issues.

What do I think about the scalability of the solution?

Datadog is scalable and I don't think we'll have problems with scalability in terms of our use case. We might face limitations with logs, but I feel we would not be reaching any of Datadog's limits.

How are customer service and support?

The customer support has been one of the best parts of Datadog.

I would rate the customer support from Datadog a 10 on a scale of 1 to 10.

I would suggest staying in close contact with your customer support representative to get the most out of Datadog.

How would you rate customer service and support?

Positive

Which solution did I use previously and why did I switch?

We did not have a different solution before Datadog.

How was the initial setup?

Setup with Datadog was pretty easy.

What was our ROI?

It is too early to tell if we've seen a return on investment so far with Datadog.

What's my experience with pricing, setup cost, and licensing?

I'm not clear on pricing, but it's not a huge concern for us at the moment in terms of RUM. For the other pieces, I know that there may be some pricing that they've been looking at for APM and logs.

Which other solutions did I evaluate?

I did not evaluate other options before choosing Datadog.

What other advice do I have?

I personally don't use the personalized dashboards and alerting, but I've seen some nice use cases from others on my team. On a scale of 1-10, I rate Datadog an 8.

Which deployment model are you using for this solution?

Hybrid Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?


    Mason Wheeler

Has improved alerting speed and enabled better proactive monitoring across cloud applications

  • October 16, 2025
  • Review provided by PeerSpot

What is our primary use case?

My main use case for Datadog is application monitoring and alerting.

A specific example of how I use Datadog for application monitoring and alerting is monitoring for storage filling up.

I also monitor services to ensure that they're running when they should be, and then I schedule downtimes for whenever they shouldn't be.

What is most valuable?

In my experience, the best features Datadog offers are integrations with ServiceNow and PagerDuty and the large variety of other third-party integrations.

The integrations with ServiceNow and PagerDuty have helped my workflow because whenever there's an issue, we can get notified quickly, and whoever is on call, if it's after hours, can be notified that there's an issue going on.

Dashboards are nice for quick and easy access to important and useful information, and logs are a great place to review information quickly and easily without connecting to the application directly.

Datadog has positively impacted my organization by allowing for a more proactive response to issues whenever they occur.

Being more proactive has helped by reducing downtime and shortening the time from response to resolution. It has helped us limit business impact whenever issues arise.

What needs improvement?

I believe Datadog could be improved because sometimes it's not the most user-friendly. For example, when a monitor includes a metric or a service that no longer needs to be monitored, it remains in the system. It could be user error, but it would be nice to be able to remove a specific service or part of a monitor from being monitored once no data is being collected for it anymore.

The documentation is sometimes a little misleading or confusing, and there are multiple versions available, so more up-to-date documentation that clearly states which version it applies to would be good.

For how long have I used the solution?

I have been using Datadog for two to two and a half years.

What do I think about the stability of the solution?

Datadog is stable.

What do I think about the scalability of the solution?

Datadog has been pretty scalable for what we've done in our organization.

How are customer service and support?

The customer support is very good; it's easy to get support on pretty much any question that we have, including being able to chat in.

How would you rate customer service and support?

Positive

Which solution did I use previously and why did I switch?

We previously used LogicMonitor, and I was not involved in the discussions on why we switched.

How was the initial setup?

It's a pretty steep learning curve to start using Datadog; it takes time to really configure everything.

What was our ROI?

I would say we have seen a return on investment, but I don't have any relevant metrics.

What's my experience with pricing, setup cost, and licensing?

My experience with pricing, setup cost, and licensing is that it was good; I wasn't too involved with it, but as far as I know, it was smooth.

Which other solutions did I evaluate?

Before choosing Datadog, we did evaluate other options, but I'm not sure what those options were.

What other advice do I have?

On a scale of 1-10, I rate Datadog an 8.

Which deployment model are you using for this solution?

On-premises

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?


    reviewer2767302

Collaboration across metrics has improved troubleshooting while high logging costs remain a concern

  • October 16, 2025
  • Review provided by PeerSpot

What is our primary use case?

My main use case for Datadog is monitoring and collecting metrics. I use it to collect Kubernetes pod CPU and memory usage metrics, and also for logging across basically all our middleware platforms.

What is most valuable?

The best feature Datadog offers is the ability to correlate different kinds of data such as logs, metrics, and APM, which helps me pinpoint problems when I'm troubleshooting issues. The dashboard is very useful; I can use it to see at a glance how the system is performing, and alerting is what I'm using right now to send notifications to either email or PagerDuty.

Datadog has positively impacted my organization by shortening our time to resolve incidents because it's a central place for getting all the data that we need for troubleshooting.

What needs improvement?

I think Datadog can be improved by adding anomaly detection; that would be nice. The user interface is okay, but cost is sometimes an issue: for logging, I actually had to trim down my logs because the cost was too high.

For how long have I used the solution?

I have been using Datadog for several years.

What do I think about the stability of the solution?

Datadog is stable.

What do I think about the scalability of the solution?

Datadog's scalability is quite good since it's a SaaS solution, and there are no scalability issues for me. I simply install an agent for whatever new component, server, or host I want to monitor, and then I'm good.

How are customer service and support?

The customer support is hit and miss. Sometimes they respond fairly quickly, but it depends on the person, and it may take a couple of communications for them to actually understand what I need.

How would you rate customer service and support?

Positive

Which solution did I use previously and why did I switch?

I previously used some open-source solutions from other vendors before Datadog. The switch was made to get a better observability stack.

What's my experience with pricing, setup cost, and licensing?

My experience with pricing, setup cost, and licensing is that the pricing is based on usage. When we adopt more, we get more, but because we want to improve adoption across the entire studio, cost then becomes a main issue.

Which other solutions did I evaluate?

Before choosing Datadog, I evaluated other options, including Dynatrace, which was approximately 10 years ago.

What other advice do I have?

My advice to others looking into using Datadog is that if cost is not a concern, I would recommend that they use it. However, if they are sensitive to or concerned about how much money they want to spend, then maybe Datadog is not the solution for them.

I would rate Datadog overall as eight out of ten, though I find it too costly.

Which deployment model are you using for this solution?

On-premises


    Prakash Pandey

Has improved monitoring accuracy and enabled faster issue resolution through detailed alerting and transaction visibility

  • October 16, 2025
  • Review from a verified AWS customer

What is our primary use case?

Our main use case for Datadog is that we heavily rely on it for our infrastructure monitoring and application monitoring, including some of the browser-based application monitoring, which is RUM.

A specific example of how we use Datadog for monitoring is that we monitor our infrastructure's CPU and memory utilization. When we saw slowness, we found that CPU utilization was near the threshold, around 90-95%, which helped us troubleshoot the issue, trace it to an underlying SQL problem, and resolve it.
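The CPU check described above could be expressed as a threshold alert. The sketch below only builds a monitor definition as a plain Python dictionary, in the general shape of a Datadog metric monitor; it does not call any API. The helper name, the `system.cpu.user` metric, the 5-minute window, and the `@ops-team-example` notification handle are assumptions for illustration, and the 90% and 95% levels are taken from the range mentioned in the review.

```python
import json


def build_cpu_monitor(warning: float = 90.0, critical: float = 95.0) -> dict:
    """Build a metric-alert monitor definition (hypothetical helper)."""
    if not 0 < warning < critical <= 100:
        raise ValueError("expected 0 < warning < critical <= 100")
    # The value compared in the query should match the critical threshold below.
    # Assumption: user CPU averaged over 5 minutes, grouped per host.
    query = f"avg(last_5m):avg:system.cpu.user{{*}} by {{host}} > {critical:g}"
    return {
        "name": "High CPU utilization",
        "type": "metric alert",
        "query": query,
        # "@ops-team-example" is a made-up notification handle.
        "message": "CPU is near its limit on {{host.name}}. @ops-team-example",
        "options": {"thresholds": {"warning": warning, "critical": critical}},
    }


monitor = build_cpu_monitor()
print(json.dumps(monitor, indent=2))
```

A definition like this would normally be reviewed against the current monitor documentation before being submitted, since the right metric and window depend on the hosts being watched.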

In addition to our main use case, we also use RUM monitoring and synthetic monitoring, which really help us to look at our end-user sessions and proactively solve any slowness or errors spiking up.

What is most valuable?

The best feature that Datadog offers is infrastructure monitoring, where it can look at CPU utilization, per-process utilization, and all the processes that are running, and alert us in advance if things are going beyond the normal threshold.

I think everything about the features of Datadog is amazing. Datadog provides details up to the transactions. We can look at the transaction log too for the application, which is really helpful.

Datadog has impacted our organization positively since we were previously using AppDynamics and then we switched to Datadog. It has improved a lot in our alerting and monitoring in the infrastructure space and application space. We can monitor business transactions and take proactive action. It is really great to take actions on the issues before an end user reports it, which is a great advantage for us.

What needs improvement?

The world is moving toward artificial intelligence, so maybe we can have an inbuilt AI agent within Datadog, or maybe it exists and I have not used it.

The AI aspect would be great where we would not need to go and look at different transactions or different modules of Datadog, as AI can actually provide the data to us on Datadog UI. If we need more details, it could have a link to go to that specific module to look at more details for the application and infrastructure monitoring and alerts.

For how long have I used the solution?

I have been using Datadog for three years now.

What do I think about the stability of the solution?

Datadog is stable for our organization, and we have not seen any downtime or issues so far.

What do I think about the scalability of the solution?

Datadog's scalability has been great, as it has been able to grow with our needs. We are able to use different modules as we need them, and we have never needed to scale anything else. We have limited our transition recording to 45 days, which helps and matches our needs. It is really helpful and nothing additional is needed.

How are customer service and support?

We reached out to Datadog only once, to find out our AMI images, which we needed for our infrastructure-as-code component, and it was a great experience. We got the required information and that helped us.

How would you rate customer service and support?

Neutral

Which solution did I use previously and why did I switch?

Before Datadog, we used OpsRamp and AppDynamics. We retired both tools and moved to Datadog as part of our enterprise approach of consolidating overall monitoring on Datadog.

How was the initial setup?

I gave Datadog a nine out of ten because it is amazing. All the features and functionalities are amazing. The ease of implementation was a bit difficult for us for the database servers where we have different kinds of databases. We needed different kinds of agents to be installed, and that was a bit tricky for us. I think it is not on Datadog but it is about our complex infrastructure where we have a different set of infrastructure in place, so that created a bit of trouble during the implementation.

What was our ROI?

Since using Datadog, we have seen a return on investment with a lot of savings around infrastructure monitoring, and also on the people needed to monitor overall application and infrastructure on both sides. Previously we had thirteen contractors doing the monitoring for us, which is now reduced to only five. That is a huge saving.
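As a quick check on the size of that staffing reduction, the short calculation below uses only the two headcounts stated above. No cost figures are given in the review, so none are assumed.

```python
# Headcounts as stated in the review.
contractors_before = 13
contractors_after = 5

reduction = contractors_before - contractors_after
reduction_pct = 100 * reduction / contractors_before

print(f"{reduction} fewer contractors ({reduction_pct:.1f}% reduction)")
# prints "8 fewer contractors (61.5% reduction)"
```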

Which other solutions did I evaluate?

We did not evaluate other options before choosing Datadog; we went with Datadog directly.

What other advice do I have?

My advice for others looking into using Datadog is to keep exploring the tool and make use of the different modules, features, and functionality Datadog offers. There are multiple features available with the Datadog agents that are really helpful and powerful for troubleshooting, alerting on, and monitoring both applications and infrastructure.

So far, all the features I have used in Datadog are amazing. It captures all the logging information I have, and I can include links to Datadog transactions in my Splunk logs. It is integrated with Splunk and other platforms, which is great.

On a scale of one to ten, I rate Datadog a nine.

Which deployment model are you using for this solution?

Hybrid Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Other