
Datadog Pro

Datadog | 1

Reviews from AWS customer

9 AWS reviews

External reviews

678 reviews

External reviews are not included in the AWS star rating for the product.


    David L.

Amazing UX

  • November 08, 2024
  • Review provided by G2

What do you like best about the product?
The user experience of Datadog is amazing. The granularity and flexibility of filters and charts make it easy to quickly sift through large amounts of data and find what you need. This is especially useful when trying to quickly troubleshoot an incident.
What do you dislike about the product?
One annoying thing about Datadog is working with IaC (infrastructure as code) configurations that are defined in Terraform.

It's tough to make changes to your configurations because you need to deploy them in order to see if you did it correctly. It's also not possible to lock down resources defined with IaC, so sometimes people will edit something via the web UI that was configured in Terraform, and then their changes will get reverted when Terraform re-deploys and they won't know why.
What problems is the product solving and how is that benefiting you?
System observability and incident response.


    Bikash s.

Monitoring Tool for Cloud Infra

  • November 07, 2024
  • Review provided by G2

What do you like best about the product?
What I like best about Datadog is its comprehensive monitoring capabilities across multiple environments and services, especially in cloud-based infrastructures. The platform makes it easy to monitor everything in one place, from application performance and infrastructure health to logs and security. The real-time dashboards are highly customizable, allowing us to drill down into specific metrics and get a clear overview of our entire stack.
What do you dislike about the product?
As our usage grows and we monitor more hosts and services, processing costs also increase. There is a learning curve involved in creating custom queries within the log management interface.
What problems is the product solving and how is that benefiting you?
Datadog helps us solve a range of monitoring and observability challenges across our infrastructure, particularly for our EKS clusters, AWS services, and application performance. By centralizing all these metrics in one place, it’s much easier to get a clear, real-time view of what’s happening across our system.


    Sanket G.

One Stop Solution (360 degree Monitoring)

  • November 06, 2024
  • Review provided by G2

What do you like best about the product?
Ecosystem and integration!

Effective communication between different Datadog products, such as APM and Infrastructure, shows detailed traces that I can drill down into to pinpoint the issue.
What do you dislike about the product?
The Datadog security lineup needs more improvements and features. Centralized SCA and SAST are lacking. Additionally, easier integration with MS Teams and other third-party software is necessary.

Flutter is not yet fully supported in Real User Monitoring.
What problems is the product solving and how is that benefiting you?
1. At a glance, I can see the status of my production environment.
2. If there is an issue in the middle of the night, Watchdog gives me the exact context on where the failure is and which microservices were involved, along with CPU and RAM usage for the given period.
3. If a mobile application crashes, it is reported directly and an error is created in Error Tracking, which I can assign to a team.
4. If there is a vulnerability in production, it scans for and reports it.
5. If someone is trying to hijack the system, I can block those IPs.
6. Datadog detects unusual activity and reports it.
7. We have configured reports that are sent daily at 9 AM with the previous day's statistics for review, so we can be proactive if there are any issues.


    Sarah Jean

Very useful custom metrics

  • October 10, 2024
  • Review from a verified AWS customer

Fair point about the OpenMetrics integration. All metrics collected by that specific "integration" are considered custom. There are many ways to collect custom metrics: through an OpenMetrics endpoint, from logs, from traces, with a custom check, or by submitting metrics directly to the Agent.
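
To illustrate the last option mentioned above (submitting metrics directly to the Agent), here is a minimal sketch that builds a DogStatsD datagram by hand and sends it over UDP. It assumes an Agent with DogStatsD listening on the default local port 8125. The metric name `checkout.queue_depth` and the tag `env:staging` are hypothetical and not taken from the review; in practice an official client library would normally be used instead.

```python
import socket


def format_dogstatsd(name, value, metric_type="g", tags=None):
    """Build a DogStatsD datagram of the form <name>:<value>|<type>|#<tag1>,<tag2>."""
    payload = f"{name}:{value}|{metric_type}"
    if tags:
        payload += "|#" + ",".join(tags)
    return payload


def send_metric(payload, host="127.0.0.1", port=8125):
    """Send one datagram over UDP (fire-and-forget, no delivery guarantee)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload.encode("utf-8"), (host, port))


if __name__ == "__main__":
    # Hypothetical gauge metric; host and port are assumed Agent defaults.
    datagram = format_dogstatsd("checkout.queue_depth", 42, "g", ["env:staging"])
    send_metric(datagram)
```

Anything submitted this way counts as a custom metric, the same as metrics collected through the OpenMetrics integration.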


    JOSEPH ROBERT POMPA

Very useful Network Hosts

  • October 06, 2024
  • Review from a verified AWS customer

The user interface is intuitive, making it easy to manage domains, emails, and databases. The dashboard is well-organized, which is a plus for beginners who might feel overwhelmed by technical details.


    reviewer092526

Debugs slow performance with good support and a straightforward setup

  • October 01, 2024
  • Review provided by PeerSpot

What is our primary use case?

We use Datadog for monitoring the performance of our infrastructure across multiple types of hosts in multiple environments. We also use APM to monitor our applications in production. 

We have some Kubernetes clusters and multi-cloud hosts with Datadog agents installed. We have recently added RUM to monitor our application from the user side, including session replays, and are hoping to use those to replace existing monitoring for errors and session replay for debugging issues in the application.

How has it helped my organization?

We have been using Datadog since I started working at the company ten years ago and it has been used for many reasons over the years. Datadog across our services has helped debug slow performance on specific parts of our application, which, in turn, allows us to provide a snappier and more performant application for our customers. 

The monitoring and alerting system has made our team aware of issues that come up in our production system and lets us react faster, with more tools to debug and inspect problems and keep the system online for our customers.

What is most valuable?

Datadog infrastructure monitoring has helped us identify health issues with our virtual machines, such as high load, CPU, and disk usage, as well as monitoring uptime and alerting when Kubernetes containers have a bad time staying up. Our use of Datadog's application performance monitoring (APM) over the last six years or so has been crucial to identifying performance and bottleneck issues, as well as alerting us when services are seeing high error rates, which has made it easier to debug when specific services may be going down.

What needs improvement?

We have found that some of the different options for filtering for logs ingestion, APM traces and span ingestion, and RUM sessions vs replay settings can be hard to discover and tough to determine how to adjust and tweak for both optimal performance and monitoring as well as for billing within the console. 

It can sometimes be difficult to determine which information is documented, as we have found inconsistencies with deprecated information, such as environment variables within the documentation.

For how long have I used the solution?

I've been using the solution for ten years.

What do I think about the stability of the solution?

The solution seems pretty stable, as we've been using it for more than a decade.

What do I think about the scalability of the solution?

The solution seems quite scalable, especially within Kubernetes. Costs are a factor.

How are customer service and support?

Support has been very helpful whenever we need it.

How would you rate customer service and support?

Positive

Which solution did I use previously and why did I switch?

We had tried some other APM monitoring in the past, however, it was too expensive, and then we added it to Datadog since we were already using Datadog and it seemed like a good value add.

How was the initial setup?

The solution is straightforward to set up. Sometimes, it is complex to find the correct documentation.

What about the implementation team?

We handled the setup in-house.

What was our ROI?

Our ROI is peace of mind with alerts and monitoring, as well as the ability to review and debug issues for our customers.

What's my experience with pricing, setup cost, and licensing?

Getting settled on pricing is something you want to keep an eye on, as things seem to change regularly.

Which other solutions did I evaluate?

We used New Relic previously.

What other advice do I have?

Datadog is a great service that is continually growing its solution for monitoring and security. It is easy to set up and turn on and off its features once you have instrumented agents and tailored solutions to your needs.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Other


    reviewer2507895

Good RUM and APM with good observability

  • September 30, 2024
  • Review provided by PeerSpot

What is our primary use case?

We use Datadog across the enterprise for observability of infrastructure, APM, RUM, SLO management, alert management and monitoring, and other features. We're also planning on using the upcoming cloud cost management features and product analytics.

For infrastructure, we integrate with our Kube systems to show all hosts and their data.

For APM, we use it with all of our API and worker services, as well as cronjobs and other Kube deployments.

We use serverless to monitor our Cloud Functions.

We use RUM for all of our user interfaces, including web and mobile.

How has it helped my organization?

It's given us the observability we need to see what's happening in our systems, end to end. We get full stack visibility from APM and RUM, through to logging and infrastructure/host visibility. It's also becoming the basis of our incident management process in conjunction with PagerDuty.

APM is probably the most prominent place where it has helped us. APM gives us detailed data on service performance, including latency and request count. This drives all of the work that we do on SLOs and SLAs.

RUM is also prominent and is becoming the basis of our product team's vision of how our software is actually used.

What is most valuable?

APM is a fundamental part of our service management, both for viewing problems and improving latency and uptime. The latency views drive our SLOs and help us identify problems.

We also use APM and metrics to view the status of our Pub/Sub topics and queues, especially when dealing with undelivered messages.

RUM has been critical in identifying what our users are actually doing, and we'll be using the new product analytics tools to research and drive new feature development.

All of this feeds into the PagerDuty integration, which we use to drive our incident management process.

What needs improvement?

Sometimes the solution changes features so quickly that the UI keeps moving around. The cost is pretty high. Outside of that, we've been relatively happy.

The APM service catalog is evolving fast. That said, it is redundant with our other tools and doesn't allow us to manage software maturity. However, we do link it with our other tools using the APIs, so that's helpful.

Product analytics is relatively new and based on RUM, so it will be interesting to see how it evolves.

Sometimes some of the graphs take a while to load, based on the window of data.

Some stock dashboards don't allow customization. You need to clone them first, but this can lead to an abundance of dashboards. Also, there are some things that stock dashboards do that can't yet be duplicated with custom dashboards, especially around widget organization.

The "top users" widget on the product analytics page only groups by user email, which is unfortunate, since user ID is the field we use to identify our users.

For how long have I used the solution?

I've used the solution for three and a half years.

What do I think about the stability of the solution?

The solution is pretty stable.

What do I think about the scalability of the solution?

The solution is very scalable.

How are customer service and support?

Support was excellent during the sales process, with a huge dropoff after we purchased the product. Only recently (within the past year) has it begun to reach acceptable levels again.

How would you rate customer service and support?

Neutral

Which solution did I use previously and why did I switch?

We did not have a global solution. Some teams were using New Relic.

How was the initial setup?

The instructions aren't always clear, especially when dealing with multiple products across multiple languages. The tracer works very differently from one language to another.

What about the implementation team?

We handled the setup in-house.

What's my experience with pricing, setup cost, and licensing?

We have built our own set of installation instructions for our teams, to ensure consistent tagging and APM setup.

Which other solutions did I evaluate?

We did look at Dynatrace.

What other advice do I have?

The service was great during the initial testing phase, but once we bought the product, the quality of service dropped significantly. In the past year or so, it has improved and is now approaching the level we'd expect based on the cost.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Google


    reviewer08624379

Great documentation and learning platform with good built-in integrations

  • September 26, 2024
  • Review provided by PeerSpot

What is our primary use case?

We were looking for an all-in-one observability platform that could handle a number of different environments and products. At a basic level, we have a variety of on-premises servers (Windows/Mac/Linux) as well as a number of commercial, cloud-hosted products. 

While it's often possible to let each team rely on its own means for monitoring, we wanted something that the entire company could rally around - a unified platform that is developed and supported by the very same people, not others just slapping their name on some open source products they have no control over.

How has it helped my organization?

Datadog has effortlessly dropped into nearly every stage of observability for us. We appreciate its robust cross-platform support for our IT assets. For hosted products, enabling integrations often couldn't be easier, and many of them include native dashboards and even other types of content packs.

Over the last couple of years, we have onboarded a number of engineering teams, and each of them feels comfortable using Datadog. This gives us the ability to build organizational knowledge.

What is most valuable?

Datadog's learning platform is second to none. It's the gold standard of training resources in my mind; not only are these self-paced courses available at no charge, but you can spin up an actual Datadog environment to try out its various features. 

I just hate when other vendors try to upsell you on training beyond their (often poorly-written) documentation. Apart from that, we appreciate the variety of content that comes from Datadog's built-in integrations - for common sources, we don't have to worry about parsing, creating dashboards, or otherwise reinventing the wheel.

What needs improvement?

Datadog's roadmap can be a bit unpredictable at times. For instance, a few years ago, our rep at the time stated that Datadog had dropped its plans to develop an incident on-call platform. However, this year, they released a platform that does exactly that.

They also decided to drop chat-based support just recently. While I understand that it's often easier to work with support tickets, I do miss the easy availability of live support. 

It would be nice if Datadog continued to broaden its variety of available integrations to include even more commercial platforms because that is central to its appeal. If we're looking at a new product and there isn't a native integration, then that's more work on our part.


    reviewer0962486

Good alerts and detailed data but needs UI improvements

  • September 23, 2024
  • Review provided by PeerSpot

What is our primary use case?

I work in product design, and although we use Datadog for monitoring, etc, my use case is different as I mostly review and watch session recordings from users to gain insight into user feedback.

We watch multiple sessions per week to understand how users are using our product. From this data, we are able to hone in on specific problems that come up during the sessions. We then reach out to specific users to follow up with them via moderated testing sessions, which is very valuable for us.

How has it helped my organization?

Using Datadog has allowed us to review detailed interactions of users at a scale that leads us to make informed data-driven UX improvements as mentioned above.

Being able to pinpoint specific users via filtering is also very useful as it means when we have direct feedback from a specific user, we can follow up by watching their session back. 

The engineering team's use case for Datadog is alerting, which is also very useful for us as it gives us visibility into how stable our platform is through various lenses.

What is most valuable?

Session recordings have been the most valuable to me as they help me gain insights into user behaviour at scale. By capturing real-time interactions, such as clicks, scrolls, and navigation paths, we can identify patterns and trends across a large user base. This helps us pinpoint usability issues and improve the overall experience for our users. Analyzing these recordings enables us to make data-driven decisions that enhance both functionality and user satisfaction.

What needs improvement?

I'd like the ability to see more in-depth actions in user sessions, such as where specific problems occur. Rather than having to watch numerous session recordings to understand where this happens, I'd like to get alerts/notifications about specific areas that users are struggling with, such as rage clicks.

In terms of UI, everything is very small, which makes it quite difficult to navigate at times, especially in terms of accessibility, so I'd love for there to be more attention on this.

For how long have I used the solution?

I've used the solution for over one year.

Which solution did I use previously and why did I switch?

We did not evaluate other options. 

What's my experience with pricing, setup cost, and licensing?

I wasn't part of the decision-making process during licensing.

Which other solutions did I evaluate?

I wasn't part of the decision-making process during the evaluation stage.


    reviewer9637683

Great for logging and tracing but needs better customization

  • September 23, 2024
  • Review provided by PeerSpot

What is our primary use case?

We're using the product for logging and monitoring of various services in production environments. 

It excels at providing real-time observability across a wide range of metrics, logs, and traces, making it ideal for DevOps teams and enterprises managing complex environments. 

The platform integrates seamlessly with our cloud services, but browser-side logging lags a little.

Dashboards are very useful for quick insights, but can be time consuming to create, and the learning curve is steep. Documentation is vast, but not as detailed as I'd like.

How has it helped my organization?

The solution has made logging and tracing a lot easier, and the RUM sessions are something we did not have previously. Datadog’s real-time alerting and anomaly detection help reduce downtime by allowing us to identify and address performance issues quickly. 

The platform’s intelligent alert system minimises noise, ensuring our team focuses on critical incidents. This results in faster Mean Time to Resolution (MTTR), improving service availability.

It consolidates monitoring for infrastructure, applications, logs, and security into a single platform. This enables us to view and analyse data across the entire stack in one place, reducing the time spent jumping between tools.

What is most valuable?

Real user monitoring has made triaging any possible bugs our users might face a lot easier. RUM tracks actual user interactions, including page load times, clicks, and navigation flows. This gives our organization a clear picture of how our users are experiencing our application in real-world conditions, including slow-loading pages, errors, and other performance issues that affect user satisfaction. We can then easily prioritize these and make sure we offer our users the best possible experience.

What needs improvement?

I'm not sure if this is on Datadog, however, Vercel integration is very limited. 

They need to offer better/more customization of which logs we get, and making tracing possible on Edge runtime logs is a real requirement. It is extremely difficult, if not completely impossible, to get working traces and logs displayed in Datadog with our stack of Vercel, Next.js, and Datadog. This is a very common stack in front-end development, and the difficulty of implementing it is unacceptable. Please do something about it soon. Front-end logs matter.

For how long have I used the solution?

I've used the solution for a little over a year.