Overview
We deploy automatic node health detection and remediation on Red Hat OpenShift to reduce mean time to recovery (MTTR) and prevent cluster instability. The solution combines Node Health Check (NHC) with remediation engines, such as Self Node Remediation (SNR) and/or Machine Health Check (MHC), and applies operational guardrails (budgets, maintenance windows, security policies). Optional integrations with ServiceNow (change/incident) and Dynatrace (alerts, evidence) close the loop for runtime operations and auditability.
Highlights
- Node health policies: Conditions (e.g., NotReady/Unreachable, MemoryPressure, DiskPressure), exclusion labels (infra/control-plane), and canary mode for safe rollout
- Observability & ITSM (optional): Incidents/changes in ServiceNow with evidence; problems/events and SLOs in Dynatrace; operational runbooks with Ansible
- Practical validation: Controlled chaos testing to prove remediation is fast, safe, and auditable
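As a rough illustration of how the node health policies above could fit together, the following Python sketch evaluates which nodes would be eligible for remediation. It is not the NHC operator's API: the `Node` class, `plan_remediation`, the `min_healthy` budget parameter, and the `example.com/remediation-canary` label are all hypothetical names chosen for this example. It assumes that NotReady/Unreachable map to the `Ready` condition being `False`/`Unknown`, and it models canary mode as limiting remediation to nodes that carry an opt-in label.

```python
from dataclasses import dataclass, field

# Hypothetical policy constants for illustration only.
EXCLUDED_ROLE_LABELS = frozenset({
    "node-role.kubernetes.io/control-plane",
    "node-role.kubernetes.io/infra",
})
CANARY_LABEL = "example.com/remediation-canary"  # assumed opt-in label

# Condition type -> statuses that count as unhealthy.
UNHEALTHY_CONDITIONS = {
    "Ready": {"False", "Unknown"},   # NotReady / Unreachable
    "MemoryPressure": {"True"},
    "DiskPressure": {"True"},
}


@dataclass
class Node:
    name: str
    labels: set = field(default_factory=set)
    conditions: dict = field(default_factory=dict)  # condition type -> status


def is_excluded(node):
    """True if the node carries any excluded role label."""
    return bool(node.labels & EXCLUDED_ROLE_LABELS)


def is_unhealthy(node):
    """True if any watched condition is in an unhealthy status."""
    return any(
        node.conditions.get(ctype) in bad_statuses
        for ctype, bad_statuses in UNHEALTHY_CONDITIONS.items()
    )


def plan_remediation(nodes, min_healthy, canary=False):
    """Return names of nodes to remediate, or [] if the budget blocks it."""
    candidates = [n for n in nodes if not is_excluded(n)]
    unhealthy = [n for n in candidates if is_unhealthy(n)]
    healthy_count = len(candidates) - len(unhealthy)
    # Budget guardrail: with too few healthy nodes, a wider outage is
    # likely, so automatic remediation is skipped entirely.
    if healthy_count < min_healthy:
        return []
    if canary:
        unhealthy = [n for n in unhealthy if CANARY_LABEL in n.labels]
    return [n.name for n in unhealthy]
```

Calling `plan_remediation(nodes, min_healthy=1, canary=True)` during an initial rollout would restrict action to labelled nodes; dropping `canary` widens it to every non-excluded unhealthy node, still subject to the budget.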
Pricing
Custom pricing options
Support
Vendor support
At Extreme Digital Solutions, we support our clients at every stage of their project. Our dedicated team of experts is available to answer questions, provide guidance, and offer ongoing technical support during and after implementation. We value long-term relationships with our clients and are ready to help them resolve any challenges or issues that arise. Our goal is customer satisfaction through prompt, professional, and personalized support for the ongoing success of their cloud operations. Contact: comercial@extremedigital.com.br