Available now

Grafana + OnCallReady: AI Incident Response for Grafana Alerts

OnCallReady plugs in as a Grafana unified contact point — sitting between your dashboards and your on-call rotation, triaging every alert with AI and autonomously resolving the incidents that don't need a human before any notification fires.

How it works

Grafana's alerting engine evaluates rules and fires webhooks. Configure OnCallReady as the first contact point in your notification policy. We resolve what we can and only pass through what genuinely needs escalation.

┌─────────────────────────────────────────────────────────────────┐ Grafana Alert Rule fires (every 1m evaluation) └─────────────────────────────────────┬───────────────────────────┘ │ webhook POST ┌─────────────────────────────────────────────────────────────────┐ OnCallReady (contact point) AI triage → runbook match → execute fix → verify health ✓ Resolved → annotate dashboard + post to Slack ✗ Unresolved → forward to next contact point (PagerDuty/email) └─────────────────────────────────────────────────────────────────┘

Signal → Action table

Grafana alert Runbook triggered Autonomous action
node_filesystem_avail_bytes < 10% Disk Full Remediation Purge logs, temp files; annotate panel with resolution timestamp
node_memory_MemAvailable_bytes < 5% Memory Exhaustion Drop caches, restart leaking service, confirm memory normalizes
container_cpu_usage_seconds_total spike CPU Spike Identify container, throttle or scale, verify CPU drop
kube_pod_status_phase != Running Service Restart & Recovery Describe pod, capture logs, force restart, confirm Running state
probe_ssl_earliest_cert_expiry < 14 days SSL Certificate Renewal ACME renewal, deploy new cert, web server reload
Custom threshold on any Grafana panel Pattern-matched runbook Regex match on alert title → nearest runbook or escalate

Setup in 3 minutes

In Grafana: Alerting → Contact points → Add contact point. Choose Webhook and paste the URL below. Then add it to your notification policy before your existing PagerDuty or email contact.

Grafana contact point config
# Grafana → Alerting → Contact points → New contact point Name: oncallready Type: Webhook URL: https://oncallready.polsia.app/api/alerts # Optional: pass your API key in the header Headers: Authorization: Bearer YOUR_API_KEY # Notification policy (Alerting → Notification policies) # Add oncallready BEFORE your PagerDuty/email contact: - contact_point: oncallready continue: false # set true to always notify next contact too # OnCallReady resolves and annotates the originating panel. # If resolution fails, it passes control to your next policy.

What stays on-call

AI triage handles the majority. These still escalate to a human:

Related

Grafana usually sits on top of Prometheus. Connect both ends of the pipeline:

Your Grafana alerts deserve better than a 3 AM page

Connect in 3 minutes. Free plan, no credit card required.