When disk usage breaches threshold, OnCallReady identifies the top space consumers, rotates stale logs, purges temp files, and verifies the volume is healthy — before any human is paged.
Fires on any alert containing disk-related keywords paired with severity indicators. Matches Datadog, CloudWatch, Prometheus, PagerDuty, and custom webhook payloads. Typical triggers: "Disk usage at 92% on prod-web-03", "Volume /data is full", "CRITICAL: Disk capacity exceeded on api-server-1".
Extracts host, volume path, and current usage percentage from the alert payload. Determines which filesystem is affected.
Scans /var/log, /tmp, application cache directories, and container overlay layers. Ranks directories by size to target the highest-impact cleanup actions first.
Compresses log files older than 24 hours using gzip. Removes rotated logs beyond the 7-day retention window. Preserves current active log files.
Clears /tmp directories of files untouched for >6 hours. Removes dangling Docker images and stopped containers if Docker is present. Purges package manager caches.
Re-checks disk usage. If below 80%, marks incident resolved and logs the bytes freed. If above 80%, escalates to on-call with full context and remediation log attached.
Connect OnCallReady to your monitoring tools in minutes. No more 3am disk pages.