Report sections

Summary

A concise paragraph answering: “What happened?” Read this first to quickly orient yourself and share context with teammates who weren’t on the initial response.

Root cause

The single most likely technical explanation for the incident. Example:
“A misconfigured circuit breaker in the payment-service deployment (v2.4.1, deployed at 14:32 UTC) caused downstream timeouts to cascade into the checkout-service, resulting in 503 errors for 8% of users.”
This is what you should fix. If you disagree with the root cause, re-run the analysis after adding more context (e.g. linking the correct deployment via Change Intelligence).

Contributing factors

Secondary issues that didn’t cause the incident but made it worse, harder to detect, or harder to fix. These are candidates for follow-up improvements. Examples:
  • “No circuit breaker timeout on the checkout-service calling payment-service”
  • “Error rate alert threshold was set too high, delaying detection by 12 minutes”
  • “Deployment lacked a canary stage; 100% of traffic hit the bad version immediately”

Impact

What was affected, for how long, and how severely:
  • Affected services
  • Error rate increase (e.g. “5xx errors increased from 0.1% to 8.3%”)
  • Duration from first alert to resolution
  • Whether the status page was updated
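
As a rough illustration of how the error rate and duration figures above can be derived from raw counts and timestamps, here is a minimal Python sketch. The function names, request counts, and timestamps are all hypothetical and chosen only to reproduce the example figures; the report may compute these values differently.

```python
from datetime import datetime, timezone


def error_rate_percent(errors: int, total: int) -> float:
    """Return errors as a percentage of total requests."""
    if total <= 0:
        raise ValueError("total must be a positive request count")
    return 100.0 * errors / total


def duration_minutes(first_alert: datetime, resolved: datetime) -> float:
    """Return the minutes elapsed from first alert to resolution."""
    if resolved < first_alert:
        raise ValueError("resolved must not be earlier than first_alert")
    return (resolved - first_alert).total_seconds() / 60


# Hypothetical sample data: 10,000 requests in each comparison window.
before = error_rate_percent(errors=10, total=10_000)
during = error_rate_percent(errors=830, total=10_000)

# Hypothetical timestamps for the first alert and the resolution.
first_alert = datetime(2024, 1, 1, 14, 44, tzinfo=timezone.utc)
resolved = datetime(2024, 1, 1, 15, 30, tzinfo=timezone.utc)
minutes = duration_minutes(first_alert, resolved)

impact_line = (
    f"5xx errors increased from {before:.1f}% to {during:.1f}% "
    f"over {minutes:.0f} minutes"
)
print(impact_line)
```

Using timezone-aware timestamps avoids off-by-hours errors when alert and resolution times come from systems in different regions.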

Recommendations

Specific, actionable items ranked by priority. Each recommendation maps to either the root cause fix or a contributing factor improvement.
  • Fix the root cause — concrete steps to resolve the issue
  • Prevent recurrence — structural changes (circuit breakers, better timeouts, canary deployments)
  • Improve detection — alert tuning, runbook links, SLO adjustments

Impact score

A 0–1 numeric score combining severity (50% weight), signal type (30%), and duration (20%). Use it to prioritize postmortems; the score ranges below map to suggested actions.
Score range    Suggested action
0.8–1.0        Full postmortem, executive summary
0.5–0.79       Internal postmortem, action items tracked
0.3–0.49       Short incident review, recommendations logged
< 0.3          Document in incident timeline; no formal postmortem needed
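
The weighting and the score ranges can be sketched in a few lines of Python. This is an illustration only: it assumes each component has already been normalized to a 0–1 value, which the text does not specify, and the names `impact_score` and `suggested_action` are hypothetical. Only the weights and the range boundaries come from this section.

```python
# Weights as stated in the text: severity 50%, signal type 30%, duration 20%.
WEIGHTS = {"severity": 0.5, "signal_type": 0.3, "duration": 0.2}


def impact_score(severity: float, signal_type: float, duration: float) -> float:
    """Weighted sum of three components, each assumed to be normalized to 0-1."""
    components = {
        "severity": severity,
        "signal_type": signal_type,
        "duration": duration,
    }
    for name, value in components.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")
    return sum(WEIGHTS[name] * value for name, value in components.items())


def suggested_action(score: float) -> str:
    """Map a score to the suggested action for its range.

    Assumption: a score that falls between two listed ranges
    (for example 0.795) is assigned to the lower range.
    """
    if score >= 0.8:
        return "Full postmortem, executive summary"
    if score >= 0.5:
        return "Internal postmortem, action items tracked"
    if score >= 0.3:
        return "Short incident review, recommendations logged"
    return "Document in incident timeline; no formal postmortem needed"


# Hypothetical component values for a single incident.
score = impact_score(severity=0.9, signal_type=0.5, duration=0.4)
action = suggested_action(score)
print(f"{score:.2f}: {action}")
```

With these sample inputs the weighted sum is 0.45 + 0.15 + 0.08 = 0.68, which falls in the 0.5–0.79 range.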

Acting on the report

  1. Share the summary in your incident Slack channel or status page update.
  2. Assign the root cause fix as an incident action item.
  3. Schedule the contributing factor improvements in your sprint backlog.
  4. Create a postmortem if the impact score warrants it — click Create postmortem from the RCA detail page.