Emergency Hotfix
Workflow 5 of 5. Something is down in production right now. This is the calm checklist — not the place to improvise. The whole workflow is built around one idea: get production working again as fast as safely possible, then make the fix permanent so it survives the next release.
First, breathe. Most “emergencies” are smaller than they feel in the moment, and a panicked deploy usually makes things worse. The two pages below give you the only two moves you need: fix forward (patch it) or roll back (restore the last good release).
Is this actually an emergency?
A real emergency is narrow. If it’s not one of these, close this playbook and use Continuous Deploy instead; the normal, staged path is safer.
| TRUE emergency (use this workflow) | NOT an emergency (use Continuous Deploy) |
|---|---|
| Production site is down | A minor UI bug |
| Data is being corrupted or lost | A feature request from a customer |
| A security breach in progress | “I want this deployed today” |
| Payments failing — users can’t pay | Anything that can wait for staging |
Fix forward, or roll back?
Once you’ve confirmed it’s real, there’s exactly one decision: can you fix it in under an hour, or do you need production working now? This flow routes you to the right page.
flowchart TD
    A{Production is broken.<br/>Is it a TRUE emergency?} -->|No| N[Use Continuous Deploy]
    A -->|Yes| B{Was it the LAST deploy?}
    B -->|Yes| R[Roll back:<br/>dep rollback, 30 sec]
    B -->|No| C{Can you fix it<br/>in under 1 hour?}
    C -->|Yes| H[Hotfix:<br/>patch and deploy]
    C -->|No| R

The rule of thumb: if the last deploy broke it, roll back first (it’s a 30-second symlink switch), then fix properly afterward. If the breakage is older or you already know the one-line fix, hotfix forward.
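The routing above can also be read as a short chain of checks, each asked only if the previous one passes. The sketch below is illustrative only: the `Action` enum, the `choose_action` function, and its parameter names are hypothetical and not part of any tool this playbook uses.

```python
from enum import Enum


class Action(Enum):
    """Possible outcomes of the decision flow (names are illustrative)."""
    CONTINUOUS_DEPLOY = "use Continuous Deploy"
    ROLL_BACK = "roll back"
    HOTFIX = "hotfix forward"


def choose_action(true_emergency: bool,
                  caused_by_last_deploy: bool,
                  fixable_within_hour: bool) -> Action:
    """Walk the flowchart top to bottom and return the page to follow."""
    if not true_emergency:
        # Not an emergency: the normal, staged path is safer.
        return Action.CONTINUOUS_DEPLOY
    if caused_by_last_deploy:
        # The last deploy broke it: restore the last good release first.
        return Action.ROLL_BACK
    if fixable_within_hour:
        # Older breakage with a known, quick fix: patch and deploy.
        return Action.HOTFIX
    # Older breakage with no quick fix: production needs to work now.
    return Action.ROLL_BACK


# Example: an older bug with a known one-line fix.
print(choose_action(True, False, True).value)  # prints "hotfix forward"
```

Note that rolling back appears twice: it is the answer both when the last deploy caused the breakage and when no fix can land within the hour.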
The two pages
Ready? Start with the fix-forward path, which links straight to rollback if the fix can’t land in time: