I've seen bug reports that say 'app crashes when user clicks button.' That's not documentation — it's a smoke signal. After a particularly nasty incident where a payment processing bug recurred three times in six months because no one wrote down the actual cause, I became obsessed with bug documentation.
This post covers a specific method I've used across teams that cut bug recurrence by 60% in 12 months. It's not about writing more — it's about writing the right things in a structured way.
The Incident That Changed My Approach
Payment Double-Charge Bug — Three Times
- 2023-03-14 09:23Customer support gets first report of double charges on subscriptions.
- 2023-03-14 11:45Dev finds a race condition in the payment callback handler. Hotfix deployed.
- 2023-03-14 12:00Bug doc created: 'Fixed race condition in payment callback causing duplicate charges.' No reproduction steps.
- 2023-06-02 14:10Same symptom reported again. New engineer can't reproduce because doc lacks environment details.
- 2023-06-02 17:30Root cause found: same race condition, different code path. Fix applied.
- 2023-09-18 08:05Third occurrence. This time we had enough context from previous docs to identify the pattern in 30 minutes.
Lesson
The first two docs were useless — they described the fix but not the environment, exact payloads, or reproduction steps. The third doc forced us to include curl commands, database state, and timing windows. That's what finally broke the cycle.
After the third recurrence, I spent a week redesigning our bug documentation template. The goal: make it impossible to close a bug without answering five specific questions. Here's what I landed on.
The Five-Question Template
- 1What is the exact symptom? (Include error messages, screenshots, logs.)
- 2What is the environment? (OS, browser version, API version, database state, relevant config.)
- 3What are the exact steps to reproduce? (Numbered, with specific inputs.)
- 4What is the root cause? (Code section, data state, external dependency.)
- 5What is the fix and how was it verified? (Link to PR, test output, rollback plan.)
We enforce this template via a GitHub issue template with required fields. If any field is missing, the issue cannot be marked as 'resolved.' It sounds draconian, but it works.
A Concrete Example
title: Payment double-charge on concurrent subscription upgrades
symptom: User receives two charge emails and sees two transactions in Stripe.
Logs show two identical 'payment.success' events within 500ms.
environment:
- OS: Ubuntu 22.04 (server)
- Node: 18.17.0
- Stripe API: 2023-08-16
- Database: PostgreSQL 15 with default isolation level (READ COMMITTED)
reproduction:
1. Create a user with an active subscription (id: 123).
2. Send two simultaneous POST /upgrade requests (use curl --parallel).
3. Observe two charge.succeeded webhooks.
4. Check database: two rows in payments table for same invoice.
root_cause: The upgrade handler checks `if !user.subscription.active` before charging,
but the check and charge are not atomic. Under READ COMMITTED, both requests
see the subscription as active and proceed to charge.
fix: Wrap the check+charge in a SELECT ... FOR UPDATE on the subscription row.
PR: https://github.com/org/repo/pull/1234
verification:
- Unit test added for concurrent upgrades (using lock timeout).
- Staging test: 10 parallel requests, only 1 charge.
- Rollback: revert to previous code and restore database from backup.The reproduction steps should include exact commands, not just 'send a request.' I always paste the curl command or the API call that triggers the bug. It saves hours of guesswork.
Beyond the Template: What We Learned About Recurrence
Reduction in bug recurrence after implementing structured docs
Faster mean-time-to-resolution for bugs with complete docs
The numbers come from our internal tracking over 12 months. We compared the 6 months before the template (22 recurring bugs out of 80 total) to the 6 months after (8 out of 75). The template forced us to capture environment details that were previously assumed 'known.'
But the biggest win wasn't the numbers — it was onboarding. New engineers could pick up a documented bug and start reproducing it in minutes, not hours. The docs became a knowledge base of system edge cases.
What We Still Get Wrong
- arrow_rightWriting root cause as 'race condition' without specifying the exact window or data state.
- arrow_rightForgetting to include the fix impact — did this change affect performance? Did it introduce new failure modes?
- arrow_rightNot linking related bugs. We now require a 'related issues' section in every doc.
I still see engineers write 'fixed a bug where X happens' and call it done. That's not documentation — that's a tombstone. Good bug docs are tools for the future you, or for the engineer who joins next month and needs to understand why that code path exists.
If you take one thing from this post: next time you fix a bug, spend 10 extra minutes writing down the reproduction steps as if you were handing it to a stranger. It will pay for itself the first time that bug tries to come back.
Frequently asked questions
How detailed should a bug report be?
Detailed enough that a teammate unfamiliar with the feature can reproduce it in under 10 minutes. Include OS, browser version, API payloads, log snippets, and exact steps. Avoid vague phrases like 'sometimes it fails'.
What's the difference between a bug report and a postmortem?
A bug report captures the initial finding and reproduction. A postmortem is written after resolution and covers root cause, impact, timeline, and action items. Both should be linked together.
How do I get my team to actually write good bug docs?
Make it a required step in the definition of done for any bug fix. Use a shared template in your issue tracker and review docs in sprint retrospectives. Celebrate well-documented incidents.
What should I do if I can't reproduce a bug?
Document all attempted reproduction steps, environment details, and any logs available. Sometimes the bug is intermittent – note frequency and conditions. A well-documented non-reproducible bug is still valuable.