Documentation/Bushido/Bushido/ skills /debugging

📖 debugging

Use when investigating bugs, diagnosing issues, or understanding unexpected behavior. Provides systematic approaches to finding root causes.



Overview

Systematic approaches to investigating and diagnosing bugs.

Core Principle

Understand before fixing. A proper diagnosis leads to a proper fix.

The Scientific Method for Debugging

1. Observe

Gather all the facts:

  • What's the symptom? (What's happening that shouldn't?)
  • When does it happen? (Always, sometimes, specific conditions?)
  • Who's affected? (All users, some users, specific scenarios?)
  • Error messages? (Exact text, stack traces, error codes?)
  • Recent changes? (What changed before this started?)

Evidence to collect:

  • Error messages and stack traces
  • Application logs
  • User reports
  • Reproduction steps
  • Environment details (browser, OS, versions)
  • Network requests/responses
  • Database query logs

2. Form Hypothesis

Based on symptoms, what could cause this?

Common categories:

  • Logic error: Code does wrong thing
  • State management: State gets out of sync
  • Async/timing: Race condition, callback hell
  • Data issue: Unexpected input format
  • Integration: API change, service down
  • Environment: Config, permissions, network
  • Resource: Memory leak, connection pool exhausted

Prioritize hypotheses:

  1. Most likely causes first
  2. Easiest to test first (when equal likelihood)
  3. Most impactful if true

3. Test Hypothesis

Design experiment to prove/disprove:

  • Add logging to see values
  • Add breakpoints to pause execution
  • Modify input to isolate variable
  • Disable feature to rule out
  • Compare with working version

Keep notes:

**Hypothesis:** Database query timeout
**Test:** Add query timing logs
**Result:** Query completes in 50ms
**Conclusion:** Not the database ❌

**Hypothesis:** Network latency
**Test:** Check network tab, add timing
**Result:** API call takes 5 seconds
**Conclusion:** Found the issue ✅

4. Analyze Results

What did you learn?

  • Hypothesis confirmed or rejected?
  • New questions raised?
  • Unexpected findings?
  • Root cause identified?

5. Repeat or Conclude

If root cause found:

  • Document findings
  • Estimate impact
  • Plan fix

If not found:

  • Form new hypothesis
  • Repeat cycle

Debugging Strategies

Strategy 1: Add Logging

Most universally useful technique:

// Strategic console.log placement
function processOrder(order) {
  console.log('processOrder START:', { orderId: order.id })

  const items = order.items
  console.log('items:', items.length)

  const validated = validate(items)
  console.log('validation result:', validated)

  if (!validated.success) {
    console.log('validation failed:', validated.errors)
    throw new Error('Invalid order')
  }

  const total = calculateTotal(items)
  console.log('total calculated:', total)

  console.log('processOrder END')
  return total
}

Logging guidelines:

  • Log function entry/exit
  • Log branching decisions
  • Log external calls (API, database)
  • Log unexpected values
  • Include context (IDs, user info)

Strategy 2: Use Debugger

Interactive debugging:

// Browser
function buggyFunction(input) {
  debugger;  // Execution pauses here
  const result = transform(input)
  debugger;  // And here
  return result
}

// Node.js
node --inspect app.js
# Then open chrome://inspect in Chrome

Debugger features:

  • Step over (next line)
  • Step into (into function call)
  • Step out (back to caller)
  • Watch expressions
  • Call stack inspection
  • Variable inspection

Strategy 3: Binary Search

Isolate the problem area:

// 100 lines of code, bug somewhere

// Comment out lines 50-100
// Bug still happens? It's in lines 1-50

// Comment out lines 25-50
// Bug disappears? It's in lines 25-50

// Comment out lines 37-50
// Bug still happens? It's in lines 25-37

// Continue until isolated to specific lines

Strategy 4: Rubber Duck Debugging

Explain the problem out loud:

  1. "This function is supposed to calculate shipping cost"
  2. "It takes the weight and destination"
  3. "First it... wait, it's using price instead of weight!"
  4. (Bug found)

Why this works: Forces you to examine assumptions.

Strategy 5: Compare Working vs Broken

What's different?

Version comparison:

# Find which commit broke it
git bisect start
git bisect bad HEAD
git bisect good v1.0.0
# Git checks out middle commit
npm test
git bisect good/bad
# Repeat until found

Environment comparison:

  • Works locally but not production?
  • Works for some users but not others?
  • Worked yesterday but not today?

What changed?

Strategy 6: Simplify

Reduce to minimal reproduction:

// Complex case with bug
processUserOrderWithDiscountsAndShipping(user, cart, promo, address)

// Simplify inputs one at a time
processUserOrderWithDiscountsAndShipping(user, [], null, null)
// Still breaks? Not discount or address

processUserOrderWithDiscountsAndShipping(null, [], null, null)
// Works now? It's the user object

// What about the user object causes it?

Strategy 7: Check Assumptions

Question everything:

// Assumption: API returns array
const users = await api.getUsers()
users.forEach(...)  // Crashes

// Check assumption
console.log(typeof users)  // "undefined"
console.log(users)          // undefined

// Assumption was wrong!

Common wrong assumptions:

  • Function returns expected type
  • Variable is defined
  • Array is not empty
  • API will always respond
  • Async operation has completed
  • State is up to date

Debugging by Symptom

"Intermittent failure"

Likely causes:

  • Race condition (timing-dependent)
  • Data-dependent (certain inputs trigger it)
  • Resource leak (happens after N operations)
  • External service flakiness

Investigation:

  • Add extensive logging
  • Look for async operations
  • Check timing between operations
  • Look for shared state
  • Run many times to see pattern

"Works locally, fails in production"

Check differences:

  • Environment variables
  • Data (production has different/more data)
  • Network (CORS, SSL, proxies)
  • Dependencies (versions, OS)
  • Resources (memory, connections)

"Slow performance"

Don't guess - profile:

Frontend:

  • Chrome DevTools > Performance tab
  • Look for long tasks (> 50ms)
  • Check for layout thrashing
  • Look for memory leaks

Backend:

  • Add timing logs around operations
  • Check database query time (EXPLAIN ANALYZE)
  • Check external API call time
  • Profile with APM tool

"Memory leak"

Investigation:

// Take heap snapshot
// Do operation that leaks
// Take another heap snapshot
// Compare - what increased?

Common causes:

  • Event listeners not removed
  • Closures holding references
  • Global variables accumulating
  • Intervals not cleared
  • Cache growing unbounded

"Crash/Exception"

Read the stack trace:

Error: Cannot read property 'map' of undefined
    at processUsers (app.js:42:15)
    at handleRequest (app.js:23:3)
    at Server.<anonymous> (server.js:12:5)

Stack trace tells you:

  • Line 42: Where it crashed
  • Line 23: Where it was called from
  • Line 12: Origin of the request

Then:

  • Go to line 42
  • Check what's undefined
  • Trace back why it's undefined

Common Bug Patterns

Null/Undefined

// Bug
function process(user) {
  return user.name.toUpperCase()  // Crashes if user is null
}

// Investigation
console.log('user:', user)  // undefined - why?
// Trace back to where user comes from

Off-by-One

// Bug
for (let i = 0; i <= array.length; i++) {  // <= instead of <
  process(array[i])  // Crashes on last iteration
}

// Investigation
console.log('i:', i, 'length:', array.length)
// Notice i === array.length causes array[i] === undefined

Async Timing

// Bug
let data
fetchData().then(result => {
  data = result
})
console.log(data)  // undefined - async not complete

// Investigation
console.log('1. Before fetch')
fetchData().then(result => {
  console.log('3. Got result')
  data = result
})
console.log('2. After fetch call')
// Output: 1, 2, 3 - async completes later

State Mutation

// Bug
function addItem(cart, item) {
  cart.items.push(item)  // Mutates input!
  return cart
}

const originalCart = { items: [] }
const newCart = addItem(originalCart, item)
// originalCart was modified - unexpected!

// Investigation
console.log('before:', originalCart)
const newCart = addItem(originalCart, item)
console.log('after:', originalCart)  // Changed!

Scope Issues

// Bug
for (var i = 0; i < 3; i++) {
  setTimeout(() => console.log(i), 100)
}
// Prints: 3, 3, 3 (expected 0, 1, 2)

// Investigation
// var is function-scoped, i is shared
// By time timeout fires, loop is done, i === 3

// Fix: Use let (block-scoped) or capture i

Debugging Tools

Browser Developer Tools

Console:

  • console.log() - Print values
  • console.table() - Display arrays/objects as table
  • console.trace() - Print stack trace
  • console.time() / console.timeEnd() - Measure duration

Debugger:

  • Set breakpoints
  • Step through code
  • Inspect variables
  • Watch expressions
  • Call stack

Network:

  • View all requests
  • See request/response headers and bodies
  • Measure timing
  • Replay requests

Performance:

  • Record profile
  • See function call tree
  • Identify bottlenecks
  • Check memory usage

Command Line Tools

# Search for text in files
grep -r "error" logs/

# Follow log file
tail -f logs/app.log

# Search with context
grep -B 5 -A 5 "ERROR" logs/app.log

# Find large files
du -sh *

# Check disk space
df -h

# Check memory
free -m

# Check running processes
ps aux | grep node

Database Debugging

-- PostgreSQL
EXPLAIN ANALYZE SELECT * FROM users WHERE email = 'test@example.com';

-- Show slow queries
SELECT * FROM pg_stat_statements ORDER BY total_time DESC LIMIT 10;

-- Check table size
SELECT pg_size_pretty(pg_total_relation_size('users'));

-- Check indexes
\d users

Debugging Checklist

Before Starting

  • Can reproduce the issue reliably?
  • Have error message or symptom description?
  • Know when it started happening?
  • Checked if recent changes related?
  • Checked logs for clues?

During Investigation

  • Formed clear hypothesis?
  • Testing hypothesis systematically?
  • Taking notes on findings?
  • Not making random changes hoping to fix?
  • Questioning assumptions?

After Finding Root Cause

  • Understand WHY it happens?
  • Can explain it to someone else?
  • Documented findings?
  • Estimated impact?
  • Identified proper fix?

Anti-Patterns

❌ Random Code Changes

BAD: "Maybe if I change this... nope, try this... nope, try this..."
GOOD: "Hypothesis: X causes Y. Test: Change X. Result: Y still happens.
       Conclusion: X is not the cause."

❌ Assuming Without Verifying

BAD: "The API must be returning valid data"
GOOD: "Let me log the API response to see what it actually returns"

❌ Stopping at Symptoms

BAD: "The page is blank. Fixed by adding a null check."
GOOD: "The page is blank because user is null. User is null because
       authentication token expired. Root cause: token not being refreshed."

❌ Debugging in Production

BAD: "Let me add console.log to production to see..."
GOOD: "Let me reproduce locally and debug there, or use proper logging"

❌ No Reproduction Steps

BAD: "It crashed once, let me guess why"
GOOD: "Let me find reliable way to reproduce it first"

Integration with Other Skills

  • Use proof-of-work skill to document evidence
  • Use test-driven-development skill to add regression test after fix
  • Use explainer skill when explaining bug to others
  • Use boy-scout-rule skill while fixing (improve surrounding code)

Remember

  1. Reproduce first - If you can't reproduce, you can't debug
  2. Gather evidence - Don't guess, look at data
  3. Form hypothesis - What do you think is wrong?
  4. Test systematically - Prove or disprove hypothesis
  5. Find root cause - Not just symptoms
  6. Document - Help future you and others

Debugging is detective work. Be methodical, not random.