Ch.8: Let It Crash | Honest Code

Demo: Error recovery state: where does it live?

☠ Try-catch recovery: 6 address spaces

Thread Stack: Stack frames, catch block references, finally cleanup pointers

Process Heap: Exception objects allocated, wrapped, re-wrapped at each catch level

Redis (external key-value store): Retry counter, circuit breaker state, backoff timer

RDBMS (transaction log): Partial writes pending rollback, WAL entries

Filesystem (application log): Swallowed exception written to disk "for later investigation"

Message Broker (dead letter queue): Failed message requeued with retry metadata and backoff schedule

✦ Let it crash: 2 address spaces

Isolated Process (stack + heap): All state lives in the process address space. When the process dies, the OS reclaims everything. Nothing to clean up.

Supervisor Process: Monitors child process via OS signals. On exit, spawns a replacement. The entire recovery strategy is one conditional.

What you just saw

The left panel is what happens when you try to recover from an error using try-catch. The error originates on the call stack. An exception object is allocated on the heap. The catch block checks Redis for the retry count. It increments the circuit breaker. It attempts to roll back the partial database write. It logs the swallowed exception. It pushes a retry message to the dead letter queue. Six address spaces, touched in sequence, each one a place where recovery state can go wrong, become stale, or be forgotten.
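The sequence above can be sketched in a few lines of Python. This is a minimal illustration, not the demo's source: the `Infrastructure` class, its fields, and both function names are hypothetical, and plain dicts and lists stand in for Redis, the database, the log file, and the broker.

```python
class Infrastructure:
    """In-memory stand-ins for the systems in the left panel (all hypothetical)."""

    def __init__(self):
        self.redis = {"retries": 0, "breaker_failures": 0}
        self.db_pending = []     # rows written but not yet committed
        self.log_file = []       # application log lines
        self.dead_letters = []   # messages parked for a later retry
        self.touched = []        # every place recovery had to visit


def process(message, infra):
    infra.db_pending.append(message)  # partial write happens before the failure
    return int(message)               # raises ValueError on bad input


def handle_with_recovery(message, infra):
    try:
        return process(message, infra)
    except ValueError as exc:
        # 1. The stack unwinds to this handler.
        infra.touched.append("thread stack")
        # 2. A wrapper exception is allocated around the original one.
        wrapped = RuntimeError(f"processing failed: {message!r}")
        wrapped.__cause__ = exc
        infra.touched.append("process heap")
        # 3. Retry counter and circuit breaker state are updated.
        infra.redis["retries"] += 1
        infra.redis["breaker_failures"] += 1
        infra.touched.append("redis")
        # 4. The partial database write is rolled back.
        infra.db_pending.clear()
        infra.touched.append("rdbms")
        # 5. The swallowed exception is written to the log.
        infra.log_file.append(repr(wrapped))
        infra.touched.append("filesystem")
        # 6. The message is requeued with retry metadata.
        infra.dead_letters.append({"body": message, "retry": infra.redis["retries"]})
        infra.touched.append("message broker")
        return None
```

Each numbered step is a separate write that has to succeed, in order, for the system to end up consistent. If step 4 fails, steps 5 and 6 never run and the Redis counters from step 3 are already changed.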

The right panel is the let-it-crash alternative. The process fails. It dies. All its state dies with it: nothing to roll back, nothing to clean up, nothing to log. The supervisor notices the process is gone and starts a fresh one. Two address spaces. Two events. The new process starts from a known-good state because it has never seen the bad input that killed its predecessor.
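A supervisor of this kind might look like the following Python sketch. It rests on several assumptions that are not from the demo: the worker script, `spawn_worker`, `reap`, and `supervise` are hypothetical names, the worker talks over stdin/stdout pipes, and the supervisor detects death by end-of-file on the worker's output rather than by OS signals. The payload that killed the worker is dropped, not retried.

```python
import subprocess
import sys

# Hypothetical worker: doubles each integer it reads from stdin.
# It has no error handling; a non-numeric line raises ValueError and kills it.
WORKER_SOURCE = """
import sys
for line in sys.stdin:
    print(int(line) * 2, flush=True)
"""


def spawn_worker():
    """Start a fresh worker process with empty, known-good state."""
    return subprocess.Popen(
        [sys.executable, "-c", WORKER_SOURCE],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,  # the crash traceback is discarded in this sketch
        text=True,
    )


def reap(worker):
    """Close our ends of the pipes and collect the exit status."""
    worker.stdin.close()
    worker.stdout.close()
    worker.wait()


def supervise(payloads):
    """Feed payloads to a worker, replacing it whenever it dies."""
    worker = spawn_worker()
    results, restarts = [], 0
    for payload in payloads:
        worker.stdin.write(payload + "\n")
        worker.stdin.flush()
        reply = worker.stdout.readline()
        if reply == "":  # EOF on stdout: the worker is gone
            reap(worker)
            worker = spawn_worker()  # the whole recovery strategy
            restarts += 1
            continue
        results.append(int(reply))
    reap(worker)
    return results, restarts
```

Calling `supervise(["1", "oops", "3"])` returns `([2, 6], 1)`: the first worker dies on `"oops"`, and its replacement handles `"3"` without ever having seen the bad input. The supervisor holds no retry counters, no rollback logic, and no knowledge of why the worker failed.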

Every catch block is a bet that you can anticipate the failure mode and write correct recovery logic for it. The more catch blocks you write, the more recovery state you scatter across your infrastructure. The let-it-crash model bets that a fresh start from a known-good state is safer than trying to recover from an unknown-bad state. Thirty years of Erlang in telecom systems proved that bet correct.

How this works: This demo visualizes error recovery complexity, not timing. The try/catch side touches 6 address spaces during recovery (thread stack, heap, Redis, RDBMS, filesystem, message broker). The let-it-crash side touches 2 (process, supervisor). The counts are architectural facts, not measurements. A separate Java harness confirms the cost: try/catch recovery is 16.8× slower (timed in nanoseconds). Source: github.com/adamzwasserman/honest-code-traces