Write-Ahead Logging: How Databases Survive a Power Cut
How write-ahead logging keeps your data intact when the machine dies mid-write — the log-first rule, fsync, checkpoints, and why PostgreSQL and SQLite both rely on it.
A database commits a transaction, returns OK, and a half-second later someone trips over the power cord. The machine is dead. When it boots back up, the row you just inserted is still there. That is not luck, and it is not magic. It is write-ahead logging doing the one job it exists to do: making a promise survive a crash.
The naive way to store data is to write it straight into the data file at the right offset. The problem is that a single logical change often touches several disk pages — an index entry here, a row there, a free-space map update somewhere else. If the power dies after page one and before page three, you are left with a data file that is internally inconsistent: an index that points at a row that was never written. There is no way to tell, on reboot, whether that file is whole or torn. You have lost the ability to trust your own storage.
The log-first rule
Write-ahead logging fixes this by inverting the order of operations. Before any change is written to the on-disk data pages, the database first writes a description of that change to a separate, append-only file: the log. Only after that log record is safely on disk may the modified page be written back to the data file. Crucially, that write-back can be deferred for a long time.
The rule is in the name. The log is written ahead of the data. A transaction is considered durable the moment its commit record reaches stable storage in the log, not when the data pages are updated. This is the D in ACID — durability — and the log is where it lives.
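To make the ordering concrete, here is a minimal sketch of a log-first write. It is an illustration only, not how any particular engine lays out its log: the JSON line format and the names `durable_append` and `commit_put` are assumptions made for this example, and a plain dict stands in for the data pages.

```python
import json
import os


def durable_append(log_path, record):
    """Append one JSON record to the log and force it to stable storage."""
    line = json.dumps(record) + "\n"
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(line)
        log.flush()             # push Python's buffer down to the OS
        os.fsync(log.fileno())  # ask the OS to push it down to the device


def commit_put(log_path, table, txid, key, value):
    """Log the change and its commit record first, then update the data."""
    durable_append(log_path, {"tx": txid, "op": "put", "key": key, "value": value})
    durable_append(log_path, {"tx": txid, "op": "commit"})
    # The transaction is durable from this point on. The dict stands in for
    # the data pages, which a real engine may write back much later.
    table[key] = value
```

The detail that matters is the position of `os.fsync`: the caller is only told OK after the commit record has been flushed, and the data update comes strictly afterwards.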
The payoff shows up at recovery time. After a crash, the database reads the log from the last known-good checkpoint forward. For every committed transaction whose changes might not have made it into the data files, it replays the log record and reapplies the change. This is the redo pass. For any transaction that was still in flight when the lights went out — a log record with no matching commit — it rolls the change back. This is the undo pass. The canonical formulation of this redo/undo dance is the ARIES algorithm, and most production databases are a variation on its themes.
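The two passes can be sketched in a few lines. This is a heavily simplified take on the redo-then-undo idea, not ARIES itself: the record shape (`tx`, `op`, `key`, `old`, `new`) and the function name `recover` are assumptions for illustration, and it presumes that transactions in flight at the same time never wrote the same key.

```python
def recover(checkpoint_state, log_records):
    """Rebuild key/value state from a checkpoint plus the log written after it.

    Each record is a dict. A "put" carries "tx", "key", "old" and "new";
    a "commit" carries only "tx". "old" is None if the key did not exist.
    """
    state = dict(checkpoint_state)
    committed = {r["tx"] for r in log_records if r["op"] == "commit"}

    # Redo pass: repeat history in log order, whether or not it committed.
    for r in log_records:
        if r["op"] == "put":
            state[r["key"]] = r["new"]

    # Undo pass: walk backwards and restore old values for every
    # transaction that has no commit record.
    for r in reversed(log_records):
        if r["op"] == "put" and r["tx"] not in committed:
            if r["old"] is None:
                state.pop(r["key"], None)
            else:
                state[r["key"]] = r["old"]
    return state
```

For example, with a checkpoint of `{"a": "1"}`, a committed transaction that sets `a` to `"2"`, and an uncommitted one that inserts `b`, recovery yields `{"a": "2"}`.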
Why is replaying the log safe when writing the data directly was not? Because the log is append-only and each record is self-contained. You are never half-updating a structure; you are reading a sequence of “this happened, then this happened” entries and applying them in order. A crash can damage at most the tail of the log, the record that was being written when the power died. That record belongs to a transaction that was never acknowledged, so discarding it breaks no promise.
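One common way to make records self-contained is to frame each one with its length and a checksum, so that a reader can tell a complete record from a half-written one. The article does not say how any specific database frames its records, so treat the layout below (a 4-byte length and a CRC32 in front of each payload) and the names `frame` and `read_valid_records` as assumptions chosen for illustration.

```python
import struct
import zlib

# Assumed header layout: payload length, then CRC32 of the payload.
HEADER = struct.Struct("<II")


def frame(payload: bytes) -> bytes:
    """Prefix a payload with its length and checksum."""
    return HEADER.pack(len(payload), zlib.crc32(payload)) + payload


def read_valid_records(log_bytes: bytes):
    """Return every intact record, stopping at the first torn or corrupt one."""
    records = []
    pos = 0
    while pos + HEADER.size <= len(log_bytes):
        length, crc = HEADER.unpack_from(log_bytes, pos)
        start = pos + HEADER.size
        payload = log_bytes[start:start + length]
        if len(payload) < length or zlib.crc32(payload) != crc:
            break  # torn tail: keep what came before, ignore the rest
        records.append(payload)
        pos = start + length
    return records
```

If the last record is cut short by a crash, the reader returns everything before it and stops, which is exactly the behaviour recovery needs.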
What this looks like in PostgreSQL and SQLite
The concept is universal, but the two databases most developers actually touch implement it in instructively different ways.
PostgreSQL keeps its WAL as a stream of segment files, 16 MB each by default, under pg_wal/. Every change generates a WAL record stamped with a Log Sequence Number (LSN), a monotonically increasing byte position in the log. Periodically the database runs a checkpoint: it flushes the dirty data pages that the log has been describing out to the main data files, then records that the log up to a certain LSN is now fully reflected on disk. Segments before that point are no longer needed for crash recovery and can be recycled. The synchronous_commit setting controls whether a commit waits for the WAL flush. Turn it off and you trade a short window of durability for throughput: a crash can lose the most recent commits, though it will not corrupt the database. That is a legitimate choice for data you can afford to lose.
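Because an LSN is a byte position, you can do arithmetic on it. The sketch below parses the textual form PostgreSQL prints (two hexadecimal numbers separated by a slash) and works out which segment file a position falls in. It assumes the default 16 MB segment size and timeline 1, and the helper names are made up for this example; on a live server, the built-in pg_walfile_name() function is the authoritative way to get the file name.

```python
WAL_SEGMENT_SIZE = 16 * 1024 * 1024  # assumed default; set when the cluster is created


def parse_lsn(text):
    """Turn an LSN such as '16/B374D848' into a single 64-bit integer."""
    high, low = text.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def wal_segment_for(lsn, timeline=1, segment_size=WAL_SEGMENT_SIZE):
    """Return (segment file name, byte offset inside that segment)."""
    segment_number = lsn // segment_size
    segments_per_id = 0x100000000 // segment_size
    name = "%08X%08X%08X" % (
        timeline,
        segment_number // segments_per_id,
        segment_number % segments_per_id,
    )
    return name, lsn % segment_size


def wal_bytes_between(lsn_a, lsn_b):
    """Amount of WAL, in bytes, generated between two textual LSNs."""
    return parse_lsn(lsn_b) - parse_lsn(lsn_a)
```

Subtracting two LSNs this way gives the volume of WAL written between them, which is a handy rough measure of write load or replication lag.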
SQLite ships with WAL mode as an opt-in, switched on with PRAGMA journal_mode=WAL;. By default SQLite uses a rollback journal instead, which works the other way around: it copies the original pages out before overwriting them, so it can put them back after a crash. WAL mode flips this: new changes go to a -wal sidecar file and the main database stays untouched until a checkpoint folds them in. The practical reason to switch is concurrency. In WAL mode, readers do not block the writer and the writer does not block readers, because each reader sees a consistent snapshot (the main file plus whatever was committed to the WAL before its read began) while new writes are appended after that point. There is still only one writer at a time. By default SQLite checkpoints automatically once the WAL reaches 1000 pages, though you can trigger a checkpoint yourself.
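Switching SQLite into WAL mode and forcing a checkpoint takes only a few statements. This sketch uses Python's standard sqlite3 module; the `kv` table and the function name `wal_demo` are hypothetical, and the database path is whatever file you pass in.

```python
import os
import sqlite3


def wal_demo(db_path):
    """Enable WAL mode, write one row, and checkpoint by hand."""
    conn = sqlite3.connect(db_path)
    try:
        # The pragma returns the journal mode that is now in effect.
        mode = conn.execute("PRAGMA journal_mode=WAL;").fetchone()[0]

        conn.execute("CREATE TABLE IF NOT EXISTS kv (k TEXT PRIMARY KEY, v TEXT)")
        conn.execute("INSERT OR REPLACE INTO kv VALUES ('a', '1')")
        conn.commit()

        # The committed change now sits in the sidecar file.
        wal_exists = os.path.exists(db_path + "-wal")

        # Fold the WAL back into the main file and truncate it.
        # The first column of the result is 1 if the checkpoint was blocked.
        busy = conn.execute("PRAGMA wal_checkpoint(TRUNCATE);").fetchone()[0]

        value = conn.execute("SELECT v FROM kv WHERE k = 'a'").fetchone()[0]
        return mode, wal_exists, busy, value
    finally:
        conn.close()
```

The journal mode is stored in the database file, so once it is set to WAL, later connections pick it up without running the pragma again.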
The shared idea across both: writes are cheap and sequential because they go to the log; the expensive, random-access work of updating the real data structures is batched up and done later, in bulk, when it is convenient.
There is a cost to all this, and it is worth naming. Every committed change is written at least twice: once to the log, once to the data file at checkpoint time. This is write amplification, and it is the price of durability. Databases claw some of it back with group commit, batching the fsyncs of several concurrent transactions into a single disk flush, so ten commits arriving at once might cost one physical sync rather than ten. The log is sequential and the batching is generous, which is why the overhead is usually a rounding error against the safety it buys.
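Group commit is easy to see in miniature. The class below is a single-threaded sketch, not a production design: `GroupCommitLog`, `submit` and `flush` are names invented for this example, and a real engine would coordinate waiting transactions with locks or condition variables rather than an explicit flush call.

```python
import os


class GroupCommitLog:
    """Collect commit records and make a whole batch durable with one fsync."""

    def __init__(self, path):
        self._file = open(path, "ab")
        self._pending = []
        self.fsync_count = 0

    def submit(self, record: bytes):
        """Queue a commit record. It is NOT durable until flush() returns."""
        self._pending.append(record)

    def flush(self):
        """Write every queued record, sync once, and return the batch size."""
        if not self._pending:
            return 0
        self._file.write(b"".join(r + b"\n" for r in self._pending))
        self._file.flush()
        os.fsync(self._file.fileno())  # one physical sync for the whole batch
        self.fsync_count += 1
        batch_size = len(self._pending)
        self._pending.clear()
        return batch_size

    def close(self):
        self.flush()
        self._file.close()
```

Ten calls to `submit` followed by one `flush` cost a single sync. The trade is latency: each transaction in the batch has to wait for the shared flush before it can be acknowledged.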
The mental model to keep: the log is the source of truth about what happened, and the data files are a cache of where things currently stand, one that can always be brought up to date by replaying the log from the last checkpoint. Get that backwards and crash recovery stops making sense. Get it right and the power cord becomes a non-event.
FAQ
Is write-ahead logging the same as a transaction log or a redo log?
Does WAL slow down my writes?
Why does my database still lose data on a power cut sometimes?