One of the things that make databases truly powerful is their ability to protect your data even in the face of unexpected failures.
Whether the database server crashes, restarts, or there’s a sudden power outage, you can trust that your committed data won’t simply disappear.
This promise is known as Durability — one of the four essential ACID properties of databases.
But what does it actually take to make a database durable?
In this article, we'll explore three key techniques databases use to ensure durability.
1. Write-Ahead Logging (WAL)
One of the most fundamental techniques databases use to guarantee durability is called Write-Ahead Logging (WAL).
The idea is simple:
Always write changes to a log first, before updating the main data files.
This simple-looking step is powerful:
It gives the database a persistent, chronological record of every change, which can later be replayed to recover the exact state after a crash.
How WAL Ensures Durability (Step-by-Step)
1. Log the Change
Whenever the database needs to perform a change, such as an INSERT, UPDATE, or DELETE, it first creates a detailed record in a sequential, append-only log file called the WAL (or sometimes a commit/redo log).
The record is typically built in an in-memory WAL buffer first, which keeps this step fast and efficient.
Example WAL record:

```
LSN: 0/01000050
Record Type: INSERT
Table: public.users
Page: 2003
Tuple ID: (0, 5)
Inserted Values:
  id = 101
  name = 'Alice'
  email = 'alice@example.com'
Transaction ID: 5001
```
LSN (Log Sequence Number) uniquely identifies the position of the record in the WAL.
Record Type describes the operation (in this case, an INSERT).
Other details specify exactly what changed, where it changed, and under which transaction.
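To make this concrete, here is a minimal sketch in Python of how such a record might be built and serialized. The WALRecord class, its fields, and the newline-delimited JSON encoding are illustrative assumptions; real engines use compact binary formats.

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class WALRecord:
    """Illustrative WAL record; real formats are compact binary, not JSON."""
    lsn: str            # position of this record in the log
    record_type: str    # INSERT / UPDATE / DELETE / ...
    table: str
    page: int
    tuple_id: tuple
    values: dict
    xid: int            # transaction ID

    def encode(self) -> bytes:
        # One newline-delimited JSON entry per change, appended in log order.
        return (json.dumps(asdict(self)) + "\n").encode()

record = WALRecord(
    lsn="0/01000050", record_type="INSERT", table="public.users",
    page=2003, tuple_id=(0, 5),
    values={"id": 101, "name": "Alice", "email": "alice@example.com"},
    xid=5001,
)
```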
2. Flush the Log to Disk
After creating the WAL record in memory, the database flushes it to durable disk storage.
This is done using system calls like fsync(), which force the operating system to push the WAL entry through its caches onto physical storage rather than leaving it buffered in memory.
Only after the WAL record is safely written to disk does the database consider the change durable.
Even if the database server crashes immediately after this step, the change is preserved on disk and can be used for recovery.
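As a minimal sketch, assuming a simple append-only log file (the file layout and helper name are illustrative, not any specific database's implementation):

```python
import os

def append_and_flush(wal_path: str, record: bytes) -> None:
    """Append one WAL record and force it onto durable storage."""
    # O_APPEND guarantees each write lands at the current end of the log.
    fd = os.open(wal_path, os.O_WRONLY | os.O_APPEND | os.O_CREAT, 0o644)
    try:
        os.write(fd, record)
        # fsync() blocks until the OS has pushed the data through its
        # caches to the physical device; this is the durability point.
        os.fsync(fd)
    finally:
        os.close(fd)
```

Real engines keep the log file open and batch records, but the essential call is the same: no fsync(), no durability.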
3. Acknowledge the Commit
Once the WAL record is durably stored on disk, the database sends a "Success" response back to the client.
At this point:
The transaction is officially considered committed and durable.
The client can safely move on, confident that the change will not be lost.
The main data files (such as tables or indexes) may not have been updated yet, but that is acceptable because the WAL contains everything needed to recover the change.
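Putting steps 1 to 3 together, the ordering is the whole contract: nothing is acknowledged until fsync() returns. A hypothetical commit path might look like this (wal_file is an open binary file; all names are illustrative):

```python
import os

def commit(wal_file, records: list[bytes]) -> str:
    """Acknowledge a transaction only after its WAL records are on disk."""
    for rec in records:
        wal_file.write(rec)          # 1. log the change (still buffered)
    wal_file.flush()                 #    push the process buffer to the OS
    os.fsync(wal_file.fileno())      # 2. force the OS to persist it
    return "Success"                 # 3. only now acknowledge the client
    # Note: the main data files have not been touched at this point.
```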
4. Update In-Memory Data Pages
After acknowledging the commit, the database may also update in-memory versions of the affected data pages.
These updated pages, held in buffers, reflect the latest state of the database.
However, they do not need to be immediately flushed to disk, which allows the system to continue performing at high speed.
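A toy buffer pool makes the idea visible; the page layout and dirty-page tracking here are simplified assumptions:

```python
class BufferPool:
    """Toy buffer pool: pages are modified in memory and only marked
    dirty; nothing in this class writes to the data files."""
    def __init__(self) -> None:
        self.pages: dict[int, dict] = {}   # page_id -> {row_id: row}
        self.dirty: set[int] = set()

    def apply_insert(self, page_id: int, row: dict) -> None:
        self.pages.setdefault(page_id, {})[row["id"]] = row
        self.dirty.add(page_id)   # flushed later by a checkpoint or background writer
```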
5. Apply Changes to Data Files (Later)
Writing the actual updated pages to the database’s main data files happens later, typically through background processes like:
Checkpointing
Lazy background flushes
This deferred approach provides important performance benefits:
Multiple updates to the same page can be combined, reducing redundant writes.
Disk writes become more sequential and efficient.
Meanwhile, if a crash occurs before the background processes complete, the WAL still contains all the information needed to restore the database to a consistent state.
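Here is a sketch of a checkpoint under the same assumptions as the earlier snippets (the JSON page format and function shape are hypothetical):

```python
import json
import os

def checkpoint(dirty_pages: dict[int, dict], data_file, wal_file, last_lsn: str) -> None:
    """Flush every dirty page, then record in the WAL that all changes
    up to last_lsn are now reflected in the data files."""
    for page_id in sorted(dirty_pages):     # sorted keys -> more sequential I/O
        line = json.dumps({"page": page_id, "tuples": dirty_pages[page_id]})
        data_file.write(line.encode() + b"\n")
    data_file.flush()
    os.fsync(data_file.fileno())            # the pages themselves are durable now
    wal_file.write(f"CHECKPOINT {last_lsn}\n".encode())
    wal_file.flush()
    os.fsync(wal_file.fileno())             # recovery can now skip older WAL
    dirty_pages.clear()
```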
6. Crash Recovery via WAL
If the database crashes at any point, here's what happens during restart:
The database reads the WAL starting from the last known checkpoint.
It replays any committed transactions that were recorded in the WAL but not yet fully applied to the data files.
This restores the database to the consistent, committed state it was in just before the crash.
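Here is a sketch of that redo pass, reusing the newline-delimited JSON record format assumed above. Real recovery is far more involved (commit records, torn-write detection, undo for in-flight transactions), but the core loop is "replay everything after the last checkpoint":

```python
import json

def recover(wal_path: str) -> dict[int, dict]:
    """Redo pass: rebuild page contents from WAL records written
    after the most recent CHECKPOINT marker."""
    pending = []
    with open(wal_path, "rb") as wal:
        for line in wal:
            if line.startswith(b"CHECKPOINT"):
                pending = []     # earlier records already reached the data files
            else:
                pending.append(json.loads(line))
    pages: dict[int, dict] = {}
    for rec in pending:          # redo committed-but-unapplied changes
        if rec["record_type"] == "INSERT":
            pages.setdefault(rec["page"], {})[rec["values"]["id"]] = rec["values"]
    return pages
```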
Why WAL Also Boosts Performance
WAL doesn’t just make databases more durable; it also boosts performance.
Sequential writes, which simply append to a log file, are much faster than random writes scattered across different data files.
Group commits allow multiple transactions to be flushed together, further reducing disk I/O overhead.
By writing to a fast, sequential log first and delaying slower updates to data files, databases achieve higher throughput without compromising durability.
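As a last sketch, here is group commit reduced to its essence: several transactions' records go to disk under a single fsync(). In a real engine, committers block on a shared flush rather than arriving as a pre-built batch; the batch form below is a simplification:

```python
import os

def group_commit(wal_file, batch: list[bytes]) -> None:
    """Durably commit a whole batch of transactions with one disk sync."""
    for record in batch:
        wal_file.write(record)    # N transactions' records, appended in order
    wal_file.flush()
    os.fsync(wal_file.fileno())   # a single fsync() covers all of them
```

One fsync() per batch instead of one per transaction is where the I/O savings come from.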