Have you ever wondered why modern databases are so fast and efficient, even when managing terabytes of data?
The answer lies in their underlying data structures and indexing techniques that enable efficient storage, retrieval, and management of data.
In this article, we'll look at 10 important data structures that make modern databases fast, reliable, and scalable.
📣 Design, develop and manage distributed software better (Sponsored)
Multiplayer auto-documents your system, from the high-level logical architecture down to the individual components, APIs, dependencies, and environments. Perfect for teams who want to speed up their workflows and consolidate their technical assets.
1. Hash Indexes
A hash index is a data structure that maps keys to values using a hash function.
The hash function converts a key into an integer, which is used as an index in a hash table (buckets) to locate the corresponding value.
It is designed for fast insertion and lookup, such as:
Insert/Find a new record with
id = 123
.
This structure provides O(1) average-time complexity for insertions, deletions, and lookups.
Hash indexes are widely used in key-value stores (eg., DynamoDB), and caching systems (eg., Redis).
2. B-Trees
A B-tree is a self-balancing tree data structure designed to store sorted data in a way that optimizes reads, writes, and queries on large datasets.
It minimizes disk I/O by storing multiple keys in a single node and automatically balances itself during insertions and deletions.
Unlike binary search trees, where each node has at most two children, B-Trees allow multiple children per node. The number of children is defined by the order of the B-Tree.
Internal nodes contain keys and pointers to child nodes and leaf nodes contain keys and pointers to the actual data.
Keys in each node are stored in sorted order, enabling fast binary searches.
B-Trees are widely used for indexing in relational databases (eg., MySQL).
While many NoSQL databases favor LSM Trees for write-heavy workloads, some use B-Trees for read-heavy scenarios or as part of their indexing strategy.
3. Skip Lists
A skip list is a probabilistic data structure that extends the functionality of linked lists by adding multiple levels of "shortcuts" to enable fast search, insertion, and deletion operations.
It is designed to offer performance comparable to balanced binary trees or B-Trees while being simpler to implement and manage.
A skip list consists of multiple levels, with each level being a subset of the level below.
The bottom-most layer contains all the elements in sorted order, and higher layers are sparser subsets that provide shortcuts to quickly navigate the lower layers.
Nodes are promoted to higher levels probabilistically, ensuring an even distribution without the need for rebalancing.
Skip lists are particularly well-suited for in-memory storage and dynamic datasets where updates are frequent.
Redis uses skip lists to implement it’s sorted sets (
ZSET
), enabling fast insertions, deletions, and range queries while maintaining sorted order.
4. Memtables
A memtable is an in-memory data structure used in modern databases to temporarily store write operations before they are flushed to disk.
It plays a critical role in optimizing write performance and ensuring data durability, especially in databases designed for high-throughput workloads, such as Cassandra, RocksDB, and HBase.
Memtables are typically implemented as a sorted structure like a Red-Black Tree or Skip List, enabling efficient lookups and ordered writes to disk.
When data is written to the database: it is logged in the Write-Ahead Log (to persist the change) and added to the memtable.
For recent writes, the memtable is checked first. If the key is not found in the memtable, the query searches on-disk SSTables (Sorted String Table) or other storage files.
When the memtable reaches its size limit, it is flushed to disk as an immutable SSTable and a new memtable is initialized for subsequent writes.
5. SSTables
An SSTable (Sorted String Table) is an immutable, on-disk data structure used in modern databases like Cassandra, RocksDB, and HBase to store sorted key-value pairs.
SSTables are primarily used in Log-Structured Merge Tree (LSM Tree) databases to optimize read and write performance.
They enable efficient sequential writes, fast lookups, and range queries.
Key Characteristics of SSTables:
Once written, an SSTable is never modified. New writes are handled by creating new SSTables.
SSTables are written as sequential blocks to disk, minimizing fragmentation and improving I/O performance.
SSTables include an in-memory index or Bloom filter to quickly determine whether a key might exist without scanning the entire table.
Older SSTables are periodically merged into larger tables, removing duplicates and reclaiming storage space.
6. Inverted Index
An inverted index is a data structure that maps terms (words or tokens) to the documents or locations where they appear.
It is called "inverted" because it reverses the conventional relationship of an index: instead of mapping documents to the terms they contain, it maps terms to the documents that contain them.
This structure allows for optimal handling of full-text searches like:
Find all document which contain the terms: “database powerful“
How Inverted Index is Created:
Tokenization: Text is split into individual tokens (words or terms).
Example: "Database systems are powerful" → ["database", "systems", "are", "powerful"]
Normalization: Tokens are standardized (e.g., lowercased, stemmed, or lemmatized).
Example: "Databases" → "database"
Index Construction and Storage: For each term, a postings list is created or updated with the document ID and metadata (e.g., term frequency, positions).
Inverted indexes are widely used in databases, search engines, and information retrieval systems to enable efficient keyword lookups, Boolean queries, and relevance ranking.
7. Bloom Filters
A Bloom Filter is a space-efficient, probabilistic data structure that answers the question: "Does this element exist in a set?"
It starts as a bit array of size m, initialized with all bits set to 0
. It also requires k
independent hash functions, each of which maps an element to one of the m
positions in the bit array.
To insert an element into the Bloom filter, you pass it through each of the k
hash functions to get k
positions in the bit array. The bits at these positions are set to 1.
To check if an element is in the set, you again pass it through the k
hash functions to get k
positions.
If all the bits at these positions are set to 1, the element is probably in the set (though there's a chance it might be a false positive).
If any bit at these positions is 0, the element is definitely not in the set.
Unlike traditional data structures, it does not store the actual elements, making it extremely memory-efficient.
Bloom filters allow databases to quickly check if a key might exist in a specific data structure (e.g., an SSTable or a database partition). They avoid unnecessary disk lookups in places where the key is guaranteed to be absent.
8. Bitmap Indexes
A bitmap index is a specialized indexing technique that encodes the values of a column as a series of bitmaps, where each bitmap corresponds to a unique value in the column.
Each bit in the bitmap represents whether a row in the dataset contains that value.
Consider a dataset with a column Color
:
In the bitmap index:
Each row corresponds to a bit in the bitmap.
A
1
indicates that the value is present in the row, while a0
indicates its absence.
Bitmap indexes use bitwise operations (AND, OR, NOT, XOR) to efficiently filter data.
Example Query: Find rows with Color = 'Red' OR Color = 'Green'
Retrieve the bitmap for
'Red'
→1 0 0 1 0
Retrieve the bitmap for
'Green'
→0 0 1 0 1
Perform a bitwise OR →
1 0 1 1 1
Bitmap indexes are widely used in data warehouses, columnar databases, and OLAP (Online Analytical Processing) systems for their ability to speed up complex queries like filtering, aggregations, and joins when dealing with large datasets containing low-cardinality columns (columns with few unique values).
9. R-trees
An R-Tree (short for Rectangle Tree) is a tree-based data structure designed for indexing multidimensional spatial data, such as geographic locations, geometric shapes, or bounding boxes.
It is particularly effective for queries involving spatial relationships, such as intersections, containment, and nearest neighbors.
Each node is represented by an Minimum Bounding Rectangle (MBR) that encloses all its child nodes or objects. Leaf nodes store the actual spatial objects.
PostGIS, an extension of PostgreSQL, uses R-Trees to index spatial data for queries like: Find all locations within a rectangular region.
SELECT * FROM locations
WHERE ST_Intersects(
geometry,
ST_MakeEnvelope(-74.02, 40.70, -73.93, 40.80)
);
10. Write-Ahead Logs (WAL)
A Write-Ahead Log is an append-only persistent log file that records all changes made to a database before they are applied to the actual database.
This ensures that even if the system crashes during a write operation, the database can recover to a consistent state by replaying or rolling back these logged changes.
Structure of a WAL Entry:
Transaction ID (TXID): A unique identifier for the transaction.
Operation Type: The action performed (INSERT, UPDATE, DELETE).
Table Name: The table where the operation occurred.
Key: The primary key or unique identifier of the record.
Old Value: The previous value of the record (used in UPDATE and DELETE).
New Value: The new value of the record (used in INSERT and UPDATE).
Timestamp: The time the operation was logged.
Sample WAL Record:
TXID: 1002
Operation: UPDATE
Table: users
Key: id=1
Old Value: {"id": 1, "name": "Alice", "email": "alice@example.com", "age": 30}
New Value: {"id": 1, "name": "Alice", "email": "alice.new@example.com", "age": 31}
Timestamp: 2024-11-19T10:05:00Z
By logging changes before they are applied to the main database, WAL enables databases to recover from crashes and maintain ACID (Atomicity, Consistency, Isolation, Durability) properties.
Periodically, the database truncates or archives old log entries after ensuring that the changes are safely written to the main database file.
Thank you so much for reading.
If you found it valuable, hit a like ❤️ and consider subscribing for more such content every week.
If you have any questions or suggestions, leave a comment.
Checkout my Youtube channel for more in-depth content.
Follow me on LinkedIn and X to stay updated.
Checkout my GitHub repositories for free interview preparation resources.
I hope you have a lovely day!
See you soon,
Ashish
Very insightful
Please add audio for the blog!