How to Generate Unique IDs in Distributed Systems

7 Popular Approaches

Nov 14, 2024

Most distributed systems that operate at scale rely on unique IDs.

For example, consider order tracking in e-commerce: each order placed by a customer is assigned a unique ID, allowing the system to track it through every stage—order processing, payment, shipping, and delivery.

But how do we generate these IDs in a way that’s fast, unique, reliable, and scalable?

In this article, we’ll dive into 7 popular approaches for generating unique IDs in distributed systems.

1. UUID (Universally Unique Identifier)

UUIDs, also known as GUIDs (Globally Unique Identifiers), are 128-bit numbers widely used as unique identifiers in distributed systems because they are simple to generate and do not depend on a centralized source.

In this setup, each server can generate unique IDs independently.

UUIDs come in multiple versions:

UUID v1 (Time-Based): Uses timestamp and machine-specific information like the MAC address.
UUID v3 (Name-Based with MD5): Generated by hashing a namespace and name using MD5.
UUID v4 (Random): Uses random values for most bits, providing a high degree of uniqueness.
UUID v5 (Name-Based with SHA-1): Similar to v3 but uses SHA-1, a stronger hash function than MD5.
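
As a rough illustration of how these versions differ, the sketch below generates one of each with Python's standard uuid module. The name "example.com" is only a placeholder input for the name-based versions, not something from a real system.

```python
import uuid

# v1: timestamp plus a node ID (often the MAC address); differs on every call
v1 = uuid.uuid1()

# v3 and v5: deterministic hashes of (namespace, name).
# "example.com" is an arbitrary placeholder name for this sketch.
v3 = uuid.uuid3(uuid.NAMESPACE_DNS, "example.com")
v5 = uuid.uuid5(uuid.NAMESPACE_DNS, "example.com")

# v4: random
v4 = uuid.uuid4()

for label, value in [("v1", v1), ("v3", v3), ("v4", v4), ("v5", v5)]:
    print(f"{label}: {value} (version field = {value.version})")

# Name-based UUIDs are reproducible: the same inputs give the same output
same_again = uuid.uuid5(uuid.NAMESPACE_DNS, "example.com")
print(v5 == same_again)
```

The name-based versions are useful when the same input should always map to the same ID, while v1 and v4 produce a new value on each call.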

The most commonly used version is UUID v4.

Format (UUID v4)

Example: 550e8400-e29b-41d4-a716-446655440000

Randomness (122 bits): Every bit outside the version and variant fields is random, so most of the hexadecimal digits (0–9 or a–f) are random.
Version (4 bits): The third block’s first character is always 4, identifying it as a version 4 UUID.
Variant (2 bits): The first character of the fourth block is always 8, 9, a, or b. Its top two bits mark the UUID as following the layout defined in RFC 4122.
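
To make the layout concrete, here is a small sketch that reads the version and variant fields from the example UUID above. It uses Python's standard uuid module; the bit shifts assume the RFC 4122 field layout described in this section.

```python
import uuid

u = uuid.UUID("550e8400-e29b-41d4-a716-446655440000")

print(u.version)                   # version field, taken from the third block
print(u.variant == uuid.RFC_4122)  # variant field, taken from the fourth block

# The same fields read directly from the 128-bit integer
version_bits = (u.int >> 76) & 0xF   # top 4 bits of the third block
variant_bits = (u.int >> 62) & 0b11  # top 2 bits of the fourth block
print(version_bits, bin(variant_bits))
```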

Code Example (Python)

import uuid

# Generate a random UUID (version 4)
uuid_v4 = uuid.uuid4()
print(f"Generated UUID v4: {uuid_v4}")

Pros:

Decentralized: UUIDs can be generated independently across servers.
Collision Resistance: With 122 random bits, UUID v4 has a collision probability so low it’s practically negligible.
To visualize: Even when generating 1 billion UUIDs per second, it would take about 86 years to reach a 50% chance of a single collision.
Ease of Implementation: Most programming languages provide built-in libraries for generating UUIDs.
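
The collision odds can be estimated with the birthday approximation. This is a minimal sketch, assuming all 122 random bits are uniformly random; the generation rate of one billion UUIDs per second is an illustrative assumption, not a measured figure.

```python
import math

RANDOM_BITS = 122          # random bits in a UUID v4
space = 2 ** RANDOM_BITS   # number of possible v4 values

def collision_probability(n: float) -> float:
    """Approximate probability of at least one collision among n UUIDs."""
    # Birthday approximation: p ≈ 1 - exp(-n^2 / (2 * space))
    return -math.expm1(-(n * n) / (2 * space))

# Number of UUIDs needed for a ~50% chance of at least one collision
n_half = math.sqrt(2 * space * math.log(2))

# Assumed generation rate for illustration: one billion UUIDs per second
rate_per_second = 1e9
years = n_half / rate_per_second / (365.25 * 24 * 3600)

print(f"UUIDs for a 50% collision chance: {n_half:.2e}")
print(f"Years at 1e9 UUIDs/second: {years:.0f}")
print(f"Collision chance after one trillion UUIDs: {collision_probability(10**12):.2e}")
```

The estimate only holds if the random source is sound; a weak or badly seeded generator can collide far sooner.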

Cons:

Large Size: UUIDs consume 128 bits, which can be excessive for some storage-sensitive systems.
Not Sequential: UUID v4 values are random, so new keys land at arbitrary positions in ordered indexes like B-Trees, which leads to page splits and poor cache locality.
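
Both drawbacks are easy to observe. The sketch below, using only Python's standard uuid module, prints the storage size of a UUID and checks whether a batch of freshly generated v4 values happens to be in sorted order. The batch size of 1000 is an arbitrary choice for illustration.

```python
import uuid

ids = [uuid.uuid4() for _ in range(1000)]

# Storage cost: 16 bytes in binary form, 36 characters as canonical text
print(len(ids[0].bytes), len(str(ids[0])))

# Generation order vs. sorted order: random IDs are almost never already
# sorted, so each new key tends to land in the middle of an ordered index
already_sorted = ids == sorted(ids)
print(already_sorted)
```

Storing the 16-byte binary form rather than the 36-character string is a common way to reduce the size overhead.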

UUIDs are ideal when you need globally unique IDs across distributed systems without central coordination and when sort order isn’t important (e.g., order IDs in e-commerce, session IDs for user authentication).

2. Database Auto-Increment
