ID choice is architecture. Pick well and your writes scale. Pick poorly and you fight fragmentation, cache misses, and I/O storms
UUIDs, in plain English
What youâll learn: What UUIDs are, why teams use them, common versions, and the tradeoffs that impact databases
What is a UUID?
UUIDs (universally unique identifiers) are 128âbit IDs designed to be unique without a central counter. You often see them as 36âcharacter strings with hyphens, like 550e8400-e29b-41d4-a716-446655440000
Why teams use UUIDs
- No central coordinator for ID generation across services or regions
- Easy merges across shards, services, or data centers
- Safer in APIs than sequential IDs because they are hard to guess
Common versions
- UUID v1 time + node info (node often derived from a MAC address). Leaks time and machine hints
- UUID v4 mostly random. Excellent uniqueness, poor locality for storage engines
- UUID v6/v7 timeâordered layouts that keep inserts mostly sequential
Realâworld example
Twitter introduced 64âbit Snowflake IDs (time + node + sequence) to avoid central contention while keeping order. The same lesson applies widely today
In short, UUIDs remove coordination, but they can hurt write performance if you store them the wrong way
Why MySQL struggles
What youâll learn: How random keys affect MySQL InnoDBâs clustered B+ tree and increase page splits, I/O, and CPU
InnoDB basics
MySQL InnoDB stores rows in a clustered B+ tree (a balanced search tree). The primary key defines the physical row order
- Internal nodes route by key ranges
- Leaf nodes hold sorted rows on pages
- Pages are typically 16 KB in InnoDB
- Sequential keys append to the rightmost leaf, which stays hot in cache
Why random keys hurt
- Scattered inserts target many pages instead of the hot rightmost page
- More page splits and lower page fill, often near 50% under heavy randomness
- Poor cache locality increases I/O and CPU because data is not clustered together in memory
B+ Tree Impact Visualization
Hereâs how different key types affect MySQL InnoDBâs clustered B+ tree:
flowchart TD
subgraph seq[Auto-Increment Keys Sequential]
A1[Root Node Ranges 1-1000 1001-2000 2001+]
A2[Leaf 1-1000 95% Full]
A3[Leaf 1001-2000 95% Full]
A4[Leaf 2001+ Hot Insert Point Cache Hit]
A1 --> A2
A1 --> A3
A1 --> A4
A5[New Row ID 2156] --> A4
end
subgraph rand[UUID v4 Keys Random]
B1[Root Node Ranges Scattered]
B2[Leaf Block A 50% Full]
B3[Leaf Block B 50% Full]
B4[Leaf Block C 50% Full]
B1 --> B2
B1 --> B3
B1 --> B4
B5[New Random UUID] --> B6[Any Leaf Page Split Cache Miss]
B6 --> B2
B6 --> B3
B6 --> B4
end
classDef goodPerf fill:#e8f5e8,stroke:#2e7d32
classDef badPerf fill:#ffebee,stroke:#c62828
classDef newItem fill:#e1f5fe,stroke:#01579b
class A4,A2,A3 goodPerf
class B6,B2,B3,B4 badPerf
class A5,B5 newItem
Key Insight: Sequential keys keep inserts hot in the rightmost leaf (95% page utilization), while random UUIDs scatter across all leaves causing page splits and fragmentation (50% utilization).
Randomness is great for uniqueness and terrible for locality. UUID v4 as a primary key pushes InnoDB toward constant rebalancing and wasted space
Now that you see the cost of random keys, letâs make UUIDs behave better in storage and order
Store and order UUIDs
What youâll learn: How to store UUIDs compactly and keep inserts mostly sequential using ordered variants
Store compactly
- Avoid
CHAR(36)for primary keys - Use
BINARY(16)and convert at the application edges
Example schema and conversions
-- 1) Binary storage
CREATE TABLE users (
id BINARY(16) PRIMARY KEY,
email VARCHAR(255) NOT NULL UNIQUE,
created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP
);
-- 2) Insert with UUIDv7 from app
INSERT INTO users (id, email) VALUES (UUID_TO_BIN('018f6e6a-4b6a-7c3f-b2d0-9f7c6df4a8a1'), '[email protected]');
-- 3) Read as string when you need it
SELECT BIN_TO_UUID(id) AS id, email FROM users;
If you are on UUID v1
Use UUID_TO_BIN(uuid, 1) to swap the time bits forward for better locality, and read with BIN_TO_UUID(col, 1) to reverse it
-- v1 optimization
INSERT INTO orders (id, total)
VALUES (UUID_TO_BIN('d211ca18-d389-11ee-a506-0242ac120002', 1), 42.00);
SELECT BIN_TO_UUID(id, 1) AS id, total FROM orders;
Prefer ordered generators
- UUID v7 millisecond timestamp prefix plus randomness. Great InnoDB locality with standard UUID format
- UUID v6 similar timeâordered layout using rearranged v1 semantics
Size reality check
| Representation | Bytes | Orderable |
|---|---|---|
| INT autoâinc | 4 | Yes |
| BIGINT autoâinc | 8 | Yes |
| UUID in BINARY(16) | 16 | With v6/v7 |
One change saves space, the other saves your write path. Use compact storage and ordered IDs together
Guideline: Use BINARY(16) plus UUID v7 for MySQL InnoDB. This combination reduces index size and minimizes page splits under write load
ID alternatives
What youâll learn: The tradeoffs among autoâincrement, UUID v4, UUID v7, ULID, Snowflake, and NanoID
ID Scheme Comparison Matrix
flowchart LR
subgraph legend[ID Scheme Performance vs Size]
L1[Excellent Performance Small Size]
L2[Good Performance Medium Size]
L3[Poor Performance Large Size]
end
subgraph schemes[ID Schemes Positioned by Trade-offs]
A[Auto-increment 4-8 bytes Excellent]
S[Snowflake 8 bytes Excellent]
U7[UUID v7 16 bytes Good]
UL[ULID 16 bytes Good]
U4[UUID v4 16 bytes Poor]
N[NanoID 16 bytes Poor]
end
classDef excellent fill:#e8f5e8,stroke:#2e7d32
classDef good fill:#fff3e0,stroke:#ef6c00
classDef poor fill:#ffebee,stroke:#c62828
class A,S excellent
class U7,UL good
class U4,N poor
| Scheme | Size | Orderable | MySQL Performance |
|---|---|---|---|
| Autoâinc INT/BIGINT | 4/8 B | â Yes | âââ Excellent |
| UUID v7 | 16 B | â Yes | ââ Good |
| ULID | 16 B | â Yes | ââ Good |
| Snowflake | 8 B | â Yes | âââ Excellent |
| UUID v4 | 16 B | â No | â Poor |
| NanoID | ~16 B | â No | â Poor |
Notes per scheme
- Autoâinc small and fast, but merging shards is painful and IDs are easy to enumerate
- UUID v4 no coordination and good privacy, but causes fragmentation and index bloat in InnoDB
- UUID v7 timeâordered with great locality, stays at 16 bytes everywhere
- ULID lexicographically sortable base32 string (26 characters). Millisecond buckets can collide under extreme bursty traffic
- Snowflake tiny, sortable, and fast. Requires node IDs and clock discipline
- NanoID short, URLâsafe strings, but no natural order and v4âlike locality issues
Rule of thumb: default to UUID v7 for distributed SQL. Choose Snowflake when you need compact 64âbit IDs and can manage node clocks
Handy snippets
-- Virtual column for readable IDs without storing 36 chars
ALTER TABLE users
ADD COLUMN id_str VARCHAR(36) GENERATED ALWAYS AS (BIN_TO_UUID(id)) VIRTUAL;
# Pseudo Ruby for UUID v7 (use a wellâtested library in production)
def uuid_v7
ts = (Time.now.to_f * 1000).to_i # 48âbit ms
rand_a = SecureRandom.random_number(1 << 12)
rand_b = SecureRandom.random_number(1 << 62)
# pack bits per spec, then hex with hyphens
end
With options in mind, letâs choose a default and outline a safe migration path
Choose and migrate
What youâll learn: Practical defaults for new apps and a stepâbyâstep path to migrate existing tables
Recommended plans
- New singleâDB app use BIGINT autoâinc. Add a unique external ID later if you expose records
- New distributed app use UUID v7 in
BINARY(16). Keep string conversions at the application edge - Existing UUID v4 tables convert to
BINARY(16)first. Put new tables on UUID v7 and keep old tables stable - If stuck on UUID v1 insert with
UUID_TO_BIN(uuid, 1)and exposeBIN_TO_UUID(col, 1)for reads - Secondary indexes audit each one and drop dead weight. Remember every secondary index stores the primary key again
Minimal migration example
-- Step 1: add binary column alongside old char PK
ALTER TABLE events ADD COLUMN id_bin BINARY(16) NULL;
-- Step 2: backfill in chunks
UPDATE events SET id_bin = UUID_TO_BIN(id_char) WHERE id_bin IS NULL LIMIT 10000;
-- Step 3: switch PK (requires downtime or orchestrated swap)
ALTER TABLE events DROP PRIMARY KEY, ADD PRIMARY KEY (id_bin);
-- Step 4: optional remove old char column after apps switch
ALTER TABLE events DROP COLUMN id_char;
Quick checklist
- Use
BINARY(16) - Prefer UUID v7 or v6
- Prune secondary indexes
- Keep IDs opaque at the API boundary
TL;DR: For MySQL InnoDB, use UUID v7 in BINARY(16), or BIGINT autoâinc if you do not need distributed generation. Avoid CHAR(36) and random UUID v4 as primary keys
Related Posts
n8n + Supabase/Postgres: Upserts, Rate Limits, Webhooks
A pragmatic guide to n8n with Supabase/Postgres: bulk upserts with ON CONFLICT, rateâlimitâsafe batching, dedupe keys, ordered ULID/UUID v7 IDs, and webhook streaming.
What Is Multitenancy? Rails Examples for SaaS
Clear definition of multitenancy vs user management, two patterns (row-level vs database-per-tenant), plus Rails examples using a SchoolâCourseâLesson ERD and copyâpaste client answers.