ksuid.net

xid generator

Situational

Property        Value
Bit Length      96
Output Length   20 chars
Encoding        base32hex
Sortable        Yes
Timestamped     Yes
Monotonic       Yes
Crypto Random   No

XID is a globally unique, sortable identifier format created by Olivier Poitrey. It produces compact, 20-character strings that behave the same way across programming languages and distributed systems. Inspired by MongoDB's ObjectId, XID keeps the same 12-byte structure but encodes it with base32hex instead of hexadecimal, which shortens the string form. The format was originally implemented in Go and has since been ported to JavaScript, Python, Java, Ruby, Rust, and other languages, which makes it practical in mixed-language systems.

XID was built to solve a specific problem: generating unique identifiers in distributed systems without coordination between nodes, while keeping the output as short as possible. At 20 characters, an XID is shorter than a UUID (36 characters in its hyphenated form), a ULID (26 characters), or a KSUID (27 characters). That matters in systems where identifier length directly affects storage costs, URL readability, or bandwidth consumption.

Poitrey developed XID while working on infrastructure that needed to assign unique identifiers across multiple services running on different machines and processes. Rather than relying on a centralized ID allocation service, XID encodes enough contextual information about the generating environment to avoid collisions without any coordination. The machine identifier, process identifier, and atomic counter embedded in each XID make it very unlikely that two different generators produce the same value, even if they generate identifiers within the same second. This design philosophy aligns closely with ObjectId but applies it in a language-agnostic, general-purpose package that is not tied to any specific database system.

How XID Works

Each XID is a 12-byte (96-bit) binary value composed of four distinct fields, arranged so that the most significant bytes represent time, enabling natural chronological sorting.

The first 4 bytes contain a Unix timestamp measured in seconds since January 1, 1970. This 32-bit timestamp provides coverage until February 7, 2106, which is the overflow date for unsigned 32-bit Unix time. Because the timestamp occupies the leading bytes, XIDs generated at different times will sort in chronological order when compared as byte arrays or as their base32hex string representations. The second-level granularity is coarser than the millisecond timestamps found in ULID or UUID v7, but it is sufficient for most application workloads where sub-second ordering is not critical.
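As a rough illustration of the timestamp field, the sketch below writes a date as a 4-byte, big-endian count of seconds and computes the overflow date mentioned above. The function name `timestampBytes` is hypothetical and is not taken from any XID library.

```javascript
// Sketch: the 4-byte, big-endian seconds timestamp that leads an XID.
// timestampBytes is a hypothetical helper name used only for illustration.
function timestampBytes(date) {
  const seconds = Math.floor(date.getTime() / 1000);
  if (seconds < 0 || seconds > 0xffffffff) {
    throw new RangeError('timestamp does not fit in 32 unsigned bits');
  }
  const out = new Uint8Array(4);
  new DataView(out.buffer).setUint32(0, seconds, false); // false = big-endian
  return out;
}

// The first instant that no longer fits: 2^32 seconds after the Unix epoch.
const overflow = new Date(2 ** 32 * 1000);
console.log(overflow.toISOString()); // 2106-02-07T06:28:16.000Z
```

Big-endian byte order is what makes byte-wise comparison agree with chronological order: the most significant byte of the timestamp is compared first.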

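The next 3 bytes contain a machine identifier, derived by hashing a host identifier (typically the system's hostname) and taking the first three bytes of the digest. The original Go implementation used MD5 for this step. Three bytes give about 16.7 million (2^24) possible values. Because the value is a truncated hash, two hosts can end up with the same machine identifier, and in fleets of several thousand hosts a shared value becomes likely; uniqueness then rests on the process identifier and counter fields. The machine identifier is computed once at process startup and remains constant for the lifetime of the process.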
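A minimal sketch of that derivation for Node.js, assuming MD5 over the hostname. The function name `machineId` is hypothetical, and real implementations may hash a different host identifier or use a different hash function.

```javascript
const { createHash } = require('crypto');
const os = require('os');

// Sketch: first three bytes of the MD5 digest of a hostname.
// machineId is a hypothetical name; MD5 here only spreads values, it is not a security measure.
function machineId(hostname) {
  const digest = createHash('md5').update(hostname, 'utf8').digest();
  return Uint8Array.from(digest.subarray(0, 3));
}

// Computed once at startup and reused for every ID the process generates.
const thisMachine = machineId(os.hostname());
console.log(thisMachine.length); // 3
```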

The following 2 bytes contain a process identifier, typically derived from the operating system's process ID (PID). This field distinguishes between multiple XID generators running on the same machine, such as separate worker processes spawned by a process manager. Combined with the machine identifier, these 5 bytes of contextual information create a unique namespace for each generator instance.

The final 3 bytes contain a counter that is initialized to a random value at process startup and atomically incremented by one for each new XID. It can number up to 16,777,216 (2^24) identifiers per second per process before it wraps around and starts repeating values. The random starting point makes it unlikely that two processes sharing the same machine and process identifiers produce identical counter sequences within the same second.
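Putting the four fields together, a generator could be sketched roughly as follows. The names `createGenerator` and `initialCounter` are hypothetical, `Math.random` stands in for whatever random source a real library uses, and JavaScript's single-threaded execution replaces the atomic increment needed in multi-threaded runtimes.

```javascript
// Sketch of the 12-byte layout: 4-byte time, 3-byte machine, 2-byte PID, 3-byte counter.
// createGenerator and its option names are hypothetical, not a real library API.
function createGenerator({ machineId, pid, initialCounter }) {
  // Random starting point unless the caller pins one (handy for tests).
  let counter = initialCounter ?? Math.floor(Math.random() * 0x1000000);

  return function next(nowMs = Date.now()) {
    counter = (counter + 1) & 0xffffff;      // wrap around at 2^24
    const id = new Uint8Array(12);
    const view = new DataView(id.buffer);
    view.setUint32(0, Math.floor(nowMs / 1000), false); // bytes 0-3: seconds, big-endian
    id.set(machineId.subarray(0, 3), 4);     // bytes 4-6: machine identifier
    view.setUint16(7, pid & 0xffff, false);  // bytes 7-8: low 16 bits of the PID
    id[9] = (counter >>> 16) & 0xff;         // bytes 9-11: counter, big-endian
    id[10] = (counter >>> 8) & 0xff;
    id[11] = counter & 0xff;
    return id;
  };
}

const next = createGenerator({ machineId: Uint8Array.of(0x60, 0xf4, 0x86), pid: process.pid });
console.log(next().length); // 12
```

Keeping only the low 16 bits of the PID is a consequence of the 2-byte field: operating systems that allow larger PIDs will have distinct processes that map to the same field value.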

The combined 12-byte value is encoded using base32hex (RFC 4648), which maps each 5-bit group to one character. XID uses the lowercase form of the alphabet, 0-9 followed by a-v. The 96 bits are padded with four zero bits to fill 20 groups, so the result is a fixed-length, 20-character string with no padding characters. It can be used in URLs, filenames, HTTP headers, and command-line arguments without escaping. Treat XID strings as lowercase-only unless a given library documents case-insensitive parsing. The base32hex alphabet also preserves lexicographic sort order, meaning that string comparison of XID values produces the same result as byte-level comparison of the underlying binary data.
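The encoding step can be sketched with BigInt arithmetic as below. `encodeXid` is a hypothetical name, and the sample bytes are simply an example value whose encoding was worked out by hand.

```javascript
// Lowercase base32hex alphabet: value 0 maps to '0', value 31 maps to 'v'.
const ALPHABET = '0123456789abcdefghijklmnopqrstuv';

// Sketch: encode a 12-byte value as a 20-character base32hex string.
function encodeXid(bytes) {
  if (bytes.length !== 12) throw new RangeError('an XID is exactly 12 bytes');
  let bits = 0n;
  for (const b of bytes) bits = (bits << 8n) | BigInt(b);
  bits <<= 4n; // pad 96 bits to 100 so they split into 20 groups of 5
  let out = '';
  for (let i = 0; i < 20; i++) {
    const shift = BigInt(5 * (19 - i));
    out += ALPHABET[Number((bits >> shift) & 31n)];
  }
  return out;
}

const sample = Uint8Array.of(0x4d, 0x88, 0xe1, 0x5b, 0x60, 0xf4, 0x86, 0xe4, 0x28, 0x41, 0x2d, 0xc9);
console.log(encodeXid(sample)); // 9m4e2mr0ui3e8a215n4g
```

Because the alphabet is in ascending ASCII order and every string has the same length, sorting the strings gives the same order as sorting the raw bytes.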

The generation process involves no network calls, no disk I/O, and no locking beyond an atomic increment on the counter. This makes XID generation extremely fast. The deterministic structure also means that the timestamp, machine ID, process ID, and counter can be extracted from any XID by reversing the encoding.
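Reversing the encoding to recover the fields might look like the sketch below. `parseXid` is a hypothetical name, and accepting only lowercase input is an assumption of this sketch rather than a documented rule of any particular library.

```javascript
const ALPHABET = '0123456789abcdefghijklmnopqrstuv';

// Sketch: decode a 20-character XID string and split it into its four fields.
function parseXid(text) {
  if (!/^[0-9a-v]{20}$/.test(text)) {
    throw new Error('expected 20 lowercase base32hex characters');
  }
  let bits = 0n;
  for (const ch of text) bits = (bits << 5n) | BigInt(ALPHABET.indexOf(ch));
  // 20 characters carry 100 bits; the lowest 4 are padding and must be zero.
  if ((bits & 0xfn) !== 0n) throw new Error('non-zero padding bits');
  bits >>= 4n;
  const bytes = new Uint8Array(12);
  for (let i = 11; i >= 0; i--) {
    bytes[i] = Number(bits & 0xffn);
    bits >>= 8n;
  }
  const view = new DataView(bytes.buffer);
  return {
    time: new Date(view.getUint32(0, false) * 1000), // seconds to milliseconds
    machine: bytes.slice(4, 7),
    pid: view.getUint16(7, false),
    counter: (bytes[9] << 16) | (bytes[10] << 8) | bytes[11],
  };
}

const fields = parseXid('9m4e2mr0ui3e8a215n4g');
console.log(fields.pid, fields.counter); // 58408 4271561
```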

Use Cases

Microservice request tracing. In distributed architectures where requests flow through multiple services, each hop can generate an XID to tag its processing step. The 20-character format is compact enough to include in HTTP headers, log lines, and trace spans without significant overhead. The embedded timestamp allows operators to reconstruct the approximate timeline of a request by inspecting the trace IDs. The machine and process identifiers provide additional diagnostic context, revealing which instance handled each step.

Database primary keys in space-constrained environments. When storage efficiency is a priority, XID's 12-byte binary representation is smaller than UUID's 16 bytes or KSUID's 20 bytes. In tables with hundreds of millions of rows, the smaller keys reduce index size, which can improve query performance and memory usage. Because new XIDs sort after older ones, inserts land near the end of the index, which reduces B-tree page splits. For systems that store identifiers as strings, the 20-character XID is a little over half the length of a 36-character UUID.
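As a back-of-the-envelope check on those sizes, the sketch below multiplies key width by row count. The 500 million row figure is a hypothetical example, and the result covers raw key bytes only, ignoring per-row and per-page overhead that varies by database.

```javascript
// Sketch: raw bytes occupied by the key column alone (no index or row overhead).
function rawKeyBytes(rows, bytesPerKey) {
  return rows * bytesPerKey;
}

const rows = 500_000_000; // hypothetical table size
for (const [name, size] of Object.entries({ XID: 12, UUID: 16, KSUID: 20 })) {
  const gib = rawKeyBytes(rows, size) / 2 ** 30;
  console.log(`${name}: ${gib.toFixed(2)} GiB of raw key data`);
}
// XID: 5.59 GiB, UUID: 7.45 GiB, KSUID: 9.31 GiB
```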

Log aggregation and event tagging. Systems that ingest high volumes of log entries benefit from XID's combination of uniqueness and compactness. Each log entry tagged with an XID can be deduplicated, sorted chronologically, and correlated across services without ambiguity. The fixed 20-character length simplifies log parsing rules and keeps log line lengths predictable, which matters for systems with per-line size limits or fixed-width storage formats.

Cross-language distributed systems. XID's broad language support makes it an excellent choice for polyglot architectures where different services are written in different languages. A Go service, a Python worker, and a Java API can all generate and parse XIDs using native libraries that produce identical output for the same inputs.

Comparison with Alternatives

XID and ObjectId share the same structural heritage: both use a 4-byte timestamp, a machine or random identifier, a process-related field, and a counter. The key differences are encoding and ecosystem coupling. ObjectId uses hexadecimal encoding, producing a 24-character string, while XID uses base32hex for a 20-character string. ObjectId is tightly integrated with MongoDB and BSON, whereas XID is a standalone format with no database dependency. For MongoDB applications, ObjectId is the natural choice. For general-purpose distributed systems, XID is more portable.

Compared to KSUID, XID trades randomness for contextual identifiers. KSUID uses a 4-byte timestamp followed by 16 bytes of cryptographic randomness, producing a 27-character Base62 string. This makes KSUIDs larger but eliminates any dependency on machine or process identity, which is advantageous in containerized environments where hostnames and PIDs are recycled frequently. Teams running on ephemeral infrastructure may prefer KSUID's random approach, while teams on stable, long-running infrastructure will benefit from XID's smaller footprint.

ULID offers a 128-bit identifier with millisecond timestamp precision and 80 bits of cryptographic randomness, encoded as a 26-character Crockford Base32 string. ULID provides finer time resolution and a larger random space than XID, at the cost of a longer output. ULID also lacks the machine and process identifiers that XID embeds, relying entirely on randomness for uniqueness. For applications that need sub-second ordering or maximum collision resistance, ULID is the stronger choice. For applications prioritizing minimal identifier length with stable infrastructure, XID offers a more compact solution.
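The string lengths quoted in this comparison follow from the bit width and the size of the encoding alphabet. The sketch below finds the smallest number of digits whose capacity covers every value of a given width; `encodedLength` is a hypothetical helper name.

```javascript
// Sketch: smallest digit count n such that base^n >= 2^bits.
// BigInt keeps the comparison exact for 96- to 160-bit values.
function encodedLength(bits, base) {
  const target = 2n ** BigInt(bits);
  let capacity = 1n;
  let digits = 0;
  while (capacity < target) {
    capacity *= BigInt(base);
    digits++;
  }
  return digits;
}

console.log(encodedLength(96, 32));  // 20  XID (base32hex)
console.log(encodedLength(96, 16));  // 24  ObjectId (hexadecimal)
console.log(encodedLength(128, 32)); // 26  ULID (Crockford Base32)
console.log(encodedLength(160, 62)); // 27  KSUID (Base62)
```

A UUID's 36 characters come from 32 hexadecimal digits for its 128 bits plus four hyphens.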

Code Examples

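// Assumes the xid-js npm package, which is understood to expose a next()
// function returning a new XID string; check your installed version.
import Xid from 'xid-js';

const id = Xid.next();
console.log(id); // prints a 20-character XID string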

Frequently Asked Questions

What is XID?

XID is a globally unique, sortable ID generator that produces compact 12-byte (96-bit) identifiers. It encodes a 4-byte timestamp, a 5-byte machine/process identifier, and a 3-byte incrementing counter into a 20-character base32hex string, offering a balance between compactness and uniqueness across distributed systems.

How does XID compare to MongoDB ObjectId?

XID is directly inspired by MongoDB ObjectId and shares a very similar structure: both use 12 bytes containing a timestamp, machine identifier, and counter. The key difference is that XID uses a more compact base32hex encoding (20 characters) compared to ObjectId hex encoding (24 characters), and XID is designed as a standalone library usable outside of MongoDB.

Is XID sortable?

Yes, XIDs are naturally sortable by creation time. The first 4 bytes store a Unix timestamp with second-level precision, so XIDs generated later will sort after those generated earlier. Within the same second, the incrementing counter ensures a consistent ordering across IDs from the same process.

How long is an XID string?

An XID string is exactly 20 characters long, encoded using base32hex. This makes it one of the most compact unique identifier formats available, significantly shorter than UUIDs (36 characters), ULIDs (26 characters), and KSUIDs (27 characters) while still providing strong uniqueness guarantees.

What languages support XID?

XID was originally implemented in Go and has since been ported to many popular languages, including JavaScript/TypeScript, Python, Ruby, Java, Rust, C#, and Elixir. The Go implementation is the reference library, and community-maintained ports ensure consistent ID generation and parsing across polyglot architectures.

© 2024 Carova Labs. All rights reserved