ksuid.net

ObjectId Generator

Legacy

Property        Value
Bit Length      96
Output Length   24 chars
Encoding        hex
Sortable        Yes
Timestamped     Yes
Monotonic       Yes
Crypto Random   No

ObjectId is MongoDB's default unique identifier format, designed specifically for the demands of distributed document databases where multiple nodes must generate globally unique identifiers without any central coordination. First introduced as part of the MongoDB database system and its associated BSON (Binary JSON) specification, ObjectId has been a fundamental component of the MongoDB ecosystem since the database's initial release in 2009. Created by the engineering team at 10gen (later renamed MongoDB, Inc.), the format was engineered to be compact, fast to generate, and roughly sortable by creation time, addressing the practical needs of a database that was built from the ground up for horizontal scaling across commodity hardware.

Each ObjectId is a 12-byte (96-bit) value, typically displayed as a 24-character hexadecimal string such as 507f1f77bcf86cd799439011. This makes ObjectId one of the more compact identifier formats in common use, smaller than a 128-bit UUID while still providing sufficient uniqueness for massive-scale deployments. The format's efficiency is not accidental. In a document database where every record carries its own identifier, saving a few bytes per document translates to meaningful storage and bandwidth savings across billions of documents.
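
As a rough illustration of the relationship between the 24-character string and the 12 underlying bytes, the sketch below checks that a string has the shape of an ObjectId and converts it to raw bytes. The helper names isObjectIdHex and hexToBytes are made up for this example and are not part of any library.

```javascript
// Hypothetical helpers for illustration; not part of the bson package.

// True when the input is exactly 24 hexadecimal characters.
function isObjectIdHex(value) {
  return typeof value === 'string' && /^[0-9a-fA-F]{24}$/.test(value);
}

// Convert a 24-character hex string into its 12 raw bytes.
function hexToBytes(hex) {
  if (!isObjectIdHex(hex)) {
    throw new TypeError('expected a 24-character hexadecimal string');
  }
  const bytes = new Uint8Array(12);
  for (let i = 0; i < 12; i++) {
    // Each pair of hex characters encodes one byte.
    bytes[i] = parseInt(hex.slice(i * 2, i * 2 + 2), 16);
  }
  return bytes;
}

const bytes = hexToBytes('507f1f77bcf86cd799439011');
console.log(bytes.length); // 12
console.log(bytes[0]);     // 80 (0x50)
```

A shape check like this only confirms the format. It says nothing about whether the value was ever issued by a real generator.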

ObjectId has transcended its MongoDB origins and is now used in a variety of contexts outside of MongoDB itself. The BSON specification and its associated libraries are available in virtually every major programming language, making it straightforward to generate and parse ObjectIds in applications that have no connection to MongoDB.

How ObjectId Works

The 12-byte ObjectId is divided into three distinct segments, each serving a specific purpose in ensuring global uniqueness across distributed systems.

The first 4 bytes encode a Unix timestamp representing the ObjectId's creation time in seconds since the Unix epoch (January 1, 1970). This timestamp occupies the most significant bytes of the identifier, which means that ObjectIds created at different times will sort in chronological order when compared as binary values or hexadecimal strings. The second-precision granularity means that all ObjectIds generated within the same second share an identical timestamp prefix, with their relative ordering within that second determined by the subsequent bytes. This timestamp can be extracted programmatically, giving developers the ability to determine when a document was created directly from its identifier without querying a separate field.
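
The timestamp extraction described above can be sketched in a few lines of plain JavaScript. The function name timestampFromObjectId is an assumption made for this example, and the second ID is a made-up value chosen only to show ordering.

```javascript
// Hypothetical helper: read the creation time from an ObjectId hex string.
function timestampFromObjectId(hex) {
  // The first 8 hex characters are the 4-byte big-endian Unix timestamp.
  const seconds = parseInt(hex.slice(0, 8), 16);
  return new Date(seconds * 1000);
}

const created = timestampFromObjectId('507f1f77bcf86cd799439011');
console.log(created.toISOString()); // 2012-10-17T21:13:27.000Z

// Because the timestamp leads, plain string comparison orders IDs by second,
// assuming both strings use the same letter case.
const older = '507f1f77bcf86cd799439011';
const newer = '65a000000000000000000000'; // made-up ID with a later timestamp
console.log(older < newer); // true
```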

The next 5 bytes contain a random value that is unique to the machine and process. In the current ObjectId specification (updated in MongoDB 3.4), these bytes are generated once when the ObjectId generator is initialized, typically at process startup, using a cryptographically secure random number generator. This random value serves as a machine and process discriminator, ensuring that different nodes in a distributed cluster produce distinct ObjectId sequences even when their clocks are perfectly synchronized. In earlier versions of the specification, these 5 bytes were split into a 3-byte machine identifier and a 2-byte process ID, but the modern specification consolidates them into a single random block for improved privacy and simplicity.

The final 3 bytes hold an incrementing counter that starts from a random value at process initialization and increments by one with each new ObjectId generated. This counter is the critical component for ensuring uniqueness within a single second on a single process. With 3 bytes (24 bits), the counter can accommodate up to 16,777,216 unique ObjectIds per second per process before overflowing, which provides ample headroom for even extremely high-throughput applications. The combination of timestamp, random value, and counter means that ObjectId generation requires no network communication, no central authority, and no distributed consensus. Each process can independently generate ObjectIds that are guaranteed to be unique within the practical limits of the format. This property is fundamental to MongoDB's distributed architecture, where mongos routers, replica set members, and client applications can all generate valid ObjectIds without coordination.

When displayed, the raw 12 bytes are encoded as a 24-character lowercase hexadecimal string. This encoding is simple and universally supported, though it is less space-efficient than the base62 or base32 encodings used by some newer formats.
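
The space-efficiency point can be checked with a little arithmetic. The sketch below computes how many characters are needed to represent 96 bits in a given alphabet size. The helper encodedLength is hypothetical, and the base62 figure is a theoretical minimum for 96 bits rather than a description of any particular format.

```javascript
// Smallest number of characters that can represent `bits` bits in `base`.
// Uses BigInt so the comparison is exact.
function encodedLength(bits, base) {
  const limit = 2n ** BigInt(bits); // number of distinct values to cover
  let capacity = 1n;
  let length = 0;
  while (capacity < limit) {
    capacity *= BigInt(base);
    length++;
  }
  return length;
}

console.log(encodedLength(96, 16)); // 24 (hexadecimal, as used by ObjectId)
console.log(encodedLength(96, 32)); // 20 (base32)
console.log(encodedLength(96, 62)); // 17 (base62)
```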

Use Cases

MongoDB document identifiers. The most obvious and widespread use of ObjectId is as the _id field in MongoDB documents. Every MongoDB document requires a unique _id, and when one is not explicitly provided, the driver automatically generates an ObjectId. This default behavior means that billions of MongoDB documents across the world use ObjectId as their primary identifier. The embedded timestamp allows developers to query documents by creation time using the _id field alone, and MongoDB's indexes on _id naturally cluster documents in approximate chronological order.
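
Querying by creation time through _id usually means building boundary ObjectIds for the start and end of a time window. The sketch below shows one way to do that with plain strings. The helper objectIdHexFromDate is an assumption for this example, and the query shown in the trailing comment is illustrative rather than taken from this page.

```javascript
// Hypothetical helper: smallest ObjectId hex string for a given moment.
function objectIdHexFromDate(date) {
  const seconds = Math.floor(date.getTime() / 1000);
  // 8 hex characters of timestamp, then 16 zeros for the random and counter bytes.
  return seconds.toString(16).padStart(8, '0') + '0000000000000000';
}

const start = objectIdHexFromDate(new Date('2024-01-01T00:00:00Z'));
const end = objectIdHexFromDate(new Date('2024-02-01T00:00:00Z'));
console.log(start); // 659200800000000000000000
console.log(end);   // 65badf000000000000000000

// With the bson package, these strings could bound a range query such as:
//   { _id: { $gte: new ObjectId(start), $lt: new ObjectId(end) } }
```

Because the comparison is on the timestamp prefix, the window is accurate only to the second, and it reflects when the ID was generated rather than when the document was written.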

Distributed systems requiring timestamp extraction. ObjectId's embedded timestamp makes it useful in distributed systems where knowing when an identifier was created is valuable but adding a separate timestamp field is undesirable. Log aggregation systems, event collectors, and audit trails can use ObjectId to both identify and temporally locate records. The ability to extract a creation timestamp from the identifier itself eliminates redundant data and simplifies schema design.

Sharded database environments. In horizontally sharded databases, choosing an appropriate shard key is critical for performance. ObjectId's time-based prefix causes new documents to cluster on the same shard (the one responsible for the most recent time range), which can be both an advantage and a drawback depending on the workload pattern. For append-heavy workloads with predominantly recent-time queries, this clustering improves locality. For workloads that need uniform write distribution across shards, ObjectId may need to be combined with other sharding strategies.
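
The clustering effect can be seen in a toy model. The sketch below is not how MongoDB assigns chunks: the shard count, the equal-width time ranges, and the FNV-1a hash are all stand-ins chosen for illustration, and the IDs are made-up values that share one timestamp.

```javascript
// Toy model of shard assignment; MongoDB's real chunk and hash logic differs.
const SHARDS = 4;

// Range-style placement: split the 32-bit timestamp space into equal ranges.
function rangeShard(idHex) {
  const seconds = parseInt(idHex.slice(0, 8), 16);
  return Math.floor(seconds / (2 ** 32 / SHARDS));
}

// Hash-style placement using FNV-1a as a stand-in hash function.
function hashShard(idHex) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < idHex.length; i++) {
    hash ^= idHex.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash % SHARDS;
}

// 1,000 made-up IDs from the same second: same timestamp, incrementing counter.
const ids = Array.from({ length: 1000 }, (_, n) =>
  '65920080' + 'a1b2c3d4e5' + n.toString(16).padStart(6, '0'));

// Count how many IDs each placement function sends to each shard.
const tally = (fn) => ids.reduce((counts, id) => {
  counts[fn(id)]++;
  return counts;
}, new Array(SHARDS).fill(0));

console.log(tally(rangeShard)); // every ID lands in the same range
console.log(tally(hashShard));  // IDs are spread across all four shards
```

The range-style tally shows the hot-shard pattern for new writes, while the hash-style tally shows the trade-off of losing time locality in exchange for a wider spread.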

Cross-platform identifier generation. Because the BSON library is available in virtually every major programming language, ObjectId can be generated and parsed consistently across heterogeneous technology stacks. This cross-platform availability makes it a practical choice for organizations with polyglot architectures where multiple services need to generate compatible identifiers.

Comparison with Alternatives

Compared to XID, ObjectId follows a similar design philosophy: both combine a timestamp with machine-specific and counter-based uniqueness. XID is the same size in binary (12 bytes) but encodes to a 20-character base32 string, compared to ObjectId's 24-character hexadecimal representation. XID uses a 4-byte timestamp, 3-byte machine ID, 2-byte process ID, and 3-byte counter, structurally similar to the older ObjectId specification. XID's base32 encoding produces shorter, case-insensitive strings that are more URL-friendly. For applications outside the MongoDB ecosystem, XID is often a more modern choice, while ObjectId remains the natural default within MongoDB.

Compared to KSUID, ObjectId is more compact (12 bytes versus 20 bytes) but provides less random entropy (the 5-byte random value plus 3-byte counter versus KSUID's 128-bit random payload). KSUID uses base62 encoding to produce a 27-character string, while ObjectId's hexadecimal encoding produces a 24-character string despite being smaller in binary. KSUID offers significantly stronger collision resistance within any given time window, while ObjectId's counter-based approach guarantees strict sequential uniqueness within a single process. For MongoDB-centric applications, ObjectId is the obvious choice. For general-purpose distributed systems, KSUID's larger entropy space and more efficient encoding often make it more attractive.

Compared to UUID v4, ObjectId is more compact (12 bytes versus 16 bytes) and provides embedded timestamp information that UUID v4 lacks entirely. UUID v4 offers 122 bits of random entropy, providing stronger statistical collision resistance, but it carries no temporal information, and because new values land at random positions in the key space it can cause index fragmentation in B-tree databases. ObjectId's time-based prefix provides rough chronological sorting, improving query performance for time-based access patterns. UUID v4 benefits from universal standardization, while ObjectId is most commonly associated with the MongoDB ecosystem.

Code Examples

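// Requires the bson package (for example: npm install bson).
import { ObjectId } from 'bson';
const id = new ObjectId();       // generate a new ObjectId
console.log(id.toHexString());   // print it as a 24-character hex string
console.log(id.getTimestamp());  // creation time as a Date, read from the first 4 bytes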

Frequently Asked Questions

What is a MongoDB ObjectId?

A MongoDB ObjectId is a 12-byte unique identifier used as the default primary key for documents in MongoDB collections. It consists of a 4-byte timestamp, a 5-byte random value unique to the machine and process, and a 3-byte incrementing counter, ensuring global uniqueness without centralized coordination.

How does ObjectId embed timestamps?

The first 4 bytes of an ObjectId encode a Unix timestamp representing the second at which the ID was created. This means you can extract the creation time of any document directly from its ObjectId without storing a separate timestamp field, which is useful for sorting and auditing.

Is ObjectId globally unique?

Yes, ObjectId is designed to be globally unique across machines, processes, and time. The combination of a timestamp, machine-and-process-specific random bytes, and an incrementing counter makes collisions virtually impossible even in large distributed MongoDB deployments.

How long is an ObjectId?

An ObjectId is 12 bytes in its raw binary form and is typically represented as a 24-character hexadecimal string. This compact representation makes it efficient for storage and indexing while remaining human-readable and easy to pass in URLs and APIs.

Can I use ObjectId outside MongoDB?

Yes, ObjectId can be generated and used independently of MongoDB using libraries available in most programming languages. Its compact size, embedded timestamp, and strong uniqueness guarantees make it a practical choice for any application that needs lightweight, sortable unique identifiers.

© 2024 Carova Labs. All rights reserved