Discover ULID: the sortable version of UUID
Photo by Edge2Edge Media on Unsplash
UUID, which stands for the Universally Unique IDentifier, is widely used in the software industry when generating a random identifier. Often, you might need to attach a random identifier when creating an item and still be able through this identifier to sort the item list from the most recently created to the least one or vice-versa.
This is the case when using autoincremented integer identifier, but you can have ID collision if the sequence is corrupted. On the other hand, the randomness of UUID makes it impossible to keep track of which of item creation history; this is where ULID comes in.
ULID stands for Universally Unique Lexicographically Sortable IDentifier. It is a string of 26 characters where the 10 first characters represent the timestamp, and the 16 remainings are generated randomly.
This structure ensures the uniqueness of the string while making it possible to sort them.
There is an implementation for most of the programming languages. You can check out this link to find the library for the programming language you use. We will see how to use it in Javascript using Node.js.
Setup the project
Let's create a Node.js project, configure Typescript and install the ULID Node package:
mkdir node-ulid
yarn init -y
yarn add -D typescript ts-node
yarn tsc --init
yarn add ulid
Let say we have a list of users we want to sort by ID in ascending and descending order.
Create file index.ts and add the code below:
import { ulid } from "ulid";
type User = {
id: string;
name: string;
}
const usersList: User[] = [];
const waitFor = (time: number) => {
return new Promise((resolve) => {
setTimeout(() => resolve(time) , time)
});
}
const insertUsers = async () => {
for (let i = 0; i < 3; i++) {
await waitFor(100); // Simulate latency
usersList.push({ id: ulid(), name: `User ${i + 1}`});
}
}
(async () => {
await insertUsers();
// Sort in ascending order
const sortedUsers = usersList.sort((user1, user2) => user1.id < user2.id ? -1 : 1);
console.log(sortedUsers);
})();
Save and run the file with the command yarn ts-node index.ts
The items are displayed in the order they were inserted, which is normal. Now let's sort in descending order; just replace the sort statement with the one below:
// Sort in descending order
const sortedUsers = usersList.sort((user1, user2) => user1.id > user2.id ? -1 : 1);
Save and run again:
As we can see, there are sorted by the value of the ID.
Random generation at scale
In the previous implementation, we simulated the latency by waiting for 100 milliseconds before creating another user. On a system that handles hundreds of thousands of requests per second, a latency of 100 milliseconds is problematic. Let's assume for each request, we need to generate a ULID; the code will look this:
import { ulid } from "ulid";
type Request = {
id: string;
name: string;
}
const reqsList: Request[] = [];
for (let i = 0; i < 3; i++) {
reqsList.push({ id: ulid(), name: `Request ${i + 1}`});
}
// Sort in descending order
const sortedRequests = reqsList.sort((req1, req2) => req1.id > req2.id ? -1 : 1);
console.log(sortedRequests);
Let's see what happen when we run this code:
The order is not working ?
Keep calm! Let's analyze the value of ULID:
[
{ id: '01FP3BMHAFNKYVY9K03KGK20K3', name: 'Request 2' },
{ id: '01FP3BMHAFJZDW141T1S672R7G', name: 'Request 3' },
{ id: '01FP3BMHAF9MN7VR4AR5XGRQ3S', name: 'Request 1' }
]
If we look at the timestamp section (the 10 first characters), it is the same value for each item: 01FP3BMHAF
. Since the sort is done Lexicographically, the 11th character is used to apply the sorting, which is already the random part. This is why we lost the order.
But why the timestamp generated was the same? It is because the generation of the three items was fast enough to fit in a single millisecond. Does this mean we cannot guarantee uniqueness when the generation happens too fast?
Hopefully, we can! ULID can generate up to 280 unique strings within a millisecond. Here is what the code will look like:
import { monotonicFactory } from "ulid";
type Request = {
id: string;
name: string;
}
const reqsList: Request[] = [];
const ulid = monotonicFactory();
for (let i = 0; i < 3; i++) {
reqsList.push({ id: ulid(), name: `Request ${i + 1}`});
}
// Sort in descending order
const sortedRequests = reqsList.sort((req1, req2) => req1.id > req2.id ? -1 : 1);
console.log(sortedRequests);
We use another monotonicFactory()
that returns the function to use for string generation.
Now it works as expected!
Behind the scenes, if the same millisecond is detected, the random
component is incremented by 1 bit in the least significant bit position (with carrying).
The generation will fail if you exceed the 280 possible generations.
Domain application
You might wonder in what case ULID is useful? Here are some examples:
- Data partitioning and indexing on NoSQL database
- When working with DynamoDB, you can use ULID as a Sort Key to gain randomness and sorting
- On Relational databases, if you don't want to expose auto-incremented ID to the outside and still have to ability to sort and access items through another column.
- On distributed systems, if you want to track the order of a list of events based on their ID.
Additional features of ULID
According to the specification, below are other characteristics of ULID:
- 128-bit compatibility with UUID
- 1.21e+24 unique ULIDs per millisecond
- Lexicographically sortable!
- Canonically encoded as a 26 character string, as opposed to the 36 character UUID
- Uses Crockford's base32 for better efficiency and readability (5 bits per character)
- Case insensitive
- No special characters (URL safe)
- Monotonic sort order (correctly detects and handles the same millisecond)
Wrap up
ULID is an excellent choice to consider if you want to generate a random value that is still sortable.
You can find the code source on the GitHub repository.
Follow me on Twitter or subscribe to my newsletter to not miss the upcoming posts and the tips and tricks I share every week.
Happy to see you soon ?