Overview of globally unique identifiers

Written by Matt Tolman

Published: Dec. 2, 2023

Estimated reading time: 4 min read

Recently I saw that CUID[1], a way to generate unique IDs, had been deprecated due to security concerns. That got me really curious. I had never thought of IDs as being "secure." Instead, I've always treated them as untrusted, public-knowledge values. I've always built access controls around my IDs, even when they were randomly generated GUIDs. So I started looking at a lot of ID generators to see if this concept of "secure IDs" really did exist, what it meant, and what other properties IDs had.

ID Properties

Before we get too far into the IDs themselves, it is useful to talk about the different properties IDs can have. I'll focus on just three properties: security, monotonically increasing, and temporal.

Security

There are two main aspects I found for judging ID security:

How easy is it to predict other IDs?
How much information is leaked by the ID?

Keep in mind that this doesn't mean IDs with low security are bad or that they shouldn't be used in production. In fact, many secure apps do use IDs with "low" security. Instead, consider the term "ID security" as more of a rhetorical device which makes some IDs seem better than they are. "Secure" IDs don't make an application secure. Likewise, "insecure" IDs don't make an application insecure. Rather, taking a holistic approach to security is what makes applications secure. This includes proper access controls, using proper authentication mechanisms, and not trusting user input. A "secure" ID can mask underlying security issues, but it does not fix them.
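To make the access-control point concrete, here is a minimal sketch in Python. Everything in it (the `Document` type, the `DOCUMENTS` store, `read_document`, and `AccessDenied`) is hypothetical and only for illustration. It uses the most predictable IDs possible, sequential integers, and relies on an ownership check rather than on the ID being hard to guess.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    id: int
    owner: str
    body: str


# Hypothetical in-memory store keyed by easily guessable sequential IDs.
DOCUMENTS = {
    1: Document(1, "alice", "alice's notes"),
    2: Document(2, "bob", "bob's notes"),
}


class AccessDenied(Exception):
    """Raised when the requesting user may not read the document."""


def read_document(requesting_user: str, doc_id: int) -> str:
    doc = DOCUMENTS.get(doc_id)
    # Missing and forbidden documents raise the same error, so a caller
    # probing neighbouring IDs cannot tell which IDs exist.
    if doc is None or doc.owner != requesting_user:
        raise AccessDenied(f"{requesting_user!r} cannot read document {doc_id}")
    return doc.body
```

Here alice can guess that document 2 exists, but `read_document("alice", 2)` still fails. Swapping the integers for random IDs would not change that outcome; removing the ownership check would.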

Monotonically Increasing

A monotonically increasing ID is one where sorting the IDs reproduces exactly the order in which they were created.

For instance, the following IDs are monotonically increasing:


1 2 3 4 5 6

So are the following IDs:


1234-abcd 1234-abce 1235-aaaa 1235-aaab

All the above IDs can be sorted to recover their creation order. There are a lot of great benefits to monotonically increasing IDs. They allow programs to reason about the order of events, so a program can know whether event 1 or event 2 happened first. For event queues and logs, this is great. They also provide a useful default sort key. If a user wants to see the newest or oldest items first (e.g. in a timeline), then the database only needs to sort on the ID, which is almost always indexed, even in terribly designed databases.

The downside is that these types of IDs are easy to predict. Also, they can sometimes require extra coordination and synchronization (though that is less common these days).
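As a rough illustration of both the ordering benefit and the coordination cost, here is a small Python sketch. The `MonotonicIdGenerator` class is hypothetical, and it only coordinates threads inside one process; a multi-machine system would need some shared source of truth instead of a local lock.

```python
import threading


class MonotonicIdGenerator:
    """Hypothetical single-process generator of strictly increasing integer IDs."""

    def __init__(self, start: int = 1) -> None:
        self._lock = threading.Lock()
        self._next = start

    def next_id(self) -> int:
        # The lock is the "coordination": without it, two threads could
        # read the same value and hand out duplicate IDs.
        with self._lock:
            value = self._next
            self._next += 1
            return value


gen = MonotonicIdGenerator()
events = [(gen.next_id(), name) for name in ("created", "updated", "deleted")]

# Even if the events arrive out of order, sorting by ID restores creation order.
shuffled = [events[2], events[0], events[1]]
restored = sorted(shuffled)
```

The same sketch shows the predictability downside: anyone who sees ID 3 can guess that IDs 2 and 4 exist or soon will.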

Temporal

Temporal IDs are closely related to monotonically increasing IDs. The difference is that they have a timestamp embedded into them - usually in the highest (most significant) digits of the ID. They aren't guaranteed to be in exact creation order (the order usually breaks down when multiple IDs are created at the same time), but they do often require less coordination.

The biggest benefit to temporal IDs is that they allow searching and sorting based on time of creation just using the ID field. For instance, if your ID is of the format YYYYMMDD-xxxx (where `x` is random), then you can find any IDs created in December 2020 with the query:


SELECT * FROM mytable WHERE id >= '20201201' AND id < '20210101';
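The range query can be tried out with a short Python sketch using the standard library's `sqlite3` module. The sample IDs are made up. The sketch uses a half-open upper bound (`id < '20210101'`), which is an assumption on my part: it keeps matching even if the random suffix contains letters, which sort after `9`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id TEXT PRIMARY KEY)")
conn.executemany(
    "INSERT INTO mytable (id) VALUES (?)",
    [
        ("20201130-a1b2",),  # November: excluded
        ("20201201-0000",),
        ("20201215-zz9q",),
        ("20201231-ffff",),  # suffix sorts after '9999'
        ("20210101-0001",),  # January: excluded
    ],
)

# String comparison on the primary key is enough to select one month.
rows = conn.execute(
    "SELECT id FROM mytable WHERE id >= ? AND id < ? ORDER BY id",
    ("20201201", "20210101"),
).fetchall()
december_ids = [row[0] for row in rows]
conn.close()
```

Because the filter and the sort both use the primary key, no separate creation-date column is needed for this kind of lookup.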

The downside is that these IDs are also generally pretty easy to predict. They also aren't strict on preserving creation order, so there is some fuzziness there - especially as a system scales in the number of records generated. This means that temporal IDs have a looser ordering than monotonically increasing IDs. For most use cases it's fine, though sometimes it can matter.
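The looser ordering is easy to see with the YYYYMMDD-xxxx format from earlier. In this Python sketch, `temporal_id` and `day_of` are hypothetical helpers, and the IDs in `created` use fixed suffixes so the example stays deterministic.

```python
import secrets
from datetime import date


def temporal_id(day: date) -> str:
    """Return a hypothetical YYYYMMDD-xxxx ID with a random 4-hex-digit suffix."""
    return f"{day:%Y%m%d}-{secrets.token_hex(2)}"


def day_of(temporal: str) -> str:
    """Extract the embedded date prefix from a temporal ID."""
    return temporal.split("-", 1)[0]


# IDs listed in the order they were created; the first two share a day.
created = ["20201201-9f3a", "20201201-02bc", "20201202-7d10"]
by_id = sorted(created)

# Sorting keeps the days in order...
days_in_order = [day_of(i) for i in by_id] == sorted(day_of(i) for i in created)
# ...but the two IDs created on the same day have swapped places.
same_as_creation_order = by_id == created
```

Here `days_in_order` is true while `same_as_creation_order` is false. A finer-grained timestamp shrinks the window where this can happen, but does not remove it.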

Categories of IDs

Bibliography

[1] “CUID” GitHub. https://github.com/paralleldrive/cuid (accessed Nov. 14, 2023).
[2] “How to ensure entropy and proper random numbers generation in virtual machines” Exoscale Blog. https://www.exoscale.com/syslog/random-numbers-generation-in-virtual-machines/ (accessed Nov. 30, 2023).
[3] J. Bayer, “Challenges With Randomness In Multi-tenant Linux Container Platforms” VMware Tanzu Blog. https://tanzu.vmware.com/content/blog/challenges-with-randomness-in-multi-tenant-linux-container-platforms (accessed Nov. 30, 2023).
[4] J. Liebow-Feeser, “Randomness 101: LavaRand in Production” The Cloudflare Blog. https://blog.cloudflare.com/randomness-101-lavarand-in-production/ (accessed Nov. 30, 2023).
[5] “Commit task: reference .NET implementation of cuid” GitHub. https://github.com/paralleldrive/cuid/commit/c107cfe746639ec47cb73a46934bd64218430971 (accessed Dec. 1, 2023).
[6] “CUID - Commit: Update README” GitHub. https://github.com/paralleldrive/cuid/commit/e15b0679a2f3d9444021e584f8cfffbc7ee54ce8 (accessed Dec. 1, 2023).
[7] “CUID2 - Commit: Initial Commit” GitHub. https://github.com/paralleldrive/cuid2/commit/39e8e107b31249d8f1f06388816861ca6c3e0f66 (accessed Dec. 2, 2023).