Skip to content

Implemented std.crypto.random.uuid() via std.rand.Random.uuid() #18494

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

jonathanmarvens
Copy link

I implemented this std.crypto.random.uuid() utility in my local instance of the Zig standard library to generate Version 4 RFC 4122 UUIDs and I think it'd be great to have it in the standard library for everyone.

@jonathanmarvens jonathanmarvens changed the title Implemented std.crypto.random.uuid() () via std.rand.Random.uuid() Implemented std.crypto.random.uuid() via std.rand.Random.uuid() Jan 9, 2024
@Rexicon226
Copy link
Contributor

Should it be called uuid4 instead of uuid since there are other ones?
I know python does something like that: here

@jonathanmarvens
Copy link
Author

@Rexicon226

Should it be called uuid4 instead of uuid since there are other ones? I know python does something like that: here

I don't think that's necessary since random implies version 4, so personally, I think it'd be redundant. That said, I don't have strong feelings either way on this one.

@Rexicon226
Copy link
Contributor

Rexicon226 commented Jan 9, 2024

@Rexicon226

Should it be called uuid4 instead of uuid since there are other ones? I know python does something like that: here

I don't think that's necessary since random implies version 4, so personally, I think it'd be redundant. That said, I don't have strong feelings either way on this one.

I see what you mean, I agree. Other ones could be like example: uuid5 would go in std.crypto.hash.Sha1.uuid, which seems reasonable.

@jedisct1
Copy link
Contributor

jedisct1 commented Jan 9, 2024

UUIDv7 is fully random., and UUIDv6 also includes randomness.

Having the ability to compute UUIDs in the standard library can be useful, but it would be better to have them all in the same namespace. And I'm not sure std.crypto is the right namespace for this, as UUIDs are never used in the context of cryptographic constructions, and people who need to use a UUID rather care about the UUID version than what hash function it is using internally.

Maybe an UUID type can be introduced, so that we can properly support different versions, but also other operations such a parsing UUIDs. Google's UUID package may be a good inspiration for that.

@@ -28,6 +28,25 @@ pub const Sfc64 = @import("rand/Sfc64.zig");
pub const RomuTrio = @import("rand/RomuTrio.zig");
pub const ziggurat = @import("rand/ziggurat.zig");

const uuid_hex_table: [256]*const [2:0]u8 = .{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can use std.fmt.bufPrint() and {x:0>2} to format numbers to 0-padded hex strings.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jedisct1 What's the code generation like for std.fmt.bufPrint()? I have a C background, so the lookup table is the most straightforward way I thought to implement this.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jedisct1 Actually, nevermind. I'm going to experiment with it and see for myself. It would certainly be cleaner to use std.fmt.bufPrint() if it generates similar code.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A lookup table is likely to be faster, but in the context of UUID generation, it shouldn't make much of a practical difference. Improving the formatter code with a specialized code path for {x:0<2} could be a better approach, that would avoid redundancy and benefit to more applications.

@jonathanmarvens
Copy link
Author

jonathanmarvens commented Jan 9, 2024

@jedisct1 Hey! I absolutely love your work. Big fan!

UUIDv7 is fully random., and UUIDv6 also includes randomness.

I'm sorry but I don't really agree with this. UUID versions 6 & 7 are very new and still drafts (AFAIK). For all intents and purposes, until they become approved standards-wise and have general adoption, random—as far as UUIDs are concerned—does, in fact, imply version 4. I'm also not sure I'd consider version 7 fully random anyway, but that's not a hill I'm going to die on right now (I'm also not the expert cryptographer in this discussion—haha).

Having the ability to compute UUIDs in the standard library can be useful, but it would be better to have them all in the same namespace. And I'm not sure std.crypto is the right namespace for this, as UUIDs are never used in the context of cryptographic constructions, and people who need to use a UUID rather care about the UUID version than what hash function it is using internally.

Maybe an UUID type can be introduced, so that we can properly support different versions, but also other operations such a parsing UUIDs. Google's UUID package may be a good inspiration for that.

On the contrary, I think std.crypto.random is actually the perfect place for it to encourage the UUID generation via a CSPRNG. Node.js, e.g., has crypto.randomUUID().

I'd say don't let perfect be the enemy of good here: Yes, we could have a dedicated UUID type to generate all types of UUIDs, but since we don't at the moment, I see no downside to having this baked into the standard library, especially given how common the need for random UUIDs is. Eventually, if the UUID type is implemented, this could be deprecated.

@jedisct1
Copy link
Contributor

jedisct1 commented Jan 9, 2024

The standard library should be future-proof. Breaking changes should be avoided especially when we can anticipate these changes. If we introduce UUIDv4, we already know that we will eventually need to add other versions, and that we will need to do more than just create them. So even if we don't implement all versions right now, it makes sense to already have a dedicated namespace or type, that we can incrementally extend.

Different UUID versions spread across different namespace would be very confusing.

@jedisct1
Copy link
Contributor

jedisct1 commented Jan 9, 2024

By the way, ULID is generally a better option.

@jonathanmarvens
Copy link
Author

jonathanmarvens commented Jan 9, 2024

@jedisct1

By the way, ULID is generally a better option.

You know, I've been meaning to adopt ULIDs over UUIDs since I came across them on Hacker News some time ago, but I'm just so used to using UUIDs, I just default to UUID version 4 every time. Thanks for the nudge!

@Luukdegram
Copy link
Contributor

Note that this was previously rejected in #6618 (comment). I think this will need a compelling argument to be accepted into std.

@jedisct1
Copy link
Contributor

jedisct1 commented Jan 9, 2024

Rather than dropping just a random function here, it would indeed make sense to work on a third-party package that properly implements UUIDs (similar to existing Go, Python, Ruby, etc. packages).

Then, once we have something stable and consistent, we may look at merging it, or parts of it into the standard library. A compelling argument is that it will likely be simple code that's easy to maintain, and that the ability to get UUIDs is a very common need. Usage stats of the 3rd-party package could confirm or infirm that.

@jonathanmarvens
Copy link
Author

@Luukdegram

Note that this was previously rejected in #6618 (comment). I think this will need a compelling argument to be accepted into std.

In my humble opinion, that was the wrong decision. I think the compelling argument is that random UUID generation is a very common need. I very much expect some form of it to be in the standard library of any serious language.

E.g., if you want numbers, the uuid npm package still gets over 40M downloads weekly and easily over 90% of the usage of this library is for import { v4 as uuidV4 } from 'uuid' (this is easily verified via GitHub, by the way); this is all despite the fact that Node.js decided to bake crypto.randomUUID() right into the runtime since version 14 (also due to how common the need is).

Ruby has SecureRandom.uuid(), Node.js has crypto.randomUUID(), Python has uuid.uuid4(), JDK has java.util.UUID.randomUUID(), etc.

This is the type of thing I'd rather not rely on random third-party libraries for.

@andrewrk
Copy link
Member

The way I see it, UUIDv4 takes a set of perfectly fine random bits, encodes them inefficiently, and then decorates them with trash. It certainly does not belong in the Random interface, so I will close this now. I think @jedisct1's analysis of the situation is a reasonable roadmap.

@andrewrk andrewrk closed this Mar 14, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants