Allow custom to act as a Fallback Backend #672

Open
wants to merge 2 commits into
base: master
from

Conversation

bushrat011899

Objective

As highlighted in #671, there is room for ergonomic improvements in how getrandom handles unsupported platforms. Since getrandom already has a backend which supports external integration, custom, it would be beneficial to end users to have a way to allow 3rd party dependencies to provide a getrandom backend without modifying RUSTFLAGS.

Solution

  • Added a new feature, first-party-backends-only, which will throw a compiler error if a first party backend is not available for the current target (either due to misconfiguration or lack of support). This is sem-ver compatible with a patch release.
  • Enabled first-party-backends-only by default to preserve the current behaviour, where an unsupported configuration will fail to compile. This is not required for sem-ver compatibility since it would at worst allow compilation where it was already failing. This is done as a conservative guard against a hypothetical supply chain attack.
  • Adjusted the backends.rs cfg_if statement to fall back to custom if no other option is available. Note that custom is still included as the first branch to preserve current behaviour.
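The selection order described in the list above can be modelled as a small pure function. This is only a sketch of the precedence, not the crate's actual cfg_if chain: the `Backend` enum, `select_backend`, and its boolean parameters are hypothetical stand-ins for the compile-time cfg checks.

```rust
/// Which backend the modelled selection logic ends up with.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum Backend {
    /// `custom`, chosen explicitly via `getrandom_backend = "custom"`.
    CustomExplicit,
    /// A backend shipped inside getrandom for the current target.
    FirstParty,
    /// `custom`, reached only because no other branch matched.
    CustomFallback,
}

/// Models the branch order: explicit custom first, then first-party
/// backends, then either a compile error or the custom fallback.
/// In the real crate these would be cfg checks, not runtime booleans.
pub fn select_backend(
    custom_requested: bool,
    first_party_available: bool,
    first_party_backends_only: bool,
) -> Result<Backend, &'static str> {
    if custom_requested {
        Ok(Backend::CustomExplicit)
    } else if first_party_available {
        Ok(Backend::FirstParty)
    } else if first_party_backends_only {
        // Stands in for the compile error on unsupported targets.
        Err("no first-party backend for this target")
    } else {
        Ok(Backend::CustomFallback)
    }
}
```

With `first_party_backends_only` set (the proposed default), the function can never return `CustomFallback`, which mirrors the claim that the default configuration changes nothing.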

Intended Use

This fallback would allow HAL-style crates to include a dependency as a way to provide a fallback backend for getrandom. Since it relies on extern linking, defining multiple fallback backends will cause a compile-time error, as will failing to provide one. Consider the contentious wasm_js backend. In order to work around getrandom's use of RUSTFLAGS to enable that backend, uuid just vendored the backend. This is highly undesirable as an outcome, since there is now an increased risk of uuid containing a bug or vulnerability that is fixed in getrandom but not in their copy.

With this fallback functionality, now rust-random or a 3rd party could publish the wasm_js backend for getrandom as a dedicated crate, allowing users or libraries to activate it as they require. This may be a desirable path forward for more experimental backends anyway, as it would allow iteration without changes to getrandom directly.
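As a rough illustration of the extern-linking mechanism this relies on, the sketch below puts both halves in one file: the side that declares and calls a symbol (the role getrandom plays) and the side that defines it (the role a HAL or dedicated backend crate would play). The symbol name `__example_getrandom_custom`, the `u32` error code standing in for getrandom's error type, and the counting byte pattern are all assumptions made for the example. The pattern is not random data.

```rust
// "getrandom" side: declare a symbol that some crate in the final binary
// must define. If nobody defines it, linking fails; if two crates define
// it, linking also fails.
unsafe extern "Rust" {
    fn __example_getrandom_custom(dest: *mut u8, len: usize) -> Result<(), u32>;
}

/// Safe wrapper, playing the role of the public fill function.
pub fn fill(dest: &mut [u8]) -> Result<(), u32> {
    // SAFETY: the pointer and length come from a valid mutable slice.
    unsafe { __example_getrandom_custom(dest.as_mut_ptr(), dest.len()) }
}

// "backend crate" side: the single definition of the symbol.
mod backend_crate {
    #[unsafe(no_mangle)]
    unsafe extern "Rust" fn __example_getrandom_custom(
        dest: *mut u8,
        len: usize,
    ) -> Result<(), u32> {
        for i in 0..len {
            // Placeholder pattern; a real backend would write entropy here.
            // SAFETY: the caller guarantees `dest` is valid for `len` bytes.
            unsafe { dest.add(i).write(i as u8) };
        }
        Ok(())
    }
}
```

Calling `fill` on a four-byte buffer writes the placeholder pattern through the linked symbol. The `#[unsafe(no_mangle)]` and `unsafe extern` spellings assume a reasonably recent toolchain.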


Notes

  • Crates will need to be updated to disable default features on getrandom to allow this approach to work, but I don't see many difficulties here since it would be in service of improved ergonomics for their consumers.
  • I think there's room to improve the custom backend to allow overriding the u32 and u64 methods, and potentially providing error messages back to the user, but that's a larger breaking change that isn't required for the ergonomic benefits discussed here.

@newpavlov
Member

This looks like a rehash of the custom backend which we had in getrandom v0.2. We intentionally removed it in v0.3. Please read the previous discussion which led to it.

@bushrat011899
Author

This looks like a rehash of the custom backend which we had in getrandom v0.2. We intentionally removed it in v0.3. Please read the previous discussion which led to it.

I have read through that issue, and I see this as distinct: the current custom behaviour is preserved, so custom is still chosen over first-party backends when getrandom_backend = "custom", but it can also act as a fallback without end users having to configure RUSTFLAGS. There is no change in how the custom backend is implemented; it is simply allowed in the fallback position instead of that case being a compiler error.

Is there a specific concern with this approach beyond its similarity to previous versions of this crate?

@newpavlov
Member

newpavlov commented May 23, 2025

Firstly, this feature most certainly should not be enabled by default (it would be practically impossible to disable it). Next, there is a danger of users enabling this feature unconditionally in library crates (e.g. for testing on Web WASM), which could cause cryptic linking errors for downstream users. Finally, enabling an opt-in backend in getrandom would still expose an extern function from a custom implementation crate (e.g. imagine a library author using a wasm-bindgen-based custom backend with downstream users also enabling getrandom_backend="wasm_js"). This is quite possible in practice since IIUC you plan to use it as an alternative for the wasm_js opt-in backend, which also would mean additional confusion for downstream users and an ecosystem split between those who use a "custom" crate and those who rely on the cfg-based opt-in backend.

I guess it could work as one potential option for experimentation. I am interested in hearing @josephlr's opinion on it.

As a minor bikeshedding, I think the feature should be named custom-fallback.

@bushrat011899
Author

Firstly, this feature most certainly should not be enabled by default (it would be practically impossible to disable it).

Assuming by "this feature" you mean support for a fallback implementation, then agreed. This PR explicitly has the fallback disabled by default using a default feature first-party-backends-only. With this feature enabled (which is the default), this PR changes nothing in getrandom.

Next, there is a danger of users enabling this feature unconditionally in library crates (e.g. for testing on Web WASM), which could cause cryptic linking errors for downstream users.

Unlike enabling a feature on getrandom (which is easy to do), this requires explicitly adding either your own implementation or an external dependency. Many libraries already handle this correctly (e.g., Bevy) by using a web or js feature. Regardless, this is a concern for other crates in my opinion, not getrandom. For example, nothing stops a library linking two different versions of llvm-sys, which will also cause a linking compiler error.

I appreciate that Rust does have a problem with conflating Wasm with Web Browser, but I don't believe the current RUSTFLAGS approach will lead to a solution in Cargo or rustc. For example, getrandom still has an std feature which is just as destructive as a js feature to compatibility, and even more pervasive. I believe the language team is thoroughly motivated to solve this problem already.

Finally, enabling an opt-in backend in getrandom would still expose an extern function from a custom implementation crate (e.g. imagine a library author using a wasm-bindgen-based custom backend with downstream users also enabling getrandom_backend="wasm_js"). This is quite possible in practice since IIUC you plan to use it as an alternative for the wasm_js opt-in backend, which also would mean additional confusion for downstream users and an ecosystem split between those who use a "custom" crate and those who rely on the cfg-based opt-in backend.

This would be pretty easy to resolve for the wasm_js case by simply adding not(getrandom_backend="wasm_js") within the hypothetical 3rd party Wasm-in-the-browser backend. Crucially, the choice would remain under the end user's control via the existing RUSTFLAGS method. If the user adds getrandom_backend="wasm_js", then they will always get the wasm_js backend provided by getrandom, no matter what a library crate does. If they don't add it (or forget to), then a library can provide a fallback backend and allow compilation to succeed.
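A minimal sketch of that gating, assuming a hypothetical third-party crate: the `fallback` module and `exports_fallback_symbol` function are invented for illustration, and only the `getrandom_backend="wasm_js"` cfg comes from the discussion. A real crate would put its exported backend function behind the same attribute.

```rust
/// Compiled only when the end user has NOT selected getrandom's own
/// `wasm_js` backend through RUSTFLAGS
/// (`--cfg getrandom_backend="wasm_js"`).
#[cfg(not(getrandom_backend = "wasm_js"))]
pub mod fallback {
    /// Hypothetical marker; a real crate would define its backend
    /// symbol in this module instead.
    pub fn exports_fallback_symbol() -> bool {
        true
    }
}

/// With the cfg set, the fallback definition disappears entirely, so it
/// cannot collide with the backend the user asked for.
#[cfg(getrandom_backend = "wasm_js")]
pub mod fallback {
    pub fn exports_fallback_symbol() -> bool {
        false
    }
}
```

In a build without the cfg, `fallback::exports_fallback_symbol()` returns `true`. Cargo may warn about an unexpected cfg name unless the crate declares it, which is a separate configuration detail not shown here.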

I guess it could work as one potential option for experimentation. I am interested in hearing @josephlr's opinion on it.

I appreciate you being open to discussion here. I know this is a very important crate you're protecting, and I do want to make it clear that I appreciate you engaging in the discussion.

As a minor bikeshedding, I think the feature should be named custom-fallback.

Happy to call this a custom-fallback, but I'm not quite sure what would be renamed here, unless you're proposing adding an extra feature flag to enable the fallback?

@newpavlov
Member

The crate feature should be additive, i.e. enabling the crate feature should enable the custom fallback. As I wrote, it will be practically impossible for you to disable the first-party-backends-only feature since it's likely that a dependency in your project's dependency tree will use default features.

This feature then should be enabled by crates which define the "custom" function. Note that it's not sufficient to just add the crate to your dependency tree because of how crates get linked, see the custom backend section docs for more information. In v0.2 we had the register_custom_getrandom! macro to help with that.

Also add `getrandom_no_fallback` escape hatch for security-concerned final binaries.
@bushrat011899
Author

The crate feature should be additive, i.e. enabling the crate feature should enable the custom fallback. As I wrote, it will be practically impossible for you to disable the first-party-backends-only feature since it's likely that a dependency in your project's dependency tree will use default features.

Ah OK, I see what you were referring to now. I agree; I've inverted the feature, renamed it to custom-fallback as requested, and made sure it is not enabled by default. To provide an escape hatch for binaries that do not want a custom fallback to be used, I have added an extra cfg flag set through RUSTFLAGS, getrandom_no_fallback. Since it is set through RUSTFLAGS rather than as a Cargo feature, it is not held to the additive standard that features are, and it gives binaries a way to assert that only their explicitly chosen backend is used.
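The interaction between the additive feature and the cfg veto can be sketched as follows. The feature name `custom-fallback` and the cfg `getrandom_no_fallback` are taken from the comment above; the constants and the `fallback_permitted` helper are hypothetical and only illustrate the intended truth table.

```rust
/// True when the additive Cargo feature is enabled. Any crate in the
/// dependency tree can turn this on.
pub const CUSTOM_FALLBACK_FEATURE: bool = cfg!(feature = "custom-fallback");

/// True when the final binary passes `--cfg getrandom_no_fallback`
/// through RUSTFLAGS. Only the person building the binary controls this.
pub const NO_FALLBACK_CFG: bool = cfg!(getrandom_no_fallback);

/// The fallback is usable only if some crate opted in AND the binary
/// author did not veto it.
pub const fn fallback_permitted(feature_enabled: bool, no_fallback_cfg: bool) -> bool {
    feature_enabled && !no_fallback_cfg
}

/// The value for the current build.
pub const FALLBACK_PERMITTED: bool =
    fallback_permitted(CUSTOM_FALLBACK_FEATURE, NO_FALLBACK_CFG);
```

The veto wins in every row where it is set, which is why it is expressed as a cfg and not as a feature that a dependency could enable.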

This feature then should be enabled by crates which define the "custom" function. Note that it's not sufficient to just add the crate to your dependency tree because of how crates get linked, see the custom backend section docs for more information. In v0.2 we had the register_custom_getrandom! macro to help with that.

I believe it actually is sufficient to just include the relevant crate in your dependency graph by explicitly declaring the implementing function as extern "Rust" fn rather than just using #[no_mangle] on its own. See this documentation from critical-section. I could be wrong, but if that's required in end-user code I consider that acceptable, since they will likely have a js/web feature they are enabling anyway.

@dhardy
Member

dhardy commented May 27, 2025

To summarize, the main differences from #346 are that with this change:

  • getrandom support for wasm32-unknown-unknown on web would use a custom backend
  • getrandom effectively enables support for a custom backend by default on all unsupported platforms

I see a couple of problems with this approach:

  1. Users will see linker errors. We don't have a good way of fixing this (i.e. providing good lints when users forget to provide a custom backend, or somehow provide two).
  2. Introducing support for currently-unsupported targets (like WASI-P3) will be a behavioural change and potential breaking change (in that a new backend might choose to fail where a custom backend succeeded).

I strongly dislike both points (1) and (2).

@dhardy dhardy closed this May 27, 2025
@dhardy
Member

dhardy commented May 27, 2025

Re-opening since point (2) above is not technically a breaking change. I do dislike the implications however: a new backend might choose to fail where it detects some issue while a custom backend may have succeeded (by returning good random data from an alternative source); this would make a new backend a possible run-time-breaking-change.

@dhardy dhardy reopened this May 27, 2025
@bushrat011899
Author

Users will see linker errors. We don't have a good way of fixing this

Agreed that linker errors are more user-hostile than compile_error! messages; that is a key trade-off of this PR. In my opinion, this is worth it under the assumption that the quantity of errors seen will be reduced, as certain libraries (like Bevy) will be able to align their own web/js features to include the appropriate backend on behalf of the user.

@dhardy
Member

dhardy commented May 29, 2025

In what way is this solution superior to #675?

It is backend-agnostic, but solving the "make getrandom easy to use in the browser" problem requires intermediate crates to take a stance here (not much different from them enabling the wasm_js feature).

It's actually worse, since the getrandom_backend cfg will still allow overriding the backend in #675 but won't with this solution.

@bushrat011899
Author

Honestly, I would consider #675 to be superior to this PR. I'd like to see a PR opened based on the suggestion outlined there before closing this one, but I'd support it.

@briansmith
Contributor

In #675, @newpavlov wrote:

What do you think about [this idea]? I think we could get quite close to the "just works" ideal with the js-sys crate enabling the custom fallback feature and implementing the extern symbol (probably behind a feature gate). It also should resolve the linking issues with unused crates which we had (since items from js-sys are used in Web WASM projects).

A library crate like ring that provides a "js" feature flag would then have to add a dependency on js-sys conditional on its "js" feature. Any crate that "wrongly" enables ring's "js" feature would then be unable to use the cfg getrandom_backend=custom to override the choice of backend because their custom backend would provide the same __getrandom_v03_custom and then there would be undefined behavior and/or linkage failure, right? So it would kind of "solve" the problem for getrandom itself by pushing the problem one level up, but the problem would remain for all the users of getrandom, AFAICT.

Previously, js-sys maintainers did not want to do it, but I think #672 is enough for us to say that we did our part.

If that's the case, I'm skeptical that this would be a path to a solution.

@newpavlov
Member

newpavlov commented Jun 4, 2025

A library crate like ring that provides a "js" feature flag would then have to add a dependency on js-sys conditional on its "js" feature.

Library crates SHOULD NOT provide such features. We explicitly warn against it. So I believe we can ignore such cases. It's a blatant misuse of the feature. Plain and simple.

there would be undefined behavior and/or linkage failure, right?

This is why in the other discussion I suggested that we should use different extern symbols for this feature. This way you would be able to override the backend with the getrandom_backend configuration flag. Your library/app would expose a useless extern symbol in such a scenario, but otherwise it should not result in a linking error or UB.
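A sketch of why a dedicated fallback symbol avoids the collision: the fallback definition stays linked into the binary, but nothing calls it once an opt-in backend is selected. Everything here is hypothetical, including the symbol name `__example_getrandom_fallback`, the `Source` enum, and the constant byte patterns, which are not random data. The runtime `opt_in_selected` flag stands in for what would really be a compile-time cfg, so that both paths can be exercised in one build.

```rust
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum Source {
    OptIn,
    Fallback,
}

// Dedicated fallback symbol, distinct from whatever symbol the explicit
// custom backend uses, so the two can never clash at link time.
unsafe extern "Rust" {
    fn __example_getrandom_fallback(dest: *mut u8, len: usize) -> Result<(), ()>;
}

// A library crate that ships a fallback definition.
mod library_crate {
    #[unsafe(no_mangle)]
    unsafe extern "Rust" fn __example_getrandom_fallback(
        dest: *mut u8,
        len: usize,
    ) -> Result<(), ()> {
        for i in 0..len {
            // SAFETY: the caller guarantees `dest` is valid for `len` bytes.
            unsafe { dest.add(i).write(0xFB) };
        }
        Ok(())
    }
}

/// Stand-in for a backend selected through `getrandom_backend`.
fn opt_in_backend(dest: &mut [u8]) -> Result<(), ()> {
    dest.fill(0x01);
    Ok(())
}

pub fn fill_with(opt_in_selected: bool, dest: &mut [u8]) -> Result<Source, ()> {
    if opt_in_selected {
        // The fallback symbol is still present in the binary, just unused.
        opt_in_backend(dest)?;
        Ok(Source::OptIn)
    } else {
        // SAFETY: the pointer and length come from a valid mutable slice.
        unsafe { __example_getrandom_fallback(dest.as_mut_ptr(), dest.len())? };
        Ok(Source::Fallback)
    }
}
```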

@briansmith
Contributor

Library crates SHOULD NOT provide such features. We explicitly warn against it. So I believe we can ignore such cases. It's a blatant misuse of the feature. Plain and simple.

All right, then it's a non-starter for me.

@newpavlov
Member

newpavlov commented Jun 4, 2025

Why? In my strong opinion, library crates should not select the fundamental entropy source used by the whole application, similarly to how they should not select the global allocator. This choice should be made only by the root crate. And if even you are in favor of such misuse, it only cements my opposition to reviving the old js feature.

The js-sys is a bit special in this regard since it provides the "system" layer for Web WASM, i.e. it's effectively a quasi-std.

@briansmith
Contributor

Why? In my strong opinion, library crates should not select the fundamental entropy source used by the whole application, similarly to how they should not select the global allocator. This choice should be made only by the root crate. And if even you are in favor of such misuse, it only cements my opposition to reviving the old js feature.

Because, rightly or wrongly, I already did it, and I've chosen to remain backward-compatible with what I did, if for no other reason than I don't want to deal with the bug reports that would inevitably result from doing things the "best" way. Like I said in the other issue, I like the getrandom 0.3 way but the backward compatibility breakage doesn't work for me, and also I know from experience that too many users won't figure it out. And again, the situation isn't really your fault; the cfg mechanism should be more usable, we shouldn't have had this decade-long history of using feature flags when cfg was a better choice, and we shouldn't even have to deal with this problem in the first place since nobody should even be using wasm32-unknown-unknown/wasm32v1-none since it just forces all these hacks. So it's totally understandable to me if you want to stick with the current design. For myself, I am going to find a way to avoid breaking compatibility with my old mistake.

The js-sys is a bit special in this regard since it provides the "system" layer for Web WASM, i.e. it's effectively a quasi-std.

I don't think people want to rely on designs that are based on providing a global function, because too many things can go wrong, and we don't really know all the ways it can go wrong because we don't have experience doing it. I tolerate the current situation with "custom" grudgingly and I hope we eventually find a better solution for it.

@briansmith
Contributor

  • Added a new feature, first-party-backends-only, which will throw a compiler error if a first party backend is not available for the current target (either due to misconfiguration or lack of support). This is sem-ver compatible with a patch release.
  • Enabled first-party-backends-only by default to preserve the current behaviour, where an unsupported configuration will fail to compile. This is not required for sem-ver compatibility since it would at worst allow compilation where it was already failing. This is done as a conservative guard against a hypothetical supply chain attack.
  • Adjusted the backends.rs cfg_if statement to fall back to custom if no other option is available. Note that custom is still included as the first branch to preserve current behaviour.

Basically the above design is a way for people to port getrandom to unsupported targets without contributing the port to getrandom. I much prefer the current design that encourages us to all work together on the ports whenever practical.

The contract that getrandom tries to satisfy is that its output is "cryptographically secure" by relying on the operating system to guarantee that. At least, that's what I expect from getrandom as a user. When not(target_os = "none") we generally should be able to have the backend in getrandom and a "fallback" is counter to expectations.

Currently getrandom doesn't really have a good design for target_os=none targets. Such targets need a CSPRNG implemented on top of an entropy source that the HAL provides, and it can only be seeded after that entropy source is properly set up. Ideally we'd have an interface for the HAL to set up the entropy source and for the HAL to provide entropy, and we'd have an interface for a CSPRNG to be plugged in. Maybe in the short term it is worth having a "just do it all for me" plug-in mechanism for target_os=none; note however that the history here has been poor; see the broken ESP-IDF backend.

@newpavlov
Member

newpavlov commented Jun 4, 2025

Ideally we'd have an interface for the HAL to set up the entropy source and for the HAL to provide entropy, and we'd have an interface for a CSPRNG to be plugged in.

But it's exactly what this PR does! Yes, we have to use the extern hack because the language does not provide proper tools for that, but the idea is essentially the same. We don't need to plug a separate CSPRNG since it has to be handled by the entropy source itself (similarly to how getrandom on Linux uses CSPRNG internally).

In other words, the "broken" ESP-IDF target could be handled by a hypothetical getrandom-esp-idf crate which enables the fallback feature and exposes the extern function which under the hood uses a CSPRNG (optimized for the target) seeded with esp_fill_random.
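To make the "CSPRNG seeded from a hardware entropy source" idea concrete, here is a toy sketch that keys a ChaCha20 keystream (RFC 8439 block function) from an entropy callback. It is an unaudited illustration, not a proposal for the hypothetical getrandom-esp-idf crate: `hw_entropy_stub` is a deterministic placeholder for a call such as esp_fill_random, `SeededCsprng` is an invented type, and the sketch omits reseeding, rekeying, and the extern symbol wiring.

```rust
/// Placeholder for a hardware entropy call such as `esp_fill_random`.
/// Deterministic so the sketch runs anywhere; it is NOT random.
pub fn hw_entropy_stub(buf: &mut [u8]) {
    for (i, b) in buf.iter_mut().enumerate() {
        *b = (i as u8).wrapping_mul(37).wrapping_add(11);
    }
}

fn quarter_round(s: &mut [u32; 16], a: usize, b: usize, c: usize, d: usize) {
    s[a] = s[a].wrapping_add(s[b]);
    s[d] = (s[d] ^ s[a]).rotate_left(16);
    s[c] = s[c].wrapping_add(s[d]);
    s[b] = (s[b] ^ s[c]).rotate_left(12);
    s[a] = s[a].wrapping_add(s[b]);
    s[d] = (s[d] ^ s[a]).rotate_left(8);
    s[c] = s[c].wrapping_add(s[d]);
    s[b] = (s[b] ^ s[c]).rotate_left(7);
}

/// One 64-byte ChaCha20 block, using the RFC 8439 state layout:
/// 4 constant words, 8 key words, 1 counter word, 3 nonce words.
pub fn chacha20_block(key: &[u8; 32], counter: u32, nonce: &[u8; 12]) -> [u8; 64] {
    let mut init = [0u32; 16];
    init[0] = 0x6170_7865;
    init[1] = 0x3320_646e;
    init[2] = 0x7962_2d32;
    init[3] = 0x6b20_6574;
    for i in 0..8 {
        init[4 + i] =
            u32::from_le_bytes([key[4 * i], key[4 * i + 1], key[4 * i + 2], key[4 * i + 3]]);
    }
    init[12] = counter;
    for i in 0..3 {
        init[13 + i] = u32::from_le_bytes([
            nonce[4 * i],
            nonce[4 * i + 1],
            nonce[4 * i + 2],
            nonce[4 * i + 3],
        ]);
    }
    let mut work = init;
    for _ in 0..10 {
        // Column rounds.
        quarter_round(&mut work, 0, 4, 8, 12);
        quarter_round(&mut work, 1, 5, 9, 13);
        quarter_round(&mut work, 2, 6, 10, 14);
        quarter_round(&mut work, 3, 7, 11, 15);
        // Diagonal rounds.
        quarter_round(&mut work, 0, 5, 10, 15);
        quarter_round(&mut work, 1, 6, 11, 12);
        quarter_round(&mut work, 2, 7, 8, 13);
        quarter_round(&mut work, 3, 4, 9, 14);
    }
    let mut out = [0u8; 64];
    for i in 0..16 {
        let word = work[i].wrapping_add(init[i]);
        out[4 * i..4 * i + 4].copy_from_slice(&word.to_le_bytes());
    }
    out
}

/// Keystream generator keyed once from an entropy callback.
pub struct SeededCsprng {
    key: [u8; 32],
    counter: u32,
}

impl SeededCsprng {
    pub fn from_entropy(entropy: fn(&mut [u8])) -> Self {
        let mut key = [0u8; 32];
        entropy(&mut key);
        SeededCsprng { key, counter: 0 }
    }

    /// Fills `dest` from successive blocks under a fixed all-zero nonce.
    /// A real design must rekey before the 32-bit counter can wrap.
    pub fn fill(&mut self, dest: &mut [u8]) {
        for chunk in dest.chunks_mut(64) {
            let block = chacha20_block(&self.key, self.counter, &[0u8; 12]);
            self.counter = self.counter.wrapping_add(1);
            chunk.copy_from_slice(&block[..chunk.len()]);
        }
    }
}
```

The quality of the output depends entirely on the entropy callback, which is the point made above about the entropy source needing to be set up correctly first.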

@bushrat011899
Author

A library crate like ring that provides a "js" feature flag would then have to add a dependency on js-sys conditional on its "js" feature.

Library crates SHOULD NOT provide such features. We explicitly warn against it. So I believe we can ignore such cases. It's a blatant misuse of the feature. Plain and simple.

Personally, I want to push back against this. Fundamentally, I see no difference between a js feature and an std feature.

Tangent that isn't related to this PR

Both are viral, both prevent compilation on certain targets, and both give you horrible-to-diagnose error messages (serde, for example, will cause a wall of messages that takes about 30 seconds to finish printing if you enable std on a no_std platform). If getrandom has an std feature, and that's the ecosystem standard (it is), why not keep that consistency? If anything, no_std is even harder, as crates must opt out of std and need to be actively modified to become no_std, whereas every crate starts off as no_js and must opt in via something like wasm-bindgen.

But regardless, this kind of debate is exactly why I think getrandom should have something like this PR to more easily yield the conversation to users and libraries.

Ideally we'd have an interface for the HAL to set up the entropy source and for the HAL to provide entropy, and we'd have an interface for a CSPRNG to be plugged in.

I'd argue that's effectively what this PR does, albeit just by promoting the use of custom (although as others have stated we could use a second extern symbol for the fallback). This is a pretty established design in the Rust ecosystem at this point, with critical-section being the most notable example.

I had an earlier, much more controversial prototype of this PR which set up a whole trait for providing a backend. If that's something desirable I'd be happy to amend this PR to build up that kind of infrastructure again. A benefit to the trait and macro approach used by critical-section is that the trait can have default methods added as a non-breaking change. Currently a getrandom backend must provide arbitrary bytes, and may provide methods for u32 and u64. With a trait, this could be extended to u16, u128, SIMD types, etc. without a major release.
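A minimal sketch of what such a trait could look like, assuming invented names throughout (`Backend`, `Error`, `PatternBackend`); it is not the earlier prototype and omits the registration macro that critical-section pairs with its trait. The point it illustrates is that only `fill` is required, while the integer methods have default bodies that a backend may override.

```rust
/// Stand-in error type so the sketch compiles on its own.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub struct Error;

pub trait Backend {
    /// Required: fill `dest` with random bytes.
    fn fill(&self, dest: &mut [u8]) -> Result<(), Error>;

    /// Optional: defaults to reading four bytes through `fill`.
    fn u32(&self) -> Result<u32, Error> {
        let mut buf = [0u8; 4];
        self.fill(&mut buf)?;
        Ok(u32::from_ne_bytes(buf))
    }

    /// Optional: defaults to reading eight bytes through `fill`.
    fn u64(&self) -> Result<u64, Error> {
        let mut buf = [0u8; 8];
        self.fill(&mut buf)?;
        Ok(u64::from_ne_bytes(buf))
    }

    /// A method like this could be added later without breaking existing
    /// implementors, because it has a default body.
    fn u128(&self) -> Result<u128, Error> {
        let mut buf = [0u8; 16];
        self.fill(&mut buf)?;
        Ok(u128::from_ne_bytes(buf))
    }
}

/// Hypothetical backend that writes a constant byte. NOT random; it only
/// exists to exercise the trait.
pub struct PatternBackend(pub u8);

impl Backend for PatternBackend {
    fn fill(&self, dest: &mut [u8]) -> Result<(), Error> {
        dest.fill(self.0);
        Ok(())
    }

    /// Overrides one default, e.g. for hardware that yields 32-bit words.
    fn u32(&self) -> Result<u32, Error> {
        Ok(u32::from_ne_bytes([self.0; 4]))
    }
}
```

`PatternBackend` overrides `u32` but inherits `u64` and `u128`, so an implementor written before `u128` existed would keep compiling.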

@bushrat011899
Author

I've opened #684 as an alternative to this PR. I would encourage comparisons between this and that, and will happily close either based on consensus.
