john-moffett
Contributor

Replace memset which can still be optimized out as secnonce isn't read later in this function. The API makes it clear the callee is responsible for it, so we need to ensure it's cleared properly.

@real-or-random
Contributor

Replace memset which can still be optimized out as secnonce isn't read later in this function.

Why do you think it can be optimized out? I think it can be optimized out if the compiler can prove that no other code will read it. That seems unlikely, as the call is an API call. But if the compiler can prove this, then it's indeed fine to optimize it out? Or are you saying we should make absolutely sure that it's cleared out, even if the caller won't read it? Perhaps.

@real-or-random
Contributor

cc @jonasnick who wrote this module

Contributor

@jonasnick jonasnick left a comment

Thanks @john-moffett. I think it would be an improvement to use secp256k1_memclear in this case.

But if the compiler can prove this, then it's indeed fine to optimize it out?

I think in this case it still makes sense to clear it to try to avoid having secret data in memory that we don't use anymore. This is also the argument for clearing local secret variables at the end of functions.

@john-moffett john-moffett force-pushed the musig-partial-clear-nonce branch from 10c46ec to fb40a68 Compare September 5, 2025 14:05
@john-moffett
Contributor Author

Why do you think it can be optimized out? I think it can be optimized out if the compiler can prove that no other code will read it. That seems unlikely, as the call is an API call.

I agree that it's unlikely to be optimized out, and I confirmed on my setup (arm64 macOS, Clang 17.0.0) that it wasn't, even with -O3 and full LTO. However, that's not guaranteed, since a different compiler (or a later version) might be able to prove that there are no later reads, so I figured it's best practice to use a wiper that is much less likely to be removed by dead-store elimination (DSE).
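To make the concern concrete, here is a minimal sketch of a wiper that is harder for an optimizer to drop than a bare memset. It is not the library's code: `demo_wipe` and `demo_is_all_zero` are hypothetical names. The GCC/Clang branch follows the memset with an empty asm statement carrying a "memory" clobber, which acts as a compiler barrier; the fallback calls memset through a volatile function pointer. Neither technique is guaranteed by the C standard, so treat this as an illustration of the idea only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of libsecp256k1): zero a buffer in a way
 * that is much less likely to be removed by dead-store elimination. */
static void demo_wipe(void *buf, size_t len) {
    assert(buf != NULL || len == 0);
    if (len == 0) {
        return;
    }
#if defined(__GNUC__)
    memset(buf, 0, len);
    /* Compiler barrier: the empty asm takes buf as an input and clobbers
     * "memory", so the compiler must assume the zeroed bytes may be read. */
    __asm__ __volatile__("" : : "r"(buf) : "memory");
#else
    {
        /* Fallback: the volatile pointer must be loaded at run time, so the
         * compiler cannot easily prove that the call is a plain memset. */
        void *(*volatile const memset_ptr)(void *, int, size_t) = memset;
        memset_ptr(buf, 0, len);
    }
#endif
}

/* Hypothetical check a caller or test might use after the wipe. */
static int demo_is_all_zero(const unsigned char *buf, size_t len) {
    unsigned char acc = 0;
    size_t i;
    for (i = 0; i < len; i++) {
        acc |= buf[i];
    }
    return acc == 0;
}
```

A plain `memset(secret, 0, sizeof(secret))` placed right before a return is the pattern at risk: if the compiler can show that nothing reads `secret` afterwards, it is allowed to treat the store as dead.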

But if the compiler can prove this, then it's indeed fine to optimize it out? Or are you saying we should make absolutely sure that it's cleared out, even if the caller won't read it?

The latter. If they assumed the callee wiped the memory (as is promised in the API comment), they might not bother themselves and it could be left in caller memory.

@john-moffett john-moffett force-pushed the musig-partial-clear-nonce branch from fb40a68 to 312ff27 Compare September 5, 2025 17:10
@john-moffett john-moffett force-pushed the musig-partial-clear-nonce branch 2 times, most recently from 8b771f8 to 54bd42d Compare September 5, 2025 17:52
@jonasnick
Contributor

@john-moffett Can you remove the remaining declassify call as well?

@john-moffett john-moffett marked this pull request as draft September 5, 2025 19:13
@john-moffett john-moffett force-pushed the musig-partial-clear-nonce branch from 54bd42d to 16f62f5 Compare September 6, 2025 13:46
@john-moffett john-moffett force-pushed the musig-partial-clear-nonce branch from 16f62f5 to d872a08 Compare September 7, 2025 14:10
@john-moffett john-moffett marked this pull request as ready for review September 7, 2025 14:11
Contributor

@jonasnick jonasnick left a comment

ACK d872a08

@real-or-random
Contributor

real-or-random commented Sep 8, 2025

I agree it's better to prevent the compiler from optimizing out that store.

I'm not sure if this PR adds another code smell. I know it's me who suggested secp256k1_memclear above. In the rest of the code base, we use secp256k1_memclear to nuke memory contents. This is true here as well, but here we want the additional guarantee that the memory is overwritten by zero bytes. That fits the current implementation but not the documented contract of secp256k1_memclear. An implementation that overwrites everything with 0x42 is equally correct. We simply picked 0x00 because it appears to be faster. I doubt that will change in the future, but I still think we should avoid the contract violation.

What could we do instead?

We could use secp256k1_memclear but rename secp256k1_memclear to secp256k1_memzero and change its docs to guarantee it writes zeros. There's nothing wrong with adding that guarantee, but I don't think the result will be much cleaner if we only do this. We use SECP256K1_CHECKMEM_UNDEFINE inside this function for a reason. If the goal is to nuke a buffer, then there should be no legitimate reason to read the buffer again (as @jonasnick pointed out above). So nuking and zeroing are different concerns, and neither is a strict subset of the other. What we want in this specific case of zeroing the secnonce is simply both.

We could instead call secp256k1_memzero (after renaming) and then call SECP256K1_CHECKMEM_DEFINE. That's totally fine, and that's my suggestion. But we really should call SECP256K1_CHECKMEM_DEFINE inside the function and not only in the tests. If we provide the API guarantee that the secnonce is zeroed, then it's totally legitimate that the API user reads (if only in a test). And we don't want the user to get a Valgrind false positive when doing that!

We could also call secp256k1_memclear and then memset(..., 0, ...). That's also good. It may be a performance hit, but in any case a tiny one. (On Linux, the compiler should not be able to remove the second memset due to the memory barrier. But I wouldn't care too much.)

edit: @john-moffett Sorry that this PR escalated quickly given that the issue is seemingly simple...

@jonasnick
Contributor

If we provide the API guarantee that the secnonce is zeroed, then it's totally legitimate that the API user reads (if only in a test). And we don't want the user to get a Valgrind false positive when doing that!

That's a good point that I hadn't considered.

An alternative to your suggestion is to have both a memclear and memzero function, where the latter differs from the former by guaranteeing that the memory is overwritten with 0s and not calling CHECKMEM_UNDEFINE. We'd clear memory if we never read it again (and therefore do not care what it's being overwritten with) and we zero memory if we care what it's being overwritten with because it might be read again. I think this is cleaner because we wouldn't have to call CHECKMEM_DEFINE in musig_partial_sign (which seems like a code smell in and of itself).

@john-moffett
Contributor Author

john-moffett commented Sep 8, 2025

An alternative to your suggestion is to have both a memclear and memzero function, where the latter differs from the former by guaranteeing that the memory is overwritten with 0s and not calling CHECKMEM_UNDEFINE

I was going to make the same suggestion. But is there any value in just reusing memczero and making it non-elidable in the same way memclear is (though SecureZeroMemory would have to be replaced with something suitable for conditional erase)? As a separate matter, is it worth making memczero non-elidable anyway? Perhaps not given its current usage.

@john-moffett Sorry that this PR escalated quickly given that the issue is seemingly simple..

No problem! I should've been aware of the issue from the start, but I've learned a lot, so I'm happy.

At this point, I think the memclear and (new) memzero separation is the cleanest.

@john-moffett
Contributor Author

I suppose this usage of memczero suffers from the same potential issue I initially raised in this PR, so maybe memczero ought to be hardened? I agree that it's unlikely that any current compiler would actually DSE any of the calls we've been discussing, but is the possibility enough to warrant a change?

@real-or-random
Contributor

real-or-random commented Sep 8, 2025

An alternative to your suggestion is to have both a memclear and memzero function, where the latter differs from the former by guaranteeing that the memory is overwritten with 0s and not calling CHECKMEM_UNDEFINE

Ok, sure, that's another option. Concept ACK.

I was considering that option too, but I didn't mention it because I had the feeling that the existence of two functions may confuse future devs. But now that I think about it, I guess it's fine. Perhaps then we can make it clear from the naming that both variants protect against optimization, e.g., by adding an _explicit (inspired by memzero_explicit in the Linux kernel) or _secure suffix.

But is there any value in just reusing memczero and making it non-elidable in the same way memclear is (though SecureZeroMemory would have to be replaced with something suitable for conditional erase)?

As a separate matter, is it worth making memczero non-elidable anyway? Perhaps not given its current usage.

That's an interesting point you raise.

Regarding its usage: I think it was initially introduced to clear the pubkey when the secret key is invalid, but without leaking the fact that it is. But protecting that fact is already "above and beyond" and just defense-in-depth. And perhaps a bit questionable, though it won't hurt.

But all three of its usages in musig/session_impl.h (two of them wrapped by _musig_secnonce_invalidate) are for clearing secrets, and this is where it gets interesting. In this case, I believe there's a tension between protecting against two kinds of side channels, namely timing/memory access (protecting the secrecy of the flag) vs. someone stealing secrets from a memory dump such as a crash dump (protecting the secrecy of the memory region).
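For readers unfamiliar with the pattern, the following is a rough sketch of a conditional zeroing routine in the spirit of memczero. It is an assumption-laden illustration, not the library's implementation: `demo_memczero` is a hypothetical name, and the flag is assumed to be exactly 0 or 1. The flag is turned into a byte mask so that the same loads and stores happen either way, which is what hides the flag from timing and memory-access observers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sketch: zero buf if flag is 1, leave it unchanged if flag
 * is 0, touching every byte in both cases. */
static void demo_memczero(void *buf, size_t len, int flag) {
    unsigned char *p = (unsigned char *)buf;
    /* Reading the flag through a volatile object discourages the compiler
     * from rewriting the masking below into a branch on the flag. */
    volatile int vflag = flag;
    unsigned char mask;

    assert(flag == 0 || flag == 1);
    /* mask is 0x00 when the flag is 0 and 0xFF when the flag is 1. */
    mask = (unsigned char)(0u - (unsigned int)vflag);
    while (len > 0) {
        /* AND with ~mask: keeps the byte for flag 0, clears it for flag 1. */
        *p &= (unsigned char)~mask;
        p++;
        len--;
    }
}
```

Note that the loop uses ordinary stores. Nothing in this sketch stops a compiler from eliding them if it can prove the buffer is never read again, which is the other half of the tension described above.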

What would a function look like that is both non-elidable and protects the flag? I have no idea, exactly due to platform-specific APIs such as SecureZeroMemory. That's the right way(TM) to clear memory on Windows, and I'm not at all convinced that it's a good idea to replace it, e.g., by a handrolled version. Or by falling back to the "volatile" function pointer trick, which is supposed to be portable, but who knows what compilers will do in the future and what they will do when they encounter "volatile", which pretty much has implementation-defined semantics. The nice thing about SecureZeroMemory is really that we get a guarantee from MSVC not to elide it. So I think memczero should stay entirely separate from memclear and memzero, at least for now.

So if we can't have the best of both worlds, we need to decide what protection we want. Currently, whenever we use memczero, we implicitly favor the protection of the flag, and this seems to be a questionable choice for its usages in musig/session_impl.h (or in general when clearing out secrets). I'd much rather give the attacker that one bit (whether the secret key was invalid) than take the risk of keeping the entire secnonce (or the entire nonce seed) somewhere on the stack. If we believe that this PR is a good idea, i.e., the clearing of the secnonce should better be non-elidable, then I think we should arrive at the same conclusion for the other three uses. And in particular this use here, where the flag is a constant anyway:

secp256k1_musig_secnonce_invalidate(ctx, secnonce, 1);

But this may be a larger discussion, and I'd like to hear @jonasnick's opinion on it. So my suggestion is to leave memczero for a follow-up PR. This PR here could then just add memzero as a variant of memclear, perhaps add a suffix to the names, and use it in _musig_partial_sign.

@john-moffett john-moffett force-pushed the musig-partial-clear-nonce branch from 3c26af6 to 8165ac9 Compare September 8, 2025 16:24
secp256k1_memclear has the side effect of undefining bytes for
valgrind checks. In some cases, we may want to zero bytes
but allow subsequent reads. So we split memclear into
memclear_explicit, which makes no guarantees about the content
of the buffer on return, and memzero_explicit, which guarantees
zero value on return.

Change the memset in partial_sign to use memzero_explicit.
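
The split described in this commit message can be pictured with the sketch below. It is a simplified model under stated assumptions, not the merged code: all `demo_` names are hypothetical, the memory-checker hook is replaced by a stub that only counts calls (standing in for a macro such as SECP256K1_CHECKMEM_UNDEFINE), and the wipe uses a volatile function pointer purely for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stub for a memory-checker hook. A real build would tell Valgrind or
 * MSan that the bytes are undefined; here we only count the calls so
 * that the sketch runs without any checker installed. */
static int demo_undefine_calls = 0;
static void demo_checkmem_undefine(const void *p, size_t len) {
    (void)p;
    (void)len;
    demo_undefine_calls++;
}

/* Shared wipe, written so that it is hard for the compiler to elide. */
static void demo_wipe_zero(void *buf, size_t len) {
    void *(*volatile const memset_ptr)(void *, int, size_t) = memset;
    if (len > 0) {
        memset_ptr(buf, 0, len);
    }
}

/* "Clear": the contents are unspecified on return and must not be read
 * again, so the bytes are also marked undefined for the checker. */
static void demo_memclear_explicit(void *buf, size_t len) {
    assert(buf != NULL || len == 0);
    demo_wipe_zero(buf, len);
    demo_checkmem_undefine(buf, len);
}

/* "Zero": the contents are guaranteed to be zero on return and stay
 * readable, so the checker is left alone. */
static void demo_memzero_explicit(void *buf, size_t len) {
    assert(buf != NULL || len == 0);
    demo_wipe_zero(buf, len);
}
```

Under this model, a caller that reads the buffer after `demo_memzero_explicit` sees zeros and triggers no checker warning, whereas reading after `demo_memclear_explicit` would be reported as a use of undefined memory by a real checker.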
@john-moffett john-moffett force-pushed the musig-partial-clear-nonce branch from 8165ac9 to 399b582 Compare September 8, 2025 16:26
Contributor

@real-or-random real-or-random left a comment

utACK 399b582 thanks!

@jonasnick
Contributor

jonasnick commented Sep 11, 2025

we use memczero, we implicitly favor the protection of the flag, and this seems to be a questionable choice for its usages in musig/session_impl.h. [...] And in particular this use here, where the flag is a constant anyway

I agree. We have issue #1621 pointing out the inconsistencies in handling constant-timeness with respect to invalid arguments in the library.

Contributor

@jonasnick jonasnick left a comment

ACK 399b582

@real-or-random real-or-random merged commit 88be4e8 into bitcoin-core:master Sep 12, 2025
119 checks passed
@theStack
Contributor

post-merge ACK 399b582

fanquake added a commit to fanquake/bitcoin that referenced this pull request Oct 14, 2025
a44a339384 Merge bitcoin-core/secp256k1#1750: ci: Use clang-snapshot in "MSan" job
53585f93b7 ci: Use clang-snapshot in "MSan" job
6894c964f3 Fix Clang 21+ `-Wuninitialized-const-pointer` warning when using MSan
2b7337f63a Merge bitcoin-core/secp256k1#1756: ci: Fix image caching and apply other improvements
f163c35897 ci: Set `DEBIAN_FRONTEND=noninteractive`
70ae177ca0 ci: Bump `docker/build-push-action` version
b2a95a420f ci: Drop `tags` input for `docker/build-push-action`
122014edb3 ci: Add `scope` parameter to `cache-{to,from}` options
baa265429f Merge bitcoin-core/secp256k1#1727: docs: Clarify that callback can be called more than once
4d90585fea docs: Improve API docs of _context_set_illegal_callback
895f53d1cf docs: Clarify that callback can be called more than once
de6af6ae35 Merge bitcoin-core/secp256k1#1748: bench: improve context creation in ECDH benchmark
5817885153 Merge bitcoin-core/secp256k1#1749: build: Fix warnings in x86_64 assembly check
ab560078aa build: Fix warnings in x86_64 assembly check
10dab907e7 Merge bitcoin-core/secp256k1#1741: doc: clarify API doc of `secp256k1_ecdsa_recover` return value
dfe284ed2d bench: improve context creation in ECDH benchmark
7321bdf27b doc: clarify API doc of `secp256k1_ecdsa_recover` return value
b475654302 Merge bitcoin-core/secp256k1#1745: test: introduce group order byte-array constant for deduplication
0c91c56041 test: introduce group order byte-array constant for deduplication
88be4e8d86 Merge bitcoin-core/secp256k1#1735: musig: Invalidate secnonce in secp256k1_musig_partial_sign
36e76952cb Merge bitcoin-core/secp256k1#1738: check-abi: remove support for obsolete CMake library output location (src/libsecp256k1.so)
399b582a5f Split memclear into two versions
4985ac0f89 Merge bitcoin-core/secp256k1#1737: doc: mention ctx requirement for `_ellswift_create` (not secp256k1_context_static)
7ebaa134a7 check-abi: remove support for obsolete CMake library output location (src/libsecp256k1.so)
806de38bfc doc: mention ctx requirement for `_ellswift_create` (not secp256k1_context_static)
03fb60ad2e Merge bitcoin-core/secp256k1#1681: doc: Recommend clang-cl when building on Windows
d93380fb35 Merge bitcoin-core/secp256k1#1731: schnorrsig: Securely clear buf containing k or its negation
8113671f80 Merge bitcoin-core/secp256k1#1729: hash: Use size_t instead of int for RFC6979 outlen copy
325d65a8cf Rename and clear var containing k or -k
960ba5f9c6 Use size_t instead of int for RFC6979 outlen copy
737912430d ci: Add more tests for clang-cl
7379a5bed3 doc: Recommend clang-cl when building on Windows
f36afb8b3d Merge bitcoin-core/secp256k1#1725: tests: refactor tagged hash verification
5153cf1c91 tests: refactor tagged hash tests
d2dcf52091 Merge bitcoin-core/secp256k1#1726: docs: fix broken link to Tromer's cache.pdf paper
489a43d1bf docs: fix broken link to eprint cache.pdf paper
d599714147 Merge bitcoin-core/secp256k1#1722: docs: Exclude modules' `bench_impl.h` headers from coverage report
0458def51e doc: Add `--gcov-ignore-parse-errors=all` option to `gcovr` invocations
1aecce5936 doc: Add `--merge-mode-functions=separate` option to `gcovr` invocations
106a7cbf41 doc: Exclude modules' `bench_impl.h` headers from coverage report
a9e955d3ea autotools, docs: Adjust help string for `--enable-coverage` option
e523e4f90e Merge bitcoin-core/secp256k1#1720: chore(ci): Fix typo in Dockerfile comment
24ba8ff168 chore(ci): Fix typo in Dockerfile comment
74b8068c5d Merge bitcoin-core/secp256k1#1717: test: update wycheproof test vectors
c25c3c8a88 test: update wycheproof test vectors
20e3b44746 Merge bitcoin-core/secp256k1#1688: cmake: Avoid contaminating parent project's cache with `BUILD_SHARED_LIBS`
2c076d907a Merge bitcoin-core/secp256k1#1711: tests: update Wycheproof
7b07b22957 cmake: Avoid contaminating parent project's cache with BUILD_SHARED_LIBS
5433648ca0 Fix typos and spellings
9ea54c69b7 tests: update Wycheproof files

git-subtree-dir: src/secp256k1
git-subtree-split: a44a339384e1e4b1c0ed7fa59e2857b057f075bf
fanquake added a commit to fanquake/bitcoin that referenced this pull request Oct 15, 2025
d543c0d917 Merge bitcoin-core/secp256k1#1734: Introduce (mini) unit test framework
f44c1ebd96 Merge bitcoin-core/secp256k1#1719: ci: DRY workflow using anchors
a44a339384 Merge bitcoin-core/secp256k1#1750: ci: Use clang-snapshot in "MSan" job
15d014804e ci: Drop default for `inputs.command` in `run-in-docker-action`
1decc49a1f ci: Use YAML anchor and aliases for repeated "CI script" steps
dff1bc107d ci, refactor: Generalize use of `matrix.configuration.env_vars`
4b644da199 ci: Use YAML anchor and aliases for repeated "Print logs" steps
a889cd93df ci: Bump `actions/checkout` version
574c2f3080 ci: Use YAML anchor and aliases for repeated "Checkout" steps
53585f93b7 ci: Use clang-snapshot in "MSan" job
6894c964f3 Fix Clang 21+ `-Wuninitialized-const-pointer` warning when using MSan
2b7337f63a Merge bitcoin-core/secp256k1#1756: ci: Fix image caching and apply other improvements
f163c35897 ci: Set `DEBIAN_FRONTEND=noninteractive`
70ae177ca0 ci: Bump `docker/build-push-action` version
b2a95a420f ci: Drop `tags` input for `docker/build-push-action`
122014edb3 ci: Add `scope` parameter to `cache-{to,from}` options
2f4546ce56 test: add --log option to display tests execution
95b9953ea4 test: Add option to display all available tests
953f7b0088 test: support running specific tests/modules targets
0302c1a3d7 test: add --help for command-line options
9ec3bfe22d test: adapt modules to the new test infrastructure
48789dafc2 test: introduce (mini) unit test framework
baa265429f Merge bitcoin-core/secp256k1#1727: docs: Clarify that callback can be called more than once
4d90585fea docs: Improve API docs of _context_set_illegal_callback
895f53d1cf docs: Clarify that callback can be called more than once
de6af6ae35 Merge bitcoin-core/secp256k1#1748: bench: improve context creation in ECDH benchmark
5817885153 Merge bitcoin-core/secp256k1#1749: build: Fix warnings in x86_64 assembly check
ab560078aa build: Fix warnings in x86_64 assembly check
10dab907e7 Merge bitcoin-core/secp256k1#1741: doc: clarify API doc of `secp256k1_ecdsa_recover` return value
dfe284ed2d bench: improve context creation in ECDH benchmark
7321bdf27b doc: clarify API doc of `secp256k1_ecdsa_recover` return value
b475654302 Merge bitcoin-core/secp256k1#1745: test: introduce group order byte-array constant for deduplication
9cce703863 refactor: move 'gettime_i64()' to tests_common.h
0c91c56041 test: introduce group order byte-array constant for deduplication
88be4e8d86 Merge bitcoin-core/secp256k1#1735: musig: Invalidate secnonce in secp256k1_musig_partial_sign
36e76952cb Merge bitcoin-core/secp256k1#1738: check-abi: remove support for obsolete CMake library output location (src/libsecp256k1.so)
399b582a5f Split memclear into two versions
4985ac0f89 Merge bitcoin-core/secp256k1#1737: doc: mention ctx requirement for `_ellswift_create` (not secp256k1_context_static)
7ebaa134a7 check-abi: remove support for obsolete CMake library output location (src/libsecp256k1.so)
806de38bfc doc: mention ctx requirement for `_ellswift_create` (not secp256k1_context_static)
03fb60ad2e Merge bitcoin-core/secp256k1#1681: doc: Recommend clang-cl when building on Windows
d93380fb35 Merge bitcoin-core/secp256k1#1731: schnorrsig: Securely clear buf containing k or its negation
8113671f80 Merge bitcoin-core/secp256k1#1729: hash: Use size_t instead of int for RFC6979 outlen copy
325d65a8cf Rename and clear var containing k or -k
960ba5f9c6 Use size_t instead of int for RFC6979 outlen copy
737912430d ci: Add more tests for clang-cl
7379a5bed3 doc: Recommend clang-cl when building on Windows
f36afb8b3d Merge bitcoin-core/secp256k1#1725: tests: refactor tagged hash verification
5153cf1c91 tests: refactor tagged hash tests
d2dcf52091 Merge bitcoin-core/secp256k1#1726: docs: fix broken link to Tromer's cache.pdf paper
489a43d1bf docs: fix broken link to eprint cache.pdf paper
d599714147 Merge bitcoin-core/secp256k1#1722: docs: Exclude modules' `bench_impl.h` headers from coverage report
0458def51e doc: Add `--gcov-ignore-parse-errors=all` option to `gcovr` invocations
1aecce5936 doc: Add `--merge-mode-functions=separate` option to `gcovr` invocations
106a7cbf41 doc: Exclude modules' `bench_impl.h` headers from coverage report
a9e955d3ea autotools, docs: Adjust help string for `--enable-coverage` option
e523e4f90e Merge bitcoin-core/secp256k1#1720: chore(ci): Fix typo in Dockerfile comment
24ba8ff168 chore(ci): Fix typo in Dockerfile comment
74b8068c5d Merge bitcoin-core/secp256k1#1717: test: update wycheproof test vectors
c25c3c8a88 test: update wycheproof test vectors
20e3b44746 Merge bitcoin-core/secp256k1#1688: cmake: Avoid contaminating parent project's cache with `BUILD_SHARED_LIBS`
2c076d907a Merge bitcoin-core/secp256k1#1711: tests: update Wycheproof
7b07b22957 cmake: Avoid contaminating parent project's cache with BUILD_SHARED_LIBS
5433648ca0 Fix typos and spellings
9ea54c69b7 tests: update Wycheproof files

git-subtree-dir: src/secp256k1
git-subtree-split: d543c0d917a76a201578948701cc30ef336e0fe6