Prefer calloc over malloc+zeroMemory. NFC (#22460)
Conversation
Force-pushed: 6ef9c14 to 8c40b7d
I don't know if we actually benefit from this atm, but we could with some work. At least mimalloc could benefit iirc, if we tracked the last sbrk point, and then reasoned that each sbrk increment is fresh zeroed memory.
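The idea above can be sketched in C under assumed names: track the frontier of sbrk-style growth, and have calloc skip the memset when the block comes straight from fresh (zero-initialized) growth. `heap`, `brk_top`, `raw_alloc`, `my_calloc`, and the one-entry free list are all illustrative stand-ins, not emscripten's real allocator internals:

```c
/* Sketch of the "fresh sbrk memory is already zero" optimization.
 * All names here are hypothetical; this is not emscripten's allocator. */
#include <string.h>

static unsigned char heap[1 << 16];  /* stand-in for linear memory; fresh memory is zero */
static size_t brk_top = 0;           /* frontier of "sbrk" growth */
static unsigned char *free_slot = 0; /* one-entry free list: recycled, possibly dirty */
static size_t free_size = 0;
static int memsets_done = 0;         /* counts how often calloc had to zero manually */

static void *raw_alloc(size_t n) {
    if (free_slot && free_size >= n) {   /* prefer recycled memory */
        unsigned char *p = free_slot;
        free_slot = 0;
        return p;
    }
    unsigned char *p = &heap[brk_top];   /* otherwise grow the frontier */
    brk_top += n;
    return p;
}

static void my_free(void *p, size_t n) {
    free_slot = p;
    free_size = n;
}

static void *my_calloc(size_t nmemb, size_t size) {
    size_t total = nmemb * size;         /* overflow check omitted for brevity */
    size_t frontier = brk_top;
    unsigned char *p = raw_alloc(total);
    if (p != &heap[frontier]) {          /* recycled block: must zero it */
        memset(p, 0, total);
        memsets_done++;
    }                                    /* else: fresh growth, already zero */
    return p;
}
```

A real allocator would also have to invalidate the "fresh" assumption once memory below the frontier is handed back out, which is the bookkeeping overhead discussed further down the thread.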
@@ -1 +1 @@
-9347
+9760
Is calloc really 413 bytes larger than malloc? That seems surprising.
I don't think it's a huge deal: the JS size went down a few bytes and the native size went up a few bytes.
Also, this is the only code size test in the test suite that was affected, so it only appears to affect the dynamic linking use case.. at least that is the only test where any change occurred.
Hmm, but it is more than a few bytes: 413 bytes seems quite a lot for calloc to add over malloc. It should just have a loop that writes zeroes? I'm worried something else is going on there.
(The JS change in the other direction is only 9 bytes, which does make sense.)
Agreed, that is a forward-looking goal. Regardless, this makes the code more compact (just one call instead of two) and more clearly shows intent, so I think it's worth landing purely on that basis.
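For illustration, the pattern being replaced and its replacement, sketched in C. The PR itself changes emscripten's JS library code, where a `_malloc` + `zeroMemory` pair became a single `_calloc` call; the helper names below are hypothetical:

```c
/* Sketch of the cleanup in C terms: one calloc call replaces a malloc
 * followed by an explicit zeroing pass. Helper names are hypothetical. */
#include <stdlib.h>
#include <string.h>

void *alloc_zeroed_before(size_t n) {
    void *p = malloc(n);
    if (p) memset(p, 0, n);    /* separate zeroing step */
    return p;
}

void *alloc_zeroed_after(size_t n) {
    return calloc(1, n);       /* one call; zeroing intent is explicit */
}
```

Both return a zeroed block, but the second form states the intent in a single call site, which is the compactness argument made above.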
I looked into this pristine memory optimization for emmalloc some time ago. The takeaway is that it is not really that helpful, since sbrk growth occurs diminishingly rarely compared to how often a program reallocates memory, and the overhead of keeping track of pristine memory is nonzero. Only after we get memory.discard to free up pages (if that ever becomes a thing) will we really be able to optimize the calloc implementation. One thing I worry about with assuming everything above sbrk is fresh is that we have users who say they rerun the application on their web page by reinitializing it. I am not sure exactly how they go about doing this, but if they might be rerunning static ctors, i.e. effectively rerunning the program on a now-dirty heap, then the pristine sbrk assumption would no longer hold. We would need memory.discard support for that guarantee to happen. Nevertheless, using calloc in JS library code instead of zeroMemory is a nice cleanup.
Regarding this particular point: when the wasm module itself creates and exports its memory, we already rely on the fact that the initial memory is clean. When the wasm module imports its memory, we do not make this assumption. For example, when the module initially loads, a BSS segment containing all zeros will initialize the static data region. In the case where we know the memory is fresh (i.e. the memory itself is created within the module), we simply elide the BSS section, since it would serve no purpose to initialize fresh memory with zeros. If a user wants to take an existing instance and somehow restore it to its initial state, they already have their work cut out for them: they would need to somehow restore all the static data to its initial state. I don't know of any way to do that, since the data is stored in active segments (i.e. segments that are active on initialization, cannot be used with the
+1 that this is a nice cleanup and worth doing. But also that it would be good to understand why the wasm code size increases as much as it does.
IIUC, there are cases where the allocator can take advantage of the fact that it knows the memory is fresh and avoid the memset completely.
It looks like the codesize regression in that one test is simply the delta between the two; I've attached the old and new wast files and the diff.
Looks like calloc adds an overflow check (which I was not aware the spec required) of the multiplication. The memset does seem surprisingly large. When we enable bulk memory by default that can shrink, at least?
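The overflow check mentioned above, sketched in C: as noted in the discussion, calloc must fail rather than return a too-small block when nmemb * size overflows size_t. `checked_calloc_size` is a hypothetical helper name, not a function from any real allocator:

```c
/* Sketch of calloc's required size check: detect whether nmemb * size
 * would overflow size_t before doing the multiplication. Hypothetical helper. */
#include <stddef.h>

int checked_calloc_size(size_t nmemb, size_t size, size_t *total) {
    if (size != 0 && nmemb > (size_t)-1 / size)
        return 0;                 /* nmemb * size would overflow size_t */
    *total = nmemb * size;
    return 1;
}
```

A division like this (or a widening multiply, where available) is a few extra instructions per call, which is one plausible component of the 413-byte delta discussed earlier in the thread.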
Reverts #22460. It's causing ASan/LSan failures: https://ci.chromium.org/ui/p/emscripten-releases/builders/ci/linux-test-suites/b8737231127543572609/overview. I'm going to revert rather than fixing forward because I'm preparing a release.