
Conversation

@sbc100 sbc100 (Collaborator) commented Aug 27, 2024

IIUC, there are cases where the allocator can take advantage of the fact that it knows the memory is fresh and avoid the memset completely.

@sbc100 sbc100 requested review from kripken and dschuff August 27, 2024 22:07
@sbc100 sbc100 force-pushed the calloc branch 2 times, most recently from 6ef9c14 to 8c40b7d on August 27, 2024 22:51
@kripken kripken (Member) left a comment

I don't know if we actually benefit from this atm, but we could with some work. At least mimalloc could benefit iirc, if we tracked the last sbrk point, and then reasoned that each sbrk increment is fresh zeroed memory.
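For illustration, here is a minimal C sketch of that idea, assuming a toy allocator that only ever grows via sbrk; it is not mimalloc or dlmalloc code, and all of the names (pristine_start, raw_alloc, calloc_fresh) are hypothetical. The point is just that blocks carved out of never-touched sbrk memory are already zero, so the memset can be skipped for them:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>   /* sbrk() */

/* Addresses at or above pristine_start came from sbrk and have never been
 * handed out, so they are still guaranteed to be zero-filled. */
static uintptr_t pristine_start;

/* Stand-in for the allocator's real block-carving logic, which could also
 * return recycled (dirty) blocks from a free list. */
static void *raw_alloc(size_t size) {
  if (pristine_start == 0)
    pristine_start = (uintptr_t)sbrk(0);  /* memory below the initial break may be dirty */
  void *p = sbrk(size);                   /* freshly grown memory arrives zeroed */
  return p == (void *)-1 ? NULL : p;
}

/* Zeroed allocation that only pays for memset when the block is recycled. */
static void *calloc_fresh(size_t size) {
  void *p = raw_alloc(size);
  if (p == NULL)
    return NULL;
  int still_zero = (uintptr_t)p >= pristine_start;
  uintptr_t end = (uintptr_t)p + size;
  if (end > pristine_start)
    pristine_start = end;                 /* the block is dirty once handed out */
  if (!still_zero)
    memset(p, 0, size);
  return p;
}

int main(void) {
  char *p = calloc_fresh(64);
  assert(p != NULL && p[0] == 0);         /* zeroed without an explicit memset */
  return 0;
}
```

A real allocator would also need to keep this boundary honest as it recycles freed blocks, which is exactly the kind of bookkeeping overhead discussed further down in this thread.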

@@ -1 +1 @@
-9347
+9760
Member

Is calloc really 413 bytes larger than malloc? That seems surprising.

Collaborator (Author)

I don't think it's a huge deal: the JS size went down a few bytes and the native size went up a few bytes.

Also, this is the only code size test in the test suite that was affected, so it only appears to affect the dynamic linking use case; at least, that is the only test where any change occurred.

Member

Hmm, but it is more than a few bytes: 413 bytes seems quite a lot for calloc to add over malloc. It should just have a loop that writes zeroes? I'm worried something else is going on there.

(The JS change in the other direction is only 9 bytes, which does make sense.)

@sbc100 sbc100 (Collaborator, Author) commented Aug 28, 2024

> I don't know if we actually benefit from this atm, but we could with some work. At least mimalloc could benefit iirc, if we tracked the last sbrk point, and then reasoned that each sbrk increment is fresh zeroed memory.

Agreed, that is a forward-looking goal. Regardless, this change makes the code more compact (just one call instead of two) and more clearly shows intent, so I think it's worth landing purely on that basis.

@juj juj (Collaborator) commented Aug 28, 2024

I looked into this pristine memory optimization for emmalloc some time ago. The takeaway is that it is not really that helpful, since sbrk growth happens vanishingly rarely compared to how often a program reallocates memory, while the overhead of keeping track of pristine memory is nonzero.

Only after we get memory.discard to free up pages (if that ever becomes a thing) will we really be able to optimize the calloc implementation.

One thing I worry about with assuming everything above sbrk would be free is that we have users who say they rerun the application on their web page by reinitializing it. I am not sure exactly how they go about doing this, but if they might be rerunning static ctors, i.e. effectively rerunning the program on a now-dirty heap, then the pristine sbrk assumption would no longer hold. We would need memory.discard support for that guarantee to happen.

Nevertheless, using calloc in JS library code instead of zeroMemory is a nice cleanup.

@sbc100 sbc100 (Collaborator, Author) commented Aug 28, 2024

> One thing I worry about with assuming everything above sbrk would be free is that we have users who say they rerun the application on their web page by reinitializing it. I am not sure exactly how they go about doing this, but if they might be rerunning static ctors, i.e. effectively rerunning the program on a now-dirty heap, then the pristine sbrk assumption would no longer hold. We would need memory.discard support for that guarantee to happen.

Regarding this particular point: when the wasm module itself creates and exports its memory, we already rely on the fact that the initial memory is clean. When the wasm module imports its memory, we do not make this assumption. For example, when the module initially loads, a BSS segment containing all zeros initializes the static data region. In the case where we know the memory is fresh (i.e. the memory itself is created within the module) we simply elide the BSS segment, since it would serve no purpose to initialize fresh memory with zeros.

If a user wants to take an existing instance and somehow restore it to its initial state, they already have their work cut out for them. They would need to somehow restore all the static data to its initial state, and I don't know of any way to do that, since that data is stored in active segments (i.e. segments that are applied at instantiation and cannot be used with the memory.init instruction later on). So such a user would have to find a way to re-initialize static data. Asking such users to also zero the whole heap when they "rewind" the sbrk pointer seems like one of the easier parts of such a multi-step, error-prone process.
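As a small aside to illustrate the BSS point (a hypothetical example, not code from this PR): an all-zero C global lands in the BSS region rather than in an active data segment, so when the module creates its own memory, which is guaranteed fresh and zeroed, there is nothing the loader needs to write for it at startup.

```c
#include <stdio.h>

/* All-zero static data: this lives in BSS, so no bytes for it need to be
 * stored in the binary or written at startup when the memory is known to
 * be fresh. */
static int counters[1024];

int main(void) {
  printf("%d\n", counters[0]);   /* prints 0 without any explicit zeroing */
  return 0;
}
```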

@dschuff dschuff (Member) commented Aug 28, 2024

+1 that this is a nice cleanup and worth doing. But also that it would be good to understand why the wasm code size increases as much as it does.
Also, the plan of record is still to enable bulk memory as soon as we get around to turning the crank and working out the kinks, so it would be good to see the effect with that enabled, maybe it's better.

@sbc100 sbc100 (Collaborator, Author) commented Sep 7, 2024

It looks like the code size regression in that one test is simply the delta between dlmalloc and dlcalloc.

I've attached the old and new wast files and the diff.

files.zip

@sbc100 sbc100 merged commit 6619b4a into emscripten-core:main Sep 9, 2024
28 checks passed
@sbc100 sbc100 deleted the calloc branch September 9, 2024 20:21
@kripken kripken (Member) commented Sep 9, 2024

> I've attached the old and new wast files and the diff.

Looks like calloc adds an overflow check on the num * size multiplication (which I was not aware the spec required) and a call to memset, so it seems almost all of the extra size is due to memset.

memset does seem surprisingly large. When we enable bulk memory by default that can shrink, at least?
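For reference, a hedged sketch of that shape, not dlmalloc's actual dlcalloc (calloc_like is a hypothetical name): calloc is essentially an overflow check on num * size, a malloc, and a memset of the result.

```c
#include <stdint.h>   /* SIZE_MAX */
#include <stdlib.h>   /* malloc, size_t */
#include <string.h>   /* memset */

void *calloc_like(size_t num, size_t size) {
  /* Refuse requests whose byte count would overflow size_t, so the caller
   * can never receive a block smaller than the array it asked for. */
  if (size != 0 && num > SIZE_MAX / size)
    return NULL;
  size_t total = num * size;
  void *p = malloc(total);
  if (p != NULL)
    memset(p, 0, total);   /* with bulk memory this could lower to a single memory.fill */
  return p;
}
```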
