Skip to content

_WASI_EMULATED_MMAN: mmap does not return aligned memory #207

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
sbc100 opened this issue Jul 8, 2020 · 12 comments
Open

_WASI_EMULATED_MMAN: mmap does not return aligned memory #207

sbc100 opened this issue Jul 8, 2020 · 12 comments

Comments

@sbc100
Copy link
Member

sbc100 commented Jul 8, 2020

The memory returned from mmap is supposed to be page aligned.

The file offset passed in is also required to be page aligned.

@sbc100
Copy link
Member Author

sbc100 commented Jul 29, 2020

Fixing doesn't look easy because it hides metadata in the first N bytes of its allocation.

@sunfishcode
Copy link
Member

Agreed. And in most cases, page-aligning wouldn't matter, because there's no mprotect or anything else which requires aligned memory. The memory is always 16-byte aligned (courtesy of malloc), which is enough for many purposes. And for programs that don't don't need it, 64KiB page alignment could waste significant amounts of memory.

@sbc100
Copy link
Member Author

sbc100 commented Jul 30, 2020

I think that actual alignment will depend on alignof(struct map) since we do a += 1 on the result of malloc. we should confirm that is still 16byte sligned. I guess a real fix to align to page size might not be worth it, as you say.

@sunfishcode
Copy link
Member

Ah, good catch. struct map has _Alignof 8 and sizeof 24. Seems like we should pad it out to the nearest multiple of 16, which would be 32.

@gzurl
Copy link

gzurl commented Apr 10, 2023

Hi, as part of the WLR project, we found a couple of use cases where aligned memory is needed for mmap(): SQLite and PHP (Zend parser). In both cases, reading from files seems optimized at low level to gain performance. So, maybe padding after metadata as you guys suggest could solve the issue (CCing: @ereslibre, @assambar).

In SQLite, lockBtree() fails when reading a DB from file in WAL mode.

@sunfishcode
Copy link
Member

Indeed, that's a bug. We should use posix_memalign to allocate the memory instead of malloc. I'll submit a PR.

@sunfishcode
Copy link
Member

Or, oh, I missed the context here. Yes, we still need to think about padding after the header. And with wasm32's 64 KiB page sizes, that's a lot of padding.

@gzurl If the goal here is to optimize at a low level for performance, calling mmap on wasm is not going to get you that, because it isn't a real mmap. wasi-libc is just doing malloc and read. So if SQLite and PHP have the option of disabling the use of mmap, that's very likely going to produce more optimal results than using wasi-libc's emulation.

@gzurl
Copy link

gzurl commented Apr 11, 2023

@sunfishcode you are right, since this is a limited emulation of mmap, it won't improve performance as the real one.

Regarding how it's used in SQLite and PHP, as you know, mmap can be used in very different ways: to map a file into memory and share it between processes, as an I/O alternative (instead of fopen, fseek, ...), just to allocating memory, etc. To be more specific:

  • SQLite makes use of mmap especially for the WAL mode. So, in this particular case, as you pointed out, it's better (and doable) to just disable it. In v3.39.2, we disabled WAL by setting proper CFLAGS. Since v3.41.0, now you can just ./configure --with-wasi-sdk=$WASI_SDK_PATH --enable-all and it will automatically disable WAL in sqlite.h.

  • In PHP, we found that its parser relied on mmap for different memory allocations. When trying the wasi-libc's emulated mmap, we identified several memory alignment issues. Since disabling mmap was not a feasible option (PHP's parser is critical for its purpose), we decided to substitute it with aligned_alloc (see example here).

Given this context, we wonder about similar cases to PHP. If we had a -D_WASI_EMULATED_MMAN_ALIGNED option that emulated mmap but returned aligned memory, even with the memory waste concern, we would likely have used it for PHP. We think this should be considered case-by-case, so maybe having those two different emulation modes available (with/without aligned memory) could be an option. WDYT?

PS: Linking with a related issue in WASI #304.

@sunfishcode
Copy link
Member

I think the current mmap behavior of not aligning and thus not wasting a large amount of memory seems like the best default for most users, but I think a PR that adds a macro like -D_WASI_EMULATED_MMAP_ALIGNED as an option, with documentation describing the implied memory usage, would be reasonable.

@sbc100
Copy link
Member Author

sbc100 commented May 12, 2023

Given that wasm-libc's mmap is known is fake and non-optimal, perhaps we can see it as a stop gab for folks who don't want to take the extra time to add a non-mmap-based codepath for their code?

Given that it is already emulated/fake/not recommended for production, why not just take the 64-kb hit per allocation by default? This would give us something that works out-of-the-box, but is not completely optimal (i.e. the goal the fake mmap). It would then be incumbent upon folks that care about optimization to remove their usage of our fake mmap.

@sunfishcode
Copy link
Member

That's reasonable. And it does make wasi-libc simpler if we avoid adding a new config macro. So then yes, we should change wasi-libc to page-align.

@sbc100
Copy link
Member Author

sbc100 commented May 12, 2023

I really like the fact the folks have to opt into _WASI_EMULATED_MMAN. It makes it fairly clear that its a non-optimal path.

agoode added a commit to agoode/mlton that referenced this issue Feb 25, 2024
It does not return aligned memory (WebAssembly/wasi-libc#207) and we don't need it for anything fancy.

Instead, just use posix_memalign() and free() for the GC functions.
agoode added a commit to agoode/mlton that referenced this issue Feb 25, 2024
It does not return aligned memory (WebAssembly/wasi-libc#207) and we don't need it for anything fancy.

Instead, just use posix_memalign() and free() for the GC functions.
agoode added a commit to agoode/mlton that referenced this issue Feb 25, 2024
It does not return aligned memory (WebAssembly/wasi-libc#207) and we don't need it for anything fancy.

Instead, just use posix_memalign() and free() for the GC functions.
agoode added a commit to agoode/mlton that referenced this issue Feb 25, 2024
It does not return aligned memory (WebAssembly/wasi-libc#207) and we don't need it for anything fancy.

Instead, just use posix_memalign() and free() for the GC functions.
agoode added a commit to agoode/mlton that referenced this issue Feb 25, 2024
It does not return aligned memory (WebAssembly/wasi-libc#207) and we don't need it for anything fancy.

Instead, just use posix_memalign() and free() for the GC functions.
agoode added a commit to agoode/mlton that referenced this issue Mar 15, 2024
It does not return aligned memory (WebAssembly/wasi-libc#207) and we don't need it for anything fancy.

Instead, just use posix_memalign() and free() for the GC functions.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants