Padding/Byte alignment for binary file format #626
Comments
This is specified here: https://github.com/WebAssembly/design/blob/master/BinaryEncoding.md#varuint32
No, it is NOT. I tried to explain my issue and you just want to close any issue as fast as possible, without reading or understanding the whole thing. OK, understood, I will never post again. You can go back to sleep...
My apologies. I thought you were asking where this behavior was specified, but it sounds like you're asking to change the LEB128 encoding to require canonicalized LEB values? This was discussed in #562 and #564. Allowing padding makes it easier to write a single-pass encoder: the encoder can reserve a fixed-width size field up front and fill it in once the section has been written, without having to shift data down.
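To illustrate the single-pass point, here is a minimal sketch, not taken from any real encoder. It reserves five bytes for a size field, writes the payload, then goes back and fills in the size as a padded LEB128 value. The names `write_padded_u32` and `write_section` are hypothetical, and the "section" here is just a size-prefixed payload rather than the full wasm section layout.

```python
def write_padded_u32(buf, offset, value):
    """Overwrite 5 bytes at `offset` with `value` as a padded unsigned LEB128."""
    for i in range(5):
        byte = (value >> (7 * i)) & 0x7F  # next 7 payload bits
        if i < 4:
            byte |= 0x80                  # continuation bit on all but the last byte
        buf[offset + i] = byte


def write_section(buf, payload_writer):
    """Append a size-prefixed payload to `buf` in a single pass."""
    size_offset = len(buf)
    buf.extend(b"\x00" * 5)               # reserve a fixed-width size field
    start = len(buf)
    payload_writer(buf)                   # payload size is unknown until this returns
    write_padded_u32(buf, size_offset, len(buf) - start)


buf = bytearray()
write_section(buf, lambda b: b.extend(b"\x01" * 41))
print(buf[:5].hex(" "))  # a9 80 80 80 00
```

With a canonical (shortest) encoding, the size field's width would depend on the payload length, so the encoder would have to either compute the size in advance or move the payload after writing it.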
Well, I just want to know what formula is used to say "OK, we'll pad with x bytes this time for this specific section". What are the rules? It is not related to LEB128; it is related to the way wasm modules are written. Why 3 bytes for one section, 4 bytes for another, and 2 bytes for the next?
There are no explicit rules, other than that padding a LEB128 value with zeroes is allowed. For example, https://github.com/WebAssembly/sexpr-wasm-prototype/ has a flag to pad unsigned LEB128 values or to canonicalize them. SpiderMonkey (which was used to generate the files in https://github.com/WebAssembly/build-suite) always pads section sizes to 5 bytes.
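As a rough sketch of the two strategies mentioned above (canonical versus padded to a fixed width), the following encoder takes an optional `pad_to` argument. The function name and parameter are assumptions made for illustration; they do not correspond to the actual flag in sexpr-wasm-prototype or to SpiderMonkey's code.

```python
def encode_uleb128(value, pad_to=None):
    """Encode a non-negative integer as unsigned LEB128.

    If `pad_to` is given, the result is extended to exactly that many bytes
    using continuation bytes that carry zero payload bits.
    """
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)       # more bytes follow
        else:
            out.append(byte)              # last byte of the canonical form
            break
    if pad_to is not None:
        if len(out) > pad_to:
            raise ValueError("value does not fit in the requested width")
        if len(out) < pad_to:
            out[-1] |= 0x80               # the old last byte now continues
            out.extend([0x80] * (pad_to - len(out) - 1))
            out.append(0x00)              # new terminating byte
    return bytes(out)


print(encode_uleb128(41).hex(" "))            # 29
print(encode_uleb128(41, pad_to=5).hex(" "))  # a9 80 80 80 00
```

Because the padding width is a choice made by each tool, two valid encoders can produce different bytes for the same module.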
One interesting consequence of #601 is that some alternative varuint32 encodings don't support padding.
Hi,
I'm playing a bit with the wasm binary file format, and I think something important is missing in the documentation. (Or perhaps I am missing something).
Let's take an example in the build suite:
https://github.com/WebAssembly/build-suite/blob/master/emscripten/hello_world/a.out.wasm
For (I think) padding and alignment purposes, the padding feature of the LEB128 encoding is used when writing section sizes.
In the file above, the first "signature" section has a size of 41 bytes.
Using LEB128, the resulting binary should be 0x29
But in the file it is encoded as 0xA9 0x80 0x80 0x80 0x00
Thanks to the LEB128 padding support, when decoded, we have the same value '41' (the high bit 0x80 of each byte means another byte follows, so the 0x80 bytes only add zero bits, and the final 0x00 ends the value).
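The decoding described above can be sketched as follows. This is a minimal illustration rather than code from any wasm tool; `decode_uleb128` is a hypothetical helper name, and it does no bounds or overflow checking.

```python
def decode_uleb128(data, offset=0):
    """Decode an unsigned LEB128 value; return (value, bytes_consumed)."""
    result = 0
    shift = 0
    pos = offset
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift  # low 7 bits are payload
        if byte & 0x80 == 0:              # high bit clear: this was the last byte
            break
        shift += 7
    return result, pos - offset


canonical = bytes([0x29])
padded = bytes([0xA9, 0x80, 0x80, 0x80, 0x00])
print(decode_uleb128(canonical))  # (41, 1)
print(decode_uleb128(padded))     # (41, 5)
```

Both inputs decode to 41; only the number of bytes consumed differs.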
It would be great to have the exact padding/alignment strategy explained in the documentation, so that we can write tools able to write byte-perfect wasm files.
Thanks for your help!