add opcode definitions section #237
* the generic section header
* a table containing, for each opcode-space, a standardized string literal
type name (where index defines its type), offset (within the section),
sorted by offset, followed by
Could you make these sub-bullets?
What do you mean by "index defines its type"?
I mean that wherever we need to reference a type (e.g. in function definitions), we don't want to write "int32"; we would rather write 0 (if, for example, the int32 opcodes were first in this list).
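As a rough illustration of "index defines its type", here is a minimal Python sketch. The type names, their order, and the `FunctionSig` structure are all assumptions made up for this example; nothing in the proposal fixes them.

```python
from dataclasses import dataclass

# Hypothetical opcode-definition table: the position of a name in this list
# is its type index. Names and order are assumptions for illustration only.
TYPE_NAMES = ["int32", "int64", "float32", "float64"]


@dataclass
class FunctionSig:
    """A function signature that refers to types by table index, not by name."""
    return_type: int
    param_types: list


def type_name(index: int) -> str:
    """Resolve a type index back to its standardized string name."""
    return TYPE_NAMES[index]


# A function taking (int32, float64) and returning int32 stores 0 and [0, 3].
sig = FunctionSig(return_type=0, param_types=[0, 3])
resolved = [type_name(i) for i in sig.param_types]
```

With this shape, the string "int32" appears once in the table, and every other reference to the type is just a small integer.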
What does it mean to have multiple sections with functions in each? Can one section's function call a function in another section? Or do we just have one code section for now? How do I access different data sections? It looks like we're close to a container format... This is related to #74 about using ELF. The definition is also beginning to look like BNF!
I don't see a logical reason to have more than one code section right now, so I think limiting to one should be ok.
I guess there are a couple of ways we could go about it. If different data section types are all singletons (e.g. you can only have a single import section), then you can directly access it and the decoder will know where to look when you ask for import[0]. Otherwise you can indirect through the section list for the section you want and access your data from there (not quite as efficient though).
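The two access strategies could be sketched like this in Python. The section type names, the offsets, and both helper functions are hypothetical; the only thing taken from the proposal is a section table of (type, offset) entries sorted by offset.

```python
# Hypothetical section table: (type, offset) entries sorted by offset.
SECTION_TABLE = [
    ("import", 64),
    ("code", 128),
    ("data", 512),
    ("data", 768),
]


def index_singletons(table, singleton_types):
    """Build a type -> offset map once, rejecting duplicate singleton sections."""
    index = {}
    for kind, offset in table:
        if kind in singleton_types:
            if kind in index:
                raise ValueError(f"more than one {kind!r} section")
            index[kind] = offset
    return index


def find_all(table, section_type):
    """Indirect access: walk the section list and collect every match."""
    return [offset for kind, offset in table if kind == section_type]


# Direct access: the decoder knows exactly where the single import section is.
singletons = index_singletons(SECTION_TABLE, {"import", "code"})
import_offset = singletons["import"]

# Indirect access: several data sections, found by scanning the table.
data_offsets = find_all(SECTION_TABLE, "data")
```

The singleton map costs one pass at decode time and gives constant-time lookups afterwards, while the scan repeats work on every lookup, which is the efficiency difference mentioned above.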
I don't really know ELF, but the conversation made it sound like the format will not map well for us. So I'd like to move forward with something else hedging against it.
ELF isn't yet ruled out. These various tables could just map to special sections/segments in ELF. But I don't think we need to worry about that now. Let's design what we want first, and figure out whether ELF makes sense once we have that.
* a table (sorted by offset) containing, for each section, its type and offset (within the module), followed by
* a sequence of sections.
* A module contains (in this order):
* A header
Define header.
You caught me. I don't know what the headers contain. At the very least, the module header contains the magic number, but besides that I don't really have anything in particular decided. Maybe some things like whether heap is 32/64 bit, source language (for ABI), and entry point. This will need to be figured out, but based on the level of detail in the rest of our design docs I'm not sure how much detail to go in here.
Maybe I should just mention some things like this as "ideas" for what a header would contain?
Yeah that sounds good. I like what you're adding overall, so I'll step back, take this improvement, and we can iterate later :-)
Maybe I'm going into too many details?
@jfbastien Maybe a bit with null terminated UTF8 (which is what I had in mind, but thought I was already bordering on too verbose), but you are right that "header" and "type" deserved some clarifications.
lgtm
* A module contains (in this order):
  - A header, containing:
    + The [magic number](https://en.wikipedia.org/wiki/Magic_number_%28programming%29)
    + Other data TBD (possibly entrypoint, memory bitness, source language, etc.)
What is memory bitness?
At an initial glance, source language seems like something we'd specifically try to avoid including in the main header, because it suggests special magical per-source-language semantics.
What is memory bitness?
I mean whether your linear memory has a 64 bit or 32 bit address space (i.e. whether ptr type is int32 or int64). Maybe not necessary here, but was just an idea for something we might want. I think we might need some version info for the module format as well. We may never need to break compat, but I can imagine a scenario where we eventually want to make format changes, and having a byte to allow for that would be useful.
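To show what such a header might look like, here is a Python sketch that packs a magic number, a one-byte format version, and a one-byte memory bitness field. The field layout, the little-endian packing, and the `XMOD` magic value are all placeholders invented for this example; the thread leaves the header contents TBD.

```python
import struct

# Placeholder magic number; the real value is not decided in this thread.
MAGIC = b"XMOD"

# Assumed layout: 4-byte magic, 1-byte format version, 1-byte memory bitness.
HEADER = struct.Struct("<4sBB")


def pack_header(version: int, bitness: int) -> bytes:
    """Serialize a module header, allowing only 32- or 64-bit linear memory."""
    if bitness not in (32, 64):
        raise ValueError("memory bitness must be 32 or 64")
    return HEADER.pack(MAGIC, version, bitness)


def parse_header(data: bytes):
    """Validate the magic number and return (version, bitness)."""
    magic, version, bitness = HEADER.unpack_from(data, 0)
    if magic != MAGIC:
        raise ValueError("bad magic number")
    if bitness not in (32, 64):
        raise ValueError("memory bitness must be 32 or 64")
    return version, bitness


header_bytes = pack_header(version=1, bitness=64)
version, bitness = parse_header(header_bytes)
```

A single version byte is enough for a decoder to reject or adapt to a future format change before it reads anything else in the module.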
I'll just say other data TBD for now and remove the rest
In the AstSemantics.md, we've just been assuming that one can index the
linear memory with either 32-bit or 64-bit offsets. In the v8 native
prototype, there are different bytecode numbers for whether the memory
offset operand is an Int32 or an Int64.
On Tue, Jun 30, 2015 at 10:29 PM, Michael Holman [email protected]
wrote:
In BinaryEncoding.md, #237 (comment):
@@ -65,20 +65,40 @@ Yes:
Global structure
-* A module contains:
-  * a header followed by
-  * a table (sorted by offset) containing, for each section, its type and offset (within the module), followed by
-  * a sequence of sections.
+* A module contains (in this order):
+  - A header, containing:
+    + The magic number
+    + Other data TBD (possibly entrypoint, memory bitness, source language, etc.)
I think this is good to go, we can iterate on top.