More Module information #1046

Closed
jfbastien opened this issue Apr 20, 2017 · 16 comments

@jfbastien
Member

jfbastien commented Apr 20, 2017

Right now we have WebAssembly.Module.imports as well as .exports and .customSections. API here.

It would be nice to offer more information from all sections. A simple one would be the memory initial / maximum (or undefined if not present), but basically being able to reflect onto the sections would be cool.

How should we tackle this? It seems to me that the 3 existing functions are kinda mis-designed because they make it weirdly inconsistent to add information about other sections. It seems that, were we complete in what we offer, we'd want something like:

let module = new WebAssembly.Module(...);
for (let section in module.sections)
    switch (section) {
    case "memory": print(`${module.sections[section].initial} ${typeof module.sections[section].maximum !== "undefined" ? module.sections[section].maximum : "undefined"}`); break;
    case ...
    }

Or, using static things:

print(`${WebAssembly.Module.memory(module, 0 /* memory #0 */).initial}`);

Why is this useful? Anything reflection is usually used for: toolchains, logging, stubbing for tests, etc.

Here's where I wanted it today: trying to repro a crash on a particularly big binary, running in-browser. I have the .wasm and don't want to redo the import object and its complex creation; I just want to stub a few things out. Just iterate imports / exports and stub everything. I can almost do it... except for Memory initial and Table initial.

Here's even some code:

let importObject = {};
for (let imp of WebAssembly.Module.imports(module)) {
    if (typeof importObject[imp.module] === "undefined")
        importObject[imp.module] = {};
    switch (imp.kind) {
    case "function": importObject[imp.module][imp.name] = () => {}; break;
    case "table": importObject[imp.module][imp.name] = new WebAssembly.Table({ initial: ???, maximum: ???, element: "anyfunc" }); break;
    case "memory": importObject[imp.module][imp.name] = new WebAssembly.Memory({ initial: ??? }); break;
    case "global": importObject[imp.module][imp.name] = 0; break;
    }
}

EDIT: updated to static, I misread the current API :)

@lukewagner
Member

How are .imports and .exports implemented incorrectly? Something tells me you're expecting them to be properties of Module instances... (but they're not; they're static methods on the Module constructor, done for symmetry with the reflecting functions on the Object constructor).
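
To make the static-method shape concrete, here is a minimal sketch of the three existing reflection calls. The byte array is a hand-assembled module whose only content is a memory import named "env"."mem"; the variable names are illustrative and not taken from the thread.

```javascript
// A tiny hand-assembled module that imports one memory ("env"."mem").
const wasmModule = new WebAssembly.Module(new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, // magic "\0asm"
  0x01, 0x00, 0x00, 0x00, // binary version 1
  0x02, 0x0c,             // import section, 12 bytes of content
  0x01,                   // one import
  0x03, 0x65, 0x6e, 0x76, // module name "env"
  0x03, 0x6d, 0x65, 0x6d, // field name "mem"
  0x02,                   // import kind: memory
  0x00, 0x01,             // limits: no maximum, initial = 1 page
]));

// The reflection functions are static: call them on the constructor
// and pass the module as an argument.
const importList = WebAssembly.Module.imports(wasmModule);
const exportList = WebAssembly.Module.exports(wasmModule);
const nameSections = WebAssembly.Module.customSections(wasmModule, "name");

console.log(importList.map((i) => `${i.module}.${i.name}: ${i.kind}`));
console.log(exportList.length, nameSections.length);

// They are not properties of Module instances.
console.log(typeof wasmModule.imports); // "undefined"
```

Note that the import descriptor reports the kind ("memory") but, as discussed in this issue, not the initial or maximum size.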

@jfbastien
Member Author

jfbastien commented Apr 20, 2017

> How are .imports and .exports implemented incorrectly? Something tells me you're expecting them to be properties of Module instances... (but they're not; they're static methods on the Module constructor, done for symmetry with the reflecting functions on the Object constructor).

Yeah I expected it not to be static. Weird :)
I updated the description.

@lukewagner
Member

So the main motivation for imports/exports is that these are fields that any JS module loader would need (and iiuc, experimental SystemJS is using them already for this purpose). What's nice about them being their own methods is that there's a clear cost that when you call them, you're just allocating memory for the returned array. With an all-in-one .sections object, I'd be worried that we'd have to go to lengths (with proxies etc) to be lazy about allocating field values b/c you wouldn't want calling module.sections['imports'] to allocate a copy of all custom sections.

But adding more reflective methods sounds reasonable, especially paired with a rationale for why a module loader or other tool would need them.
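
The allocation concern can be illustrated with a rough sketch of an all-in-one object that stays lazy by using accessor properties instead of proxies. `lazySections` is a hypothetical helper written for this illustration; it is not part of the WebAssembly JS API, and it only wraps the two existing static methods.

```javascript
// Hypothetical helper (an assumption for illustration, not a real API):
// each section is computed on first access and cached afterwards, so
// reading `imports` never pays for `exports` or anything else.
function lazySections(wasmModule) {
  const cache = new Map();
  const lazy = (name, compute) => ({
    enumerable: true,
    get() {
      if (!cache.has(name)) cache.set(name, compute());
      return cache.get(name);
    },
  });
  return Object.defineProperties({}, {
    imports: lazy("imports", () => WebAssembly.Module.imports(wasmModule)),
    exports: lazy("exports", () => WebAssembly.Module.exports(wasmModule)),
  });
}

// The 8-byte header alone is a valid (empty) module.
const emptyModule = new WebAssembly.Module(
  new Uint8Array([0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00]));
const sections = lazySections(emptyModule);

// Nothing is allocated until a property is read; repeated reads hit the cache.
console.log(sections.imports === sections.imports); // true
```

This keeps the per-call cost visible only on first access, but it is exactly the kind of extra machinery the comment above is wary of requiring from engines.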

@jfbastien
Member Author

Right, and for the case I'm missing above, it's sufficient to add information to the imports result when the import is a memory or a table. We can add this in one pass, and then address my wider proposal separately.

@rossberg
Member

rossberg commented Apr 21, 2017

I am confused about your motivation. Your example does not motivate adding reflection on other sections. That would not help your case at all, because the memory section does not provide info about a memory import -- it doesn't even exist in the presence of such an import.

What you want is filling the remaining hole in the reflection of imports and include their type signatures, an extension that is already anticipated in the description of the current API. I think such an addition makes a lot of sense, for the reasons you bring up.

Reflection on internal sections is a different story. I'd be very wary of adding it without solid use cases, because it can mean a lot of API surface, and IME any such reflection is typically mostly abused in practice. Making it easy to look into the internals of modules invites breaking encapsulation, for example. It can also lead to unnecessary confusion, as your very example demonstrates. ;)

@jfbastien
Member Author

> What you want is filling the remaining hole in the reflection of imports and include their type signatures, an extension that is already anticipated in the description of the current API. I think such an addition makes a lot of sense, for the reasons you bring up.

What do you mean concretely?

@rossberg
Member

rossberg commented Apr 21, 2017

That the descriptor objects returned by the imports method include a field holding the type information as described here or here.

@jfbastien
Member Author

> That the descriptor objects returned by the imports method include a field holding the type information as described here or here.

Can you clarify with the precise API you're proposing? Sample API or wording please.

@rossberg
Member

rossberg commented Apr 21, 2017

The records returned by imports would have an additional field; JS.md currently suggests naming it signature, but I'd probably prefer type.

Depending on the kind field, that type field would contain a JS rendering of the respective function type, table type, memory type, or global type. How exactly we render them would be up for design, but some JSON-style representation of their abstract syntax is an obvious possibility.

(IIRC, not getting sidetracked by designing a representation of Wasm types in JS at the time was the only reason why @lukewagner deferred adding this information when he originally proposed the method.)

@rossberg
Member

For example, concretely, we could render the relevant types as follows (using TypeScriptish notation):

type ValueType = "i32" | "i64" | "f32" | "f64"
type FunctionType = {params: ValueType[], results: ValueType[]}
type GlobalType = {value: ValueType, mutable: boolean}
type MemoryType = {limits: Limits}
type TableType = {limits: Limits, element: ElementType}
type ElementType = "anyfunc"
type Limits = {min: number, max?: number}

The imports method would return frozen objects of this form, so that they can be suitably cached and shared.

Another option would be to introduce API classes for each of the above, but I'm not sure there is much benefit in doing so, and it would increase the API surface significantly.
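
As a rough sketch of how such a `type` field would close the gap in the stubbing loop from the original post: the function below takes import descriptors and builds an import object, reading sizes from `type.limits` when present. The field name `type` and the `{limits: {min, max}}` shape are assumptions taken from the notation above, not something engines are guaranteed to return; `stubImports` and `fallbackLimits` are hypothetical names.

```javascript
// Assumption: descriptors may carry `type: {limits: {min, max?}}` as sketched
// in this thread. When the field is missing, fall back to caller-chosen limits.
function sizesFor(descriptor, fallbackLimits) {
  const limits = (descriptor.type && descriptor.type.limits) || fallbackLimits;
  const sizes = { initial: limits.min };
  if (limits.max !== undefined) sizes.maximum = limits.max;
  return sizes;
}

function stubImports(descriptors, fallbackLimits = { min: 1 }) {
  const importObject = {};
  for (const imp of descriptors) {
    const namespace = (importObject[imp.module] = importObject[imp.module] || {});
    switch (imp.kind) {
      case "function":
        namespace[imp.name] = () => {};
        break;
      case "table":
        namespace[imp.name] = new WebAssembly.Table(
          { ...sizesFor(imp, fallbackLimits), element: "anyfunc" });
        break;
      case "memory":
        namespace[imp.name] = new WebAssembly.Memory(sizesFor(imp, fallbackLimits));
        break;
      case "global":
        namespace[imp.name] = 0;
        break;
    }
  }
  return importObject;
}

// Hand-written descriptors standing in for what imports() might return.
const stubs = stubImports([
  { module: "env", name: "mem", kind: "memory", type: { limits: { min: 2, max: 4 } } },
  { module: "env", name: "tbl", kind: "table", type: { limits: { min: 3 }, element: "anyfunc" } },
  { module: "env", name: "log", kind: "function" },
]);
console.log(stubs.env.mem.buffer.byteLength); // 131072 (2 pages of 64 KiB)
console.log(stubs.env.tbl.length);            // 3
```

Because the descriptors are plain data, the same function works on hand-written records today and on engine-provided ones if the field is added.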

@jfbastien
Member Author

@rossberg-chromium

> The records returned by imports would have an additional field; JS.md currently suggests naming it signature, but I'd probably prefer type.
>
> Depending on the kind field, that type field would contain a JS rendering of the respective function type, table type, memory type, or global type. How exactly we render them would be up for design, but some JSON-style representation of their abstract syntax is an obvious possibility.
>
> (IIRC, not getting sidetracked by designing a representation of Wasm types in JS at the time was the only reason why @lukewagner deferred adding this information when he originally proposed the method.)

Yes. This (and the TypeScriptish thing you mention) sound like a great general direction.

> Another option would be to introduce API classes for each of the above, but I'm not sure there is much benefit in doing so, and it would increase the API surface significantly.

Yeah that's why I didn't like the static method approach, it's much nicer to just slap more information in the already-existing object IMO.

> I am confused about your motivation. Your example does not motivate adding reflection on other sections. That would not help your case at all, because the memory section does not provide info about a memory import -- it doesn't even exist in the presence of such an import.

Correct, I offered an example to motivate a subset of what I propose. I don't think this makes other reflection information irrelevant or undesirable. I agree that we want some amount of motivation for features, and I'm hoping @dschuff and @kripken or other tooling folks can chime in.

> Reflection on internal sections is a different story. I'd be very wary of adding it without solid use cases, because it can mean a lot of API surface, and IME any such reflection is typically mostly abused in practice. Making it easy to look into the internals of modules invites breaking encapsulation, for example. It can also lead to unnecessary confusion, as your very example demonstrates. ;)

I'm OK adding things incrementally. There's pain in feature-testing though, so maybe it's better to do a one-shot thing.

I'm not a fan of saying "you can't have feature X because you may misuse it". If the fallback is "implement a WebAssembly disassembler" then we've failed developers, even if what they do is a "misuse".

The only constraints I really care about for reflection are:

  • not hindering future expansions to WebAssembly
  • not exposing features which are expensive to expose

For example, if we give access to memory in a manner that precludes having multiple memories then we've done ourselves a disservice. So IMO we should be careful in expanding reflection information in a forward-compatible manner, but we shouldn't hold back information which is trivial to provide.
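
On the feature-testing pain mentioned above, a minimal sketch of what detection could look like if the information is added incrementally. It assumes the extra field on import descriptors is named `type`, as suggested earlier in the thread; `importsCarryTypeInfo` is a hypothetical helper name.

```javascript
// Compiles a tiny probe module (one memory import, "env"."mem") and checks
// whether its import descriptor exposes a `type` field. The field name is an
// assumption based on the suggestion in this thread.
function importsCarryTypeInfo() {
  const probeModule = new WebAssembly.Module(new Uint8Array([
    0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // header
    0x02, 0x0c, 0x01,                               // import section, one import
    0x03, 0x65, 0x6e, 0x76,                         // "env"
    0x03, 0x6d, 0x65, 0x6d,                         // "mem"
    0x02, 0x00, 0x01,                               // memory, initial = 1 page
  ]));
  const [descriptor] = WebAssembly.Module.imports(probeModule);
  return descriptor !== undefined && "type" in descriptor;
}

console.log(importsCarryTypeInfo() ? "type info available" : "kind only");
```

Each incrementally added piece of information would need its own probe like this, which is the cost being weighed against a one-shot addition.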

@lukewagner
Member

Yeah, agreed that the JSON-style objects would be a lot more pleasant to use than a bunch of new APIs for building types.

> The imports method would return frozen objects of this form, so that they can be suitably cached and shared.

Object identity would still be observable, so probably have to actually mandate (per-global) memoization.

Another thing to consider going forward is reference types. Presumably ValueType (in your above desc) would gain a fifth Object option for representing compound types and for references there would be some {kind:'ref', referent:x} where x would be the description of a struct/array/etc type. And, of course, the fields of a struct type can be refs to struct types so this pretty much forces memoization. (Fortunately, we don't have to figure all this stuff out now to reflect wasm v.1; it'd just be good to have a Plan of Record here for GC types.)
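
A small user-land sketch of why identity matters here and what memoization would look like. `memoizedImports` and `deepFreeze` are hypothetical helpers written for illustration; a real implementation would live inside the engine and be specified per global.

```javascript
// Hypothetical wrapper: one frozen descriptor array per module, cached in a
// WeakMap so the cache does not keep modules alive.
const descriptorCache = new WeakMap();

function deepFreeze(value) {
  if (value !== null && typeof value === "object" && !Object.isFrozen(value)) {
    Object.freeze(value);
    Object.values(value).forEach(deepFreeze);
  }
  return value;
}

function memoizedImports(wasmModule) {
  let cached = descriptorCache.get(wasmModule);
  if (cached === undefined) {
    cached = deepFreeze(WebAssembly.Module.imports(wasmModule));
    descriptorCache.set(wasmModule, cached);
  }
  return cached;
}

// The 8-byte header alone is a valid (empty) module.
const identityProbe = new WebAssembly.Module(
  new Uint8Array([0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00]));

// The built-in method allocates a fresh array on every call.
console.log(WebAssembly.Module.imports(identityProbe) ===
            WebAssembly.Module.imports(identityProbe)); // false
// The memoized wrapper returns the same frozen array each time.
console.log(memoizedImports(identityProbe) === memoizedImports(identityProbe)); // true
```

Freezing alone does not hide whether two calls share an object, since `===` still distinguishes them, which is why the sharing behavior would need to be mandated rather than left to implementations.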

@dschuff
Member

dschuff commented Apr 21, 2017

I think this (especially e.g. function and table information) would definitely be something I'd want for systems tooling (linkers, loaders, interop layers, etc). Machine code has no native way of expressing these kinds of things (e.g. symbols) so we have lots of different container and metadata types. Wasm has enough that we can re-use a lot of the primitives (although obviously you still do need some extras e.g. relocations). This is easy enough for offline tools such as wabt and LLVM but it seems like we'd want it online too (just as ELF headers are mapped into memory on Linux systems).

I also agree that we shouldn't be too prescriptive about exposing this kind of metadata. In particular I do think that "metadata" is a better way to think about it than "reflection"; the use cases are less like how reflection is used in a language and more like how metadata is used in language implementations. I don't see how you'd make useful systems software possible without exposing some functionality you wouldn't want just everyone using.

@rossberg
Member

@lukewagner, agreed on memoization and planning for other types. Fortunately, the AST scheme should be relatively straightforward to extend along the direction you sketched.

@dschuff, agreed, but it seems to me that the examples you list don't require knowledge about a module beyond its imports and exports and their respective types (and that arguably is by design, of course). For example, I'm not sure I understand what notion of relocation information could be meaningfully exposed to the JS environment. Can you elaborate?

@jfbastien
Member Author

Do we have consensus for something along the lines @rossberg-chromium proposed?

@sunfishcode
Member

The proposal above now lives here and is now at phase 3. If there are any questions or comments for that proposal, please file an issue in the proposal repo issue tracker.
