Field Accessors #238
Comments
Thanks for the write-up! It should be fairly simple to measure the code size benefit of switching from two indices to one index on accessor instructions. Since the precise semantics of this change would depend on what core type system we settle on, and also because we have agreed not to consider field privacy in determining that core type system, I suggest that we consider this change after we have settled the type system question.
Um, none of the short-term motivations mention privacy, and I mentioned that this would probably be workable for structural types. Also, I thought the plan everyone just agreed to was not to settle the type-system question for quite a while. So now I'm confused; I thought I had already taken all of yesterday's discussion into account in my post.
Afaict, this is just an additional (derived) index space for fields. This would actually be useful for an interpreter to have only one LEB index in
Sorry @RossTate, I stated my reasoning imprecisely and too strongly. I agree that this is orthogonal to the type system question and I agree this is worth considering for the MVP. I would like to consider this after settling the type system question only because choosing a type system is the most important design problem right now and does not depend on this immediate choice, nor on the longer-term consequences of this choice. If anyone wants to investigate this in parallel to ongoing type system work, that is fine by me.
Thanks for clarifying, @tlively! I see your reasoning now. My concern is that my understanding of yesterday is that we were not going to be choosing a nominal vs. structural type system for a while, so I'm worried about deferring improvements until after that choice is made (unless the plan is "officially" changed). So I figured that we could incorporate this into the "official" nominal system while we're creating that anew anyways. This, along with the accompanying extension, were the two biggest core deltas that came to mind (beyond the switch to nominal), which is why I mentioned them now. @titzer's summary seems accurate.

It occurs to me there is one more observation to make regarding the accompanying extension. If we were to only allow "extending" types to add more fields—not allowing them to redeclare/refine fields—then every field is defined exactly once. This would make the field-accessor list essentially just the field list (plus imported fields). A nice thing about this is that it means a debugger could use the dynamic type of the object to uniquely determine the names of its fields (specified in the names section of the unique module defining the field).

While, upon reflection, the other pros and cons of field accessors seem to apply equally well to structural and nominal types, this pro seems specific to nominal types, where we have an obvious "owning" module of the field definition whose name section we should clearly use. (Note that I wouldn't consider that a significant enough reason to favor nominal over structural; I'm just trying to be transparent/complete.)
FWIW, removing redundant type index annotations is orthogonal. I believe I mentioned before that I'd like to remove them on GC instructions, per discussions like WebAssembly/function-references#27 and consistent with what already happened in WebAssembly/reference-types#99 and WebAssembly/function-references#31. I have delayed making this change for GC instructions, though, since V8 folks expressed their desire to not make breaking changes for the time being. @jakobkummerow, is that still the current status?
@rossberg: I think right now is a particularly good time for making such breaking changes, as everyone is getting ready to implement new things anyway (nominal and iso-recursive+explicit-inheritance types).
@RossTate: It would be good to measure this hypothesis on production-scale examples. Since we use LEB encoding, two smaller numbers don't necessarily (though certainly sometimes) take more space in the binary than a single bigger number.
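The LEB-size point above can be checked concretely. Below is a small Python sketch (the helper name `uleb128_len` is mine, not from any spec or tool) that counts how many bytes an unsigned LEB128 encoding needs, showing that two small indices can cost no more than one large combined index:

```python
def uleb128_len(n: int) -> int:
    """Number of bytes needed to encode n as unsigned LEB128 (7 payload bits per byte)."""
    count = 1
    while n >= 0x80:
        n >>= 7
        count += 1
    return count

# A (type index, field index) pair of small numbers costs 2 bytes...
assert uleb128_len(3) + uleb128_len(2) == 2
# ...and so does a single combined field-accessor index once it exceeds 127,
# so the single-index encoding is not automatically smaller:
assert uleb128_len(200) == 2
# Only while the combined index stays below 128 is it a guaranteed 1-byte win:
assert uleb128_len(100) == 1
```

This matches the caveat in the comment: whether the single-index scheme saves space depends on how large the module's combined field-accessor index space grows.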
Ah, good point. Is there a public collection of production-scale binaries to examine?

Of course, there are motivations besides smaller size that are worth discussing in parallel. I remember @jakobkummerow and @tlively were expressing interest in names for fields for debugging, so I'm interested to hear your thoughts on whether this helps with that (especially with respect to the ability to report the name based on the dynamic type of the reference).

As another topic, it sounds like @titzer agrees this would likely be helpful for interpreters; I'm curious whether @jakobkummerow or other engine implementers think it might help with faster compilation. I figured engines could have an array that simply maps a field-accessor index to its corresponding offset within the object, and suspected that would be simpler than the current type-dependent mapping-to-offset calculation. Then again, maybe this is small in the grand scale of things.

There are also the long-term considerations, but my understanding is that we're trying to focus discussions on more immediate concerns. One question I have is: has anyone thought of a reason this would be substantially worse than the current approach? It would be good to know of potential issues sooner rather than later.
The issue with interpreters is that if the index to

This is why I posted on #27. Removing information from bytecodes that can only be derived through validation complicates interpreter designs. A single field index that is in the index space of all fields declared (or perhaps imported, but that's a post-MVP discussion) in a module is a better choice for an interpreter design, because it'd then be easy to have a single array of field metadata hanging off the module representation.
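The "single array of field metadata hanging off the module" idea can be sketched in a few lines of Python. This is a hypothetical model, not engine code; the names (`FieldMeta`, `interp_struct_get`) and the metadata shape are illustrative:

```python
from dataclasses import dataclass

@dataclass
class FieldMeta:
    offset: int      # byte offset within the object
    value_type: str  # e.g. "i32", "i64"
    mutable: bool

# One flat array per module, indexed directly by the field-accessor index
# carried in the instruction: one LEB read, then one array lookup.
field_meta = [
    FieldMeta(offset=0, value_type="i32", mutable=True),   # accessor 0
    FieldMeta(offset=8, value_type="i64", mutable=False),  # accessor 1
]

def interp_struct_get(accessor_index: int) -> FieldMeta:
    # No type-dependent computation and no validation-derived state:
    # the instruction's single immediate resolves straight to the metadata.
    return field_meta[accessor_index]

assert interp_struct_get(1).offset == 8
```

The contrast with a type-plus-index encoding is that the latter forces the interpreter to first resolve the type, then consult that type's field table, information that is otherwise only derivable through validation.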
I did an experiment on the Dart barista3 benchmark, where I tried out three different encodings for
The results are as follows:
The new encoding does give a roughly 2% size reduction for the uncompressed module. But when the module is compressed, the picture is reversed: the new encoding is bigger than the current one, presumably due to the irregularity of the combined encoding compared to the separate one. With a modern compression format, the difference is even more pronounced, with even the "Added fields" version being about 0.5% bigger than the current encoding.
Wow, thanks @askeksa-google for performing the experiment! That's interesting how compression cancels out the savings. I had hoped the numbers would give a clear win one way or the other, as we've found numbers more quickly bring us to group consensus, but it seems it's about a wash here. Given that we're now building something for nominal types, I'd still prefer for us to build on this over the type+index approach for the various other advantages that have been discussed above (though I'd still take that over the index-only approach).

I'm going to throw one more long-term advantage into the mix as well. We know that a number of languages will benefit from support for nested structures. There's also the special case of #222. The Post-MVP outlines support for nested structures. But this design relies on using engine-supported interior references and a sequence of instructions to actually get at the contents of these nested structures. While interior references are useful, they're also quite challenging, and applications can implement (wide) interior references with field members anyways (though this has caveats with multithreading), so I'm concerned about nested structures depending on engine-supported interior references. With field accessors, when you specify a structure containing a nested structure, you get a field accessor that, when given access to the outer struct, gives you access to the inner struct. We can then provide a way to declare a field accessor that is the composition of two field accessors, so that if one goes from Outer to Inner and the other goes from Inner to

Now, besides supporting languages like C#/Go, there's another benefit to supporting nested structures and field accessors. For (presumably) the Post-MVP, I've been investigating how to support separate compilation for Java-like languages without essentially baking direct support for it into the engine (or relying on dynamically generating wasm modules). I've found that the "a subclass's definition specifies all the fields of its superclass" approach doesn't work very well. On the other hand, if we have "C extends B extends A", then by modeling an object as a sequence of nested structures, "(A's new fields) followed by (B's new fields) followed by (C's new fields)", we can define C's structure without depending on the full details of A's or B's structures. The module for C can just import the (fixed-size) structures for A's and B's (new) fields, along with the field accessors of A's and B's (public) fields (punting on some details I've worked out regarding how to make C not have to know which of A or B declares the field). Looking beyond linear inheritance, this approach also extends well to multiple inheritance, traits, mixins, and the like.

So, if all's roughly even numbers-wise, I'd prefer this approach for the new nominal system because, in addition to (I suspect) easier compilation/interpretation and better debugging support, my investigations so far suggest that it also lays better foundations for various Post-MVP extensions.
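The per-class-block layout for "C extends B extends A" can be sketched in Python. This is a hypothetical model (the `layout` helper, class names, and field sizes are all illustrative): each class contributes a fixed-size block of its new fields, so a later class only needs the sizes of earlier blocks, not their contents:

```python
def layout(new_field_sizes_by_class):
    """Map each (class, local field index) to an absolute byte offset.
    Input is a superclass-first list of (class name, [field sizes])."""
    offsets, base = {}, 0
    for cls, sizes in new_field_sizes_by_class:
        for i, size in enumerate(sizes):
            offsets[(cls, i)] = base
            base += size
    return offsets, base

# A declares two 4-byte fields, B adds one 8-byte field, C adds one 4-byte field.
offs, total = layout([("A", [4, 4]), ("B", [8]), ("C", [4])])
assert offs[("B", 0)] == 8    # B's block starts right after A's 8 bytes
assert offs[("C", 0)] == 16   # C's block starts after A's and B's blocks
assert total == 20
```

The point of the sketch is that C's module never inspects A's or B's field lists; it only composes their block sizes, which is what makes separate compilation of the subclass possible.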
For completeness, the index-only version added to the table (first three rows are the same as above):
Saves 3.5% uncompressed and 5% compressed.
Ah, that is compelling. For completeness, I believe there is a way to solve the multiple-bounds problem for the index-only approach. What you have to do is—rather than a bunch of separate imports of types—import an entire hierarchy, where as part of that import you specify a DAG/tree of (imported) types and subtypings that must be in the hierarchy. The caveat is that, when checking that an exported hierarchy+types matches an imported hierarchy+types, in addition to checking the given subtypings one must also check that joins and meets are preserved as well (which is trivial to do for DAGs/trees). Doing so essentially enables the solution that iTalX used for this problem.

Of course, it's worth noting that this does not address other problems with index-only (such as for interpreters) nor solve other problems that field accessors manage to solve (such as separate compilation and nested structs), but it is an option that apparently takes less binary space, and design is about considering tradeoffs.
I had a chance to consider the hierarchy approach in more depth and was able to make it much simpler than what I described above. That said, I can't figure out how to make the index-only approach work well with separate compilation. Separate compilation for many object-oriented languages fundamentally relies on being able to build the layout of an object by composing portions of the object that are specified across multiple modules. The "index field from start of object" approach just doesn't accommodate that.

But I think I found a way to synthesize some of the variations above. When you define a new type, you specify just its new fields (relative to the type it extends, if any). To get/set a field, you specify the type and the index of its new field. So, if

Does that make sense?
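The "type plus index of its new fields" scheme can be modeled in a few lines of Python. This is a hypothetical sketch (the class names, supertype table, and counts are illustrative, not part of the proposal): an accessor immediate names a type and an index into just that type's newly declared fields, and the absolute field slot is recovered from the supertype chain:

```python
supertype = {"C": "B", "B": "A", "A": None}   # C extends B extends A
new_field_count = {"A": 2, "B": 1, "C": 1}    # fields each type newly declares

def absolute_slot(typ: str, new_index: int) -> int:
    """Absolute slot = all inherited fields, then this type's new_index-th field."""
    inherited, t = 0, supertype[typ]
    while t is not None:
        inherited += new_field_count[t]
        t = supertype[t]
    return inherited + new_index

assert absolute_slot("A", 1) == 1  # A's own second field
assert absolute_slot("B", 0) == 2  # comes after A's two fields
assert absolute_slot("C", 0) == 3  # comes after A's and B's fields
```

Note that a compiler with the full supertype chain in hand can fold this walk away to a constant offset; only when a supertype's layout is unknown at compile time does some runtime indirection remain, which is the tradeoff the following comments discuss.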
There is indeed a fundamental incompatibility between fast(est possible) field accesses and separate compilation (or more specifically: unknown supertypes), and we'll have to decide eventually which compromise we want to pick. The fastest machine code a field load like
When generating this code, the compiler must obviously know the full layout of
That'd be 3 (chained!) loads instead of 1. I'm not sure this is exactly the code we'd end up with; it's meant to be a representative illustration. I doubt it could be more efficient than this. I also doubt that we'd be happy with a design where every field access has to pay this cost (but that's just my gut feeling).
I agree that we don't want 3 loads for a field access (especially chained ones!). But I don't believe what I'm suggesting here makes that necessary. Let me break it down into cases for

First, if you know the full supertype chain for

So the real consideration is whether the flexibility granted by this design can be implemented efficiently enough that people would want to use it. Remember that the alternatives are not great. Either you compile the subclass according to a fixed representation of its superclass, or you JIT your own WebAssembly module at load time. And in either case, the module implementing the superclass has to export all the internals of the superclass module just so that you can use its indexing scheme and the linker can check that you used it correctly. (Note that both of these are fine for the MVP, as we might not even support separate compilation at all for the MVP, but I'd like an approach that at least is compatible with a good Post-MVP option for separate compilation.) So people might be fine with some pay-as-you-go performance hit for the significant improvement in flexibility—the key word being some (we already have pay-as-you-go per the observation above).

Now let's consider the worst-case scenario. Unfortunately, I'm not sure how you arrived at your first "load type info" instruction, so I can't speak to that. But here's what I would expect. When you import an (extensible/fixed-size) type like

The first thing to consider is that the first load is from the module instance. Presumably this is in the cache (even the L1 cache?). As such, this first load is likely to be (very) fast.

The second thing to consider is that the first load is of immutable data. Thus it can be amortized across loads and writes of various fields of extensions of

Lastly, I'm still not convinced that WebAssembly can efficiently support even basic ecosystems without some form of instance-specific compilation, at least in the optimizing tier, whether it's inlining, "fusion" (a la Interface Types), or—in this case—filling in immediates. But even putting that aside, by the above reasoning I suspect people would find the pay-as-you-go costs to be pretty minor and well worth the flexibility granted, especially with respect to the alternatives. (And, to repeat, it might be that we decide the MVP does not support any separate compilation whatsoever; this design just leaves a path forward for separate compilation in the Post-MVP, with no performance cost to MVP programs, and with other advantages specifically for the MVP discussed above.)
I see two implementation options that both lead to three chained loads. Hardware caching and/or compiler-side de-duping can certainly help; whether it'll help enough to make this viable/desirable will likely have to be determined experimentally at some point. I didn't mean to imply that the Field Accessors idea in itself requires any of that; I was commenting specifically on separate compilation, with or without Field Accessors.
Yes, I'm not convinced of that either; giving up on separate/ahead-of-instantiation-time optimized compilation is one of the possible compromises we could choose.
Ah, thanks for the clarifications. But now I have a question about how things are currently implemented. I had thought that WebAssembly modules were designed so that things like globals and imports could be stuck directly in the instance rather than indirectly through arrays (in which case supertype sizes would be in the instance). Is that not the case then? Do gets of globals and calls to imported functions require 2 chained loads in the worst case?
The instance object (like the module object) has a fixed size, so everything whose length is specified by the module is an array: globals, functions, tables, types, you name it. At least we can stick the pointers to those arrays directly into the instance object :-)

Direct-calling an imported function takes 4 loads (two pairs of two chained loads, both best and worst case). Indirect-calling takes 7 (longest chain 3) or 9 (longest chain 4) loads depending on whether the table index is
Oh wow, that's grim. Thanks for the improved understanding (and for clarifying load-count vs. chain-count).
Keep in mind that globals can be imported/exported, so their storage needs to be indirected from instances. I don't understand why
9 loads worst case, wow! The cost of even just a small bit of abstraction/indirection is intense! I presume part of this is because we must accommodate an imported function being changed at any moment? What if we had a way to "seal" a Wasm module after instantiation?
@titzer Pseudo-code for
The conditional

A lot of this complexity is due to the fact that V8 is both a Wasm and a JS engine (and a funcref can come from either world); in a pure-Wasm engine this could be a lot simpler.
@aardappel Imported functions can't be changed, but tables can.
@jakobkummerow Ah, yes, I hadn't considered JS functions flowing into a wasm call ref and then presumably escaping again via
The second one implies doing work for every reference (not statically known not to be potentially a JSFunction) that crosses between JS and Wasm, either at a function call, or when being written to or read from a table. Thanks for the info, though I did drag this a bit off topic.
re 1: Yes, putting another pointer (or two) into JSFunction could save a load (or maybe two?), but given all the efforts we've put into making JSFunctions as small as possible (because there are typically many of them), I'm hesitant to make them bigger.
My assumption for function references always was approach (2), i.e., coerce at the boundary, like you do for imports. With that,
Is anyone interested in investigating this option further? If not, we can close this issue.
I don't think object models will work exactly along the lines in this thread, but I can see a use case for importing field accessors. That said, there are several issues in this thread and I think the
Now that we're in phase 3, the design is largely stable, so we won't be adding field accessors to the MVP. If it would make sense as a post-MVP feature, PRs adding it to the post-MVP doc would be very welcome.
This is a suggestion for a different way to design the instructions for getting/setting fields, which for lack of a better term I'll refer to as "field accessors".
Design
After processing the types section of the module, we'll get two lists: the list of types, and the list of field accessors. When processing a (nominal) type in the types section, for every field of the type a corresponding field accessor is automatically added to the field-accessor list. The type of the field accessor depends on the qualities of the field, e.g. what is its value type, and is the field writable? The "receiver" type of the field accessor is whatever (nominal) type it was generated from.
So, if one defines a nominal type, say `$Foo`, with a mutable `i32` field followed by an immutable `i64` field, then there will also be two field accessors, say `$x` and `$y`, where the type of `$x` is "receiver type is `$Foo`, value type is `i32`, and mutability is `mutable`" and the type of `$y` is "receiver type is `$Foo`, value type is `i64`, and mutability is `immutable`". The text format could provide room for declaring the field-accessor names `$x` and `$y`, which get bound to their corresponding indices in the generated field-accessor list.

Then the instructions for getting and setting a field simply include a field-accessor index, rather than both a type and an index. The type of the corresponding field accessor determines the receiver type that must be on the stack and the value type that either is returned (for getting) or must be on the stack (for setting). For setting, the field accessor must also be mutable.
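As a rough illustration of the design, here is a hypothetical Python sketch (not an implementation of any engine or tool) of deriving the field-accessor list while processing a types section; the `$Foo` example above appears as the first entry, and `$Bar` is an extra made-up type:

```python
types = [
    # (type name, [(value type, mutable)])
    ("$Foo", [("i32", True), ("i64", False)]),
    ("$Bar", [("f64", True)]),
]

type_list, accessors = [], []
for name, fields in types:
    type_list.append(name)
    for value_type, mutable in fields:
        # Each field contributes one accessor whose "receiver" is its type.
        accessors.append({"receiver": name, "value_type": value_type,
                          "mutable": mutable})

# struct.get/struct.set would then carry a single index into `accessors`:
assert accessors[0]["receiver"] == "$Foo"   # $x
assert accessors[1]["mutable"] is False     # $y is immutable
assert accessors[2]["receiver"] == "$Bar"   # $Bar's one field
```

The accessor list is thus entirely derived, which is why a text format only needs a place to attach names like `$x` and `$y` to the generated indices.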
To clarify, this is not the same as the field members described in the Post-MVP. Field members in the Post-MVP are essentially the first-class variant on field accessors.
Motivation
Short-term reasons to use field accessors:

- Getting/setting a field carries a single (LEB-encoded) immediate instead of a type index plus a field index, which can shrink the binary.
- An interpreter or baseline compiler can resolve an access through a single per-module array of field metadata, rather than a type-dependent calculation that relies on information only derivable through validation.
- Field names for debugging: a field accessor has an unambiguous defining module whose name section can name it.
Longer-term reasons to use field accessors:

- They compose naturally with nested structures, giving access to an inner struct without engine-supported interior references.
- They facilitate separate compilation of subclasses, since a module can import a supertype's accessors without knowing its full layout.
Accompanying Extension
Field accessors would also facilitate another extension or variation. When one defines a nominal type that extends another nominal type, we could avoid repeating the fields of the other nominal type and simply state the new fields of the new type. In the short term, this would reduce the size of the types section. In the long term, this would facilitate separate compilation of subclasses (particularly in the presence of private fields of the superclass).
Whether we actually want this accompanying extension should be discussed later and somewhere else; I just mention it here as another motivation for field accessors.
Transfer
Although I came up with this for nominal types, the idea likely transfers to structural types.