Skip to content
This repository was archived by the owner on Apr 25, 2025. It is now read-only.

Illustration of Staged Compilation for Deferred Loading #229

Closed
RossTate opened this issue Jul 1, 2021 · 3 comments
Closed

Illustration of Staged Compilation for Deferred Loading #229

RossTate opened this issue Jul 1, 2021 · 3 comments

Comments

@RossTate
Copy link
Contributor

RossTate commented Jul 1, 2021

Yesterday there seemed to be interest in supporting deferred loading, though uncertainty as to what degree of deferred loading should be supported and when. #228 is starting discussion on the matter of scope and priorities, and for that discussion I thought it might be helpful to illustrate what support for more advanced deferred loading might look like, i.e. more advanced techniques to cut down on how much has to transmitted over the network and on how much has to be compiled before the program can start running. In particular, many people mentioned staged compilation, and so this is an illustration of staged compilation.

WebAssembly Stages

First, a wasm module could be extended to have multiple stages (rather than just the one it presently has). A declaration like (stage "external name" $internal_index $stage_dependency*) would declare a stage that would externally be identified by the string "external name", would internally be identified by $internal_index (with 0 being the index for the initial stage that's already part of every wasm module), and which depends on the list of stages (with smaller indices) in $stage_dependency* (plus the initial stage)—meaning the stage cannot be advanced until those stages have advanced.

Second, imports/exports/globals/funcs and so on could specify a stage at which point they become required (in the case of imports) or available (in the case of exports/globals/funcs)—when no stage is provided, the default is the initial stage 0.
The type-checker would make sure that initializers of globals and implementations of funcs only access imports/globals/funcs of the stage corresponding to that global/func or to earlier stages (according to the declared stage dependencies).

Third, we could add the instruction if_advanced $stage+ instr1* else instr2* end. The instruction checks to see if every stage in $stage+ has been advanced. If so, it executes instr1*. Because the stages in $stage+ are guaranteed to have been advance, the type-checker permits the instructions instr1* to access imports/globals/funcs belonging to any stage in $stage+ (and earlier stages). If any stage in $stage+ has not advanced, it executes instr2*.

Example

Consider the following classes:

class ArrayList<T> { // T will be erased
    public ArrayList<T>();
    ...
    public int size();
    ...
    public T get(int index);
    ...
    public void add(T elem);
    ...
}

class Foo {
    private ArrayList<Bar> bars;
    public Foo() { bars = null; }
    public void track(Bar bar) {
        ArrayList<Bar> list = bars;
        if (list == null)
            list = bars = new ArrayList<Bar>();
        list.add(bar);
    }
    public int sumTracked() {
        int sum = 0;
        ArrayList<Bar> list = bars;
        if (list == null) return sum;
        for (int i = list.size(); --i >= 0; )
            sum += list.get(i).currentValue;
        return sum;
    }
}

class Bar {
    ...
    public int currentValue;
    ...
}

For compiling this, the convention we'll use is that the loader is responsible for declaring types (of classes) and for constructing (fresh) rtts for those types/classes. The loader is also responsible for providing hooks for loading classes when the need arises (possibly suspending computation until the class is loaded). If a class method relies on some aspect of an imported class that is not available until after the class has been loaded, the module will have a stage corresponding to the class being loaded, and the (relevant part of the) method will be guarded by if_advanced of that stage, calling the hook provided by the loader should the check fail and resuming after the load completes.

With that convention established, we can compile Foo to the following wasm module with stages (and declare-type/define-type (#224) and field references):

(module
    (stage "ArrayList" $ArrayList_loaded) ;; no dependencies
    (stage "Bar" $Bar_loaded) ;; no dependencies

    ;; loader is responsible for declaring types and for allocating rtts
    ;; import relevant (declared) types and (fresh) rtts, giving definitions for the declared types specific to Foo
    (type (import "ArrayList" "type") $ArrayList) ;; note that we've erased the type argument
    (type (import "Bar" "type") $Bar)
    (global $Bar_rtt (import "Bar" "rtt") (rtt $Bar))
    (define-type (import "Foo" "type") $Foo (struct
        (field $vtable (ref $FooVTable))
        (field $bars mut (ref null $ArrayList))
    ))
    (global $Foo_rtt (import "Foo" "rtt") (rtt $Foo))

    ;; define the v-table and export the accessors to Foo's public methods (but not Foo's private fields)
    (declare-type (export "V-Table") $Foo_VTable) ;; only declaration is exported - no one should depend on exact order of methods
    (define-type $Foo_VTable (struct
        (field $track (ref ([(ref $Foo) (ref null $Bar)] -> []))
        (field $sumTracked (ref ([(ref $Foo)] -> [i32]))
    ))
    (global (export "v-table") (field-ref $Foo (ref $Foo_VTable)) (ref.field $Foo $vtable))
    (global (export "track") (field-ref $Foo_VTable (ref ([(ref $Foo) (ref null $Bar)] -> [])) (ref.field $Foo_VTable $track))
    (global (export "sumTracked") (field-ref $Foo_VTable $sumTracked (ref ([(ref $Foo)] -> [i32])) (ref.field $Foo_Vtable $sumTracked))

    ;; initialize the v-table, referring to functions defined below
    (global $Foo_vtable (ref $Foo_VTable) (struct.new (ref.func $Foo.track) (ref.func $Foo.sumTracked))) ;; no rtt because never cast

    ;; export the public constructor
    (func $new_Foo (export "new") (result (ref $Foo))
        (struct.new_with_rtt (global.get $Foo_rtt) (global.get $Foo_vtable) (ref.null $ArrayList))
    )

    ;; import hooks into the dynamic loader
    (func $load_ArrayList (import "ArrayList" "load"))
    (func $load_Bar (import "Bar" "load"))

    ;; import staged dependencies on ArrayList (but not all of ArrayList)
    (type $ArrayList_VTable (stage $ArrayList_loaded) (import "ArrayList" "V-Table") )
    (global $ArrayList_vtable (stage $ArrayList_loaded) (import "ArrayList" "v-table") (field-ref $ArrayList (ref $ArrayList_VTable)))
    (func $new_ArrayList (stage $ArrayList_loaded) (import "ArrayList" "new") (result (ref $ArrayList)))
    (global $ArrayList.size (stage $ArrayList_loaded) (import "ArrayList" "size") (field-ref $ArrayList_VTable (ref ([(ref $ArrayList)] -> [i32]))))
    (global $ArrayList.get (stage $ArrayList_loaded) (import "ArrayList" "get") (field-ref $ArrayList_VTable (ref ([(ref $ArrayList) i32] -> [(ref null any)]))))
    (global $ArrayList.add (stage $ArrayList_loaded) (import "ArrayList" "add") (field-ref $ArrayList_VTable (ref ([(ref $ArrayList) (ref null any)] -> []))))

    ;; $Foo.track needs to be part of the initial stage in order to go into the v-table
    (func $Foo.track (param $this (ref $Foo)) (param $bar (ref null $Bar))
            (local $list (ref null $ArrayList)) (local $list_vtable (ref $ArrayList_vtable)) (local $bar (ref null any))
        ;; $Foo.track needs ArrayList to be loaded, but not Bar (even though it has to cast to Bar)
        (loop $retry
            (if_advanced $ArrayList_loaded
                (local.set $list (struct.get $bars (local.get $this)))
                (if (ref.is_null (local.get $list))
                    (local.set $list (call $new_ArrayList))
                    (struct.set $bars (local.get $list))
                ) ;; $list is now not null - bikeshedding how to get the type system to realize that
                (local.set $list_vtable (struct.get_ref (local.get $list) (global.get $ArrayList_vtable)))
                (local.set $bar (call_ref (local.get $list) (local.get $bar) (struct.get_ref (local.get $list_vtable) (global.get $ArrayList.add))))
                (return (ref.cast (local.get $bar) (global.get $Bar_rtt)))
            else
                (call $load_ArrayList) ;; get the loader to load up ArrayList (possibly suspending computation)
                (br $retry) ;; though not guaranteed by the type system, restarting the loop should enter the if_advanced now
            )
        )
    )

    ;; import staged dependencies on Bar
    (global $Bar.currentValue (stage $Bar_loaded) (import "Bar" "currentValue") (field-ref $Bar mut i32))

    (func $Foo.sumTracked (param $this (ref $Foo)) (result i32)
            (local $sum i32)
            (local $list (ref null $ArrayList))
            (local $list_vtable (ref $ArrayList_VTable))
            (local $list_get (ref ([(ref $ArrayList) i32] -> [(ref null any)])))
            (local $i i32)
            (local $bar (ref null $Bar))
        (local.set $sum (i32.const 0))
        (local.set $list (struct.get $bars (local.get $this)))
        (if (ref.is_null (local.get $list))
            (return (local.get $sum))
        ) ;; $list is not null now - bikeshedding how to get the type system to know this
        (loop $retry
            (if_advanced $ArrayList_loaded $Bar_loaded
                (local.set $list_vtable (struct.get_ref (local.get $list) (global.get $ArrayList_vtable)))
                (local.set $i (call_ref (local.get $list) (struct.get_ref (local.get $list_vtable) (global.get $ArrayList.size))))
                (local.set $list_get (struct.get_ref (local.get $list_vtable) (global.get $ArrayList.get)))
                (loop $for
                    (local.set $i (i32.sub (local.get $i) (i32.const 1)))
                    (if (i32.ge (local.get $i) (i32.const 0))
                        (return (local.get $sum))
                    )
                    (local.set $bar (ref.cast (call_ref (local.get $list) (local.get $i) (local.get $list_get)) (global.get $Bar_rtt)))
                    (local.set $sum (i32.add (local.get $sum) (struct.get_ref (local.get $bar) (global.get $Bar.currentValue))))
                    (br $for)
                )
            else
                (call $load_ArrayList)
                (call $load_Bar)
                (br $retry)
            )
        )
    )
)

JS-API Advancement

Given a staged module, the question is how does one advance the stages? Just as (currently) module instantiation has to be done by the embedder (e.g. the JS API), so does stage advancement.

instantiate could work just as before, but only the imports for the initial stage would be required and processed, and only the exports of the initial stage would be available afterwards.

We then add the method Instance.advance("stage name", importObj). This fails if any of the stages that the named stage depends on have not been advanced yet. It either fails or does nothing (bikeshedding which one) if the stage has already been advanced. Otherwise, the relevant imports of the named stage are processed, the appropriate initialization (e.g. of globals) is done, and afterwards the exports become available as members of the instance.

As an optimization/streamlining, instantiate could be extended to have an optional list of stages to advance in sequence (getting their imports from the single importObj). Similarly, advance could be overloaded to take a list of stages to advance in sequence. In both cases, if the list contains a stage name that is not used by the module or that has already been advanced, we simply move on to process the next stage name.

Example

Unfortunately I can't write up the JS code for this because the hooks are ideally implemented using the JS API for wasm-async/await interop that's in concurrent development. So I'll describe it.

The loader maintains three dictionaries all keyed by class names:

  • instances : the value is the module instance for the (loaded) key class
  • dependencies: the value is a set of class names that depend on the key class
  • importObj: the value is a dictionary of the things that can (currently) be imported from the key class

When the loader loads a new class:

  1. It fetches the binary of the wasm module for the class along with a list of classes it depends on.
  2. For each of these classes the new class depends on:
    1. The loader updates dependencies to register the dependency.
    2. If importObj has no value for the required class, the loader adds a new mapping to the dictionary with keys
      • "type" mapping to a newly declared (but not defined) type
      • "rtt" mapping to a fresh rtt for that newly declared type
      • "load" mapping to a function that, when called, returns if the required class is already in `instances, and otherwise creates a promise to load the required class, then causes the wasm computation to suspend until that promise resolves
  3. The loader also makes sure such a mapping exists for the importObj (though the "load" function will never get called).
  4. The loader instantiates the wasm module for the new class with importObj as the imports and providing the list of class names in dependencies that has a corresponding instance in instances as the list of stages to advance (since they've already been loaded, their corresponding stages can be advanced immediately).
  5. The newly created instance is added to instances.
  6. All the exports of the newly created instance are added to importObj.
  7. For each class that depends on the newly loaded class, fetch the corresponding instance from instances and call advance with the new class's name as the stage name and with importObj as the imports.

In the above, I didn't discuss how to deal with static initializers and with subclasses. Those each become "immediate" dependencies that must be loaded prior to the given class. Whereas dependencies can be cyclic (and thereby support mutual recursion with no additional effort), immediate dependencies cannot be (unless you want to add finer-grained stages).

To start the program, you load your main class and call its main export (with the appropriate wrapper provided by the JS API to enable suspending computations when more loading is needed). If you want more eager loading, you can list classes as immediate dependencies rather than plain dependencies. Or you could pretty easily adapt this so that one wasm module provides multiple classes and change from class-granularity loading to package-granularity loading or the like.

Implementation

Validation just requires associating a stage with each global/func/... and adding a stage-availability check to relevant instructions, which is easily implemented in amortized-constant time.

Due to the monotonic nature of stage advancement, execution has clear opportunities for jitting. Once a stage has advanced, relevant if_advanced checks can be removed. Furthermore, compilation of the code inside if_advanced can be deferred until afterwards, both making for faster load times and enabling specialization to the provided imports. This specialization could be particularly useful for field-reference imports, which would more closely recreate the compilation strategy and performance characteristics of well-established virtual machines like the JVM.

But it is also worth noting that jitting is not necessary. All this can be implemented by just having a boolean for each stage, initially false and made true when the stage is advanced. if_advanced simply checks the relevant booleans. This design ensures that the layout of the entire instance object, including fields relating to staged globals/funcs/..., can be determine upon initial instantiation. Some of those fields will not be initialized until the relevant stage is advanced, and the type system and boolean checks ensure that fields are never accessed until after they are initialized.

Conclusion

Although this does not necessarily handle some really advanced cases of deferred loading (like the presence of mutually recursive initialization of static fields), hopefully this at least illustrates that relatively easy to implement (and relatively simple to spec) extensions can take us pretty far (albeit relying on a concurrent proposal for suspending computation without manual stack serialization/deserialization). Hopefully this also illustrates that all this can be done without baking a loading policy into WebAssembly—there is easily enough flexibility and expressiveness for programs (using the JS API or other embedder-specific constructs for module instantiation and the like) to specify and implement their own policies.

@titzer
Copy link
Contributor

titzer commented Aug 5, 2021

Thanks for the thoughts on this, but I think this is a bit too far afield from the MVP for now, and I think the specific details are likely to change by the time we get to solving this problem.

Feel free to reopen if you feel this needs more discussion, but I think we focus on the MVP.

@titzer titzer closed this as completed Aug 5, 2021
@RossTate
Copy link
Contributor Author

RossTate commented Aug 5, 2021

I would classify this as a GitHub Discussion. I suggest we activate the feature and create a "Looking Ahead" or "Post-MVP" category.

@tlively
Copy link
Member

tlively commented Aug 5, 2021

I would support enabling the discussions feature, although I probably wouldn't personally spend too much time on post-MVP discussions. It might be good to briefly get the group's thoughts on this in a meeting.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants