Skip to content

Add support for global variables #8

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
noxabellus opened this issue Aug 20, 2024 · 8 comments · Fixed by #9
Closed

Add support for global variables #8

noxabellus opened this issue Aug 20, 2024 · 8 comments · Fixed by #9

Comments

@noxabellus
Copy link
Contributor

This would be really handy to have. A similar interface as exportFn could be used, if necessary. I'm fairly new to WASM, but it looks like the exported variables as is are pointers into the memory buffer. So the generated glue code could use type info to build a property accessor or something? If you could provide some guidance on this, I'd be willing to give a go at implementing it.

@scottredig
Copy link
Owner

Hey, I'm trying to understand the request here. Can you please provide an example of what usage would look like?

@noxabellus
Copy link
Contributor Author

Sure. In the zig code, writing something like

pub export var foo: bool = false;

Gets you a WebAssembly.Global object in the instance.exports; But accessing its value gives you something like 1058904 rather than a boolean value, which I assume is a pointer to the actual bool inside the module memory.

Currently, I'm doing something like this, after instantiation the wasm module and passing it to the zjb generated glue.

for (const key in zjb.exports) {
	const exp = zjb.exports[key];
	// check here is unnecessary for now since zjb.exports are always functions
	if (exp instanceof Function) Game.prototype[key] = exp;
	// but future code might add something like:
	else Object.defineProperty(game, key, { get: () => zjb.exports[key], set: v => zjb.exports[key] = v });
}

Sticking the functions into the prototype like that gives me easy access to them in my javascript-side code, ie functions like onRender etc. It would be nice to be able to access globals this way as well, placing get/set property descriptors for them into the instance of the Game object I'm affecting the prototype of here.

The glue code for variables then would need to do a similar thing as where you create array views, creating a view of the memory and then loading/storing values into it. This could be done either in generated get_global_foo and set_global_foo functions (though that would complicate my second-layer glue code shown above) or as property descriptors (mirroring what I showed).

@scottredig
Copy link
Owner

Ah, yeah, I see the vision here. Here's what the steps to implement this look like:

First verify the export behavior. Specifically, try to write some javascript code (probably just play around with the console in dev tools) which gets the value from the exports, and then uses a DataView of the WASM memory to pull out or set the value. Note that wasm is always little endian, so basically of the methods on dataview require a true as their last argument. What you're saying about how you think it works makes sense, but I haven't worked with those types of exports, so worth checking.

From that point, there are broadly two parts which need to be done to work together: The zjb library code, and then code that writes the javascript.

Between the two layers, all of the information is encoded in the import/export name. You'll need to come up with a prefix for the exported variables (functions use zjb_fn, so probably zjb_g?) and a format for specifying the type (the one used for functions won't work, since it doesn't care about which types number are), probably just use the zig name (eg, u8).

For the zjb library side:
There's a wrinkle here I think. Ideally it would have an api like exportFn. So, take the variable, and it exports it based on the type. Unfortunately, I'm not sure how @export taking a declaration of a global works. I assume it means that you can't just pass a variable to the function, because that wouldn't be the correct declaration to export. Without trying, I think you can use @field to get around this limitation. Eg,

pub fn exportGlobal(parent: type, comptime name: []const u8) void {
    const export_name = ...;
    @export(@field(parent, name), .{ .name = export_name, });
}

then usage would look like:

var my_var: u8 = 0;
comptime {
    zjb.exportGlobal(@Self(), "my_var");
}

If none of that works, just make the api return a name based on the desired name and returns a std.builtin.ExportOptions, so the user can call @export in a comptime block themselves. This isn't preferred if it can be helped, though.

On the codegen side:

  • The code pulls out of the wasm file only the things it needs. This is the relevant bit of the WASM spec: https://webassembly.github.io/spec/core/binary/modules.html#binary-exportsec
  • And here is the code reading the wasm file that sees the exported functions: https://github.com/scottredig/zig-javascript-bridge/blob/main/src/generate_js.zig#L72 The 0 is the id from the wasm spec for functions, globals use a different value. Use the chosen prefix and build a new arraylist of those exports. Since we're not interacting with the wasm module from javascript, but instead the javascript wrapper of the wasm module, the name should be all the information you need about the export.
  • From there you'll need to generate the javascript. It should go about here. Your property descriptor example looks good, so after the object of exports is defined, loop through the global exports, decode the name, and add a property for that name. The getter and setters should use the relevant dataview methods based on the exported type.
  • One tricky bit is that every access (read and write) will need to validate the DataView. If wasm memory allocation happens, the view gets invalidated and the length becomes zero. So there should be a dataview that is a member of the overall zjb object (we need it for string import support anyways), and add a method which recreates the dataview if the memory has grown.

Add some usage to the example, so we can test it out.

Thanks for taking an interest in this, good luck, and let me know if you have any questions or hit anything tricky.

@noxabellus
Copy link
Contributor Author

noxabellus commented Aug 21, 2024

Okay so I've dug in a bit here and run into a couple of issues.

  1. Sadly, @field doesn't work. A workaround I found is to pass the address of the variable to exportGlobal. This requires an extra load in the glue's getter and setter, to chase down the indirection. Let me know what you think of that, seems acceptable to me.
  2. I'm not sure how to validate the DataView if I keep it around. How can I tell if memory has grown? For now I'm just recreating it every time, though I feel that isn't really an acceptable solution.

Edit: Ack, I just re-read your post and saw you said "the length becomes zero." I suppose thats my answer for 2.

@scottredig
Copy link
Owner

It's pretty silly that @export(@This().foo, .{...}) works, but @export(@field(@This(), "foo"), .{...}) doesn't.

The double pointer solution is a bit hacky, and less than ideal. One of the design constraints of this project is that things shouldn't be less efficient that writing the wasm glue code yourself. The only violations of this are global() and constString(), which have a one time increase in cost but after that are just as efficient. They're also very convenient, which lets them push that boundary a little bit.

However, I've done some digging and found ziglang/zig#14911, which is accepted. So this restriction is temporary, and the API should be what we want it to be after this restriction is lifted. Additionally, you should be able to follow the first pointer in the initialization code, so after a very slightly worse startup, the usage code will be just as efficient. Given that we want to eventually take a pointer due to the changes to @export, and that's the type we currently need to export anyways, require the value passed into zjb.exportGlobal to be a pointer.

and yeah, either look at the DataView's ArrayBuffer`s length, or keep track of the length of the wasm instance's memory and check if that value has changed.

@noxabellus
Copy link
Contributor Author

I think there's just one last thing to iron out here

Additionally, you should be able to follow the first pointer in the initialization code

This will require having access to the instantiated instance at initialization, because we'll need to access the exported address. A similar situation arises when storing a DataView, since we need the instance's memory to initialize it.

I think there's a few possible solutions here:

  1. Change the constructor to take the instance. I don't see how this can work given funky js this semantics coming into play with the imports object
  2. Change the glue code to perform module instantiation itself. This will require moving the setup code out of the constructor because the process will have to be async
  3. Add an additional function to call after the constructor call, which sets up these values that need access to the instance
  4. Use lazy initialization of globals' pointer and of the shared DataView. This would incur a small additional overhead to each access of these, as we have to check for 1 - 2 nulls each time

Though I will admit its a significant change to the api, personally I think 2 is best

  • It seems unlikely that wasm generated this way could be correctly instantiated without zjb
  • In case the user needs direct access, we can simply expose the module as a property after we've instantiated it
  • The construction happening with a static async method instead of new is similar to existing wasm apis
  • Has the most minimal overhead
  • Has a side benefit in that it streamlines the overall initialization process a bit
  • 3 has similar overhead but it seems prone to user error as its not a common pattern in js, in fact I'd just about call it an anti-pattern
  • The next best option in my opinion is 4 but the performance consideration seems similar to the one you were just making with the extra dereference

Maybe there's another option I'm overlooking here, but thats my 2¢. Let me know what you think.

noxabellus added a commit to noxabellus/zig-javascript-bridge that referenced this issue Aug 22, 2024
current solution is less than optimal imo, see scottredig#8 (comment)
@scottredig
Copy link
Owner

So there's already a post initialization step that's required:

zjb.instance = results.instance;

Change that to:

zjb.setInstance(results.instance);

The main reason to avoid option 2 is that it moves zjb from being a tool that can be used with wasm, to being more of a framework that dictates how you must work with wasm. That's a downgrade in compatibility with other requirements users may have.

The other option which could be considered would be setting all of those values/functions to perform initialization, and overriding them with the real implementation, but there's no reason for that complexity when you can just do setInstance.

Also, a thought I had is that having a publicly usable dataView() function which returns a cached DataView is useful anyways, so don't hide that functionality away.

@noxabellus
Copy link
Contributor Author

The main reason to avoid option 2 is that it moves zjb from being a tool that can be used with wasm, to being more of a framework that dictates how you must work with wasm. That's a downgrade in compatibility with other requirements users may have.

As I said, its hard for me to imagine this facility being used any other way. Maybe you have plans I can't see, but from what I can tell the wasm code generated inherently depends on zjb in that the module instantiation requires an import object exactly matching zjb's, and accessing the exports with the names specified in zig requires a wrapper exactly matching zjb's export object. Besides simple things like sticking extra imports into the object before passing it to the module instantiation, which is a case that could be handled by allowing the user to pass extra imports to the zjb call; or customizing the memory used, which is another case that could be handled with configuration passed to the call; it seems to me that any work-around of the dependency would require not using the generated glue code at all even as it stands today. I don't think it'd actually introduce extra coupling, in practice.

Anyhow, I'll leave that with you and go with setInstance for now. Will submit the PR soon! 😃

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants