Skip to content

Upgrade to witx 0.9 #4

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Apr 3, 2021
Merged

Upgrade to witx 0.9 #4

merged 5 commits into from
Apr 3, 2021

Conversation

thunderseethe
Copy link

Details of the underlying changes to witx can be found here.

This was mainly a simplification, a lot of things that previously were a standalone type (flags, strings, enums, unions) are now syntax sugar for one of the underlying primitives (List, Variant, etc.).

Given String no longer exists it might make sense to remove the WasiString type. It would also be nice to recognize something like (list char) and convert it to the WasiString type when we see it. Currently a string converts to WasmArray<u32> which is not super usable. I haven't looked into what that would involve though

@jedisct1
Copy link
Owner

Thank you! This is great!

Strings producing WasmArray<u32> is a little bit concerning, though. Is the actual type of array elements lost?

@jedisct1
Copy link
Owner

No more flags is also quite sad. int/enums could be represented as an enum type, while flags can be combined together. It made a difference in Zig and could have been very useful when generating documentation as well :(

Cargo.toml Outdated
@@ -1,11 +1,11 @@
[package]
name = "as-witx"
version = "0.1.0"
version = "1.2.0"
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds a bit premature. Maybe we should stick with 0.x for now :)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whoops, good catch! I updated it to 0.20 as originally intended

@jedisct1
Copy link
Owner

BTW, there's a 0.10 witx version round the corner, with a new ABI. Not sure how much changes that will require.
https://github.com/alexcrichton/WASI/tree/abi-next

@thunderseethe
Copy link
Author

I haven't looked at 0.10 at all, but given the explainer hasn't been modified at all I'm hopeful it'll have fewer breaking changes.

flags and strings still exist within witx files but they get de sugared into simpler defs under the hood via these equivalencies:

                                      string ≡ (list char)
                        (tuple <intertype>*) ≡ (record ("𝒊" <intertype>)*) for 𝒊=0,1,...
                             (flags <name>*) ≡ (record (field <name> bool)*)
                                        bool ≡ (variant (case "false") (case "true"))
                              (enum <name>*) ≡ (variant (case <name>)*)
                        (option <intertype>) ≡ (variant (case "none") (case "some" <intertype>))
                        (union <intertype>*) ≡ (variant (case "𝒊" <intertype>)*) for 𝒊=0,1,...
(expected <intertype>? (error <intertype>)?) ≡ (variant (case "ok" <intertype>?) (case "error" <intertype>?))

So flags should still work as expected, it's mainly problematic for string because a list of char is a list of unicode scalars as opposed to like a utf16 string we'd want to expose to people consuming the generated assembly.

I think it should be pretty easy to special case logic for (list char) and treating it as a WasmString in output though, it would just be a more substantial change then what the minimal changes I've made here to get code compiling on 0.9

@jedisct1
Copy link
Owner

The output now includes Rust (???) like code:

export declare function options_close(
    handle: options
): union<Result<(), crypto_errno>> /* error */;

Any idea where that could come from?

@thunderseethe
Copy link
Author

I do not, what's the input file you used to generate that output? I'll try to debug it

@jedisct1
Copy link
Owner

@thunderseethe
Copy link
Author

Thanks you that looks like the kind of thing that would be really handy to add as tests.

I took a look and it was due to me adding a line in ASType::name() that passed name onto variant causing it to render as union<...> instead of usize. I have removed that line

@jedisct1
Copy link
Owner

jedisct1 commented Mar 31, 2021

Unions can have name, there wouldn't be anything wrong with that. But Rust code is not a name :) Maybe a bug in witx?

The correct type everywhere in the example above should be crypto_errno. Besides this, functions are now missing half their parameters :)

Prior to witx 0.9, the returned values were the error type, and then pointers to be written to:

f(a, b, c, &d, &e) -> err

was represented as

(a, b, c) -> (err, d, e)

In witx 0.9, it looks like it became

(a, b, c) -> error { .err: err, .ok: (d, e) }

with "error" being a union ("variant") with two members, err and ok.

So, what the function actually returns is error.err and the missing parameters are in error.ok.

So complicated :(

@jedisct1
Copy link
Owner

So, func.results is still an array, but instead of being the list of returned values, it's an array with always one element. The actual array is hidden in that element.

@jedisct1
Copy link
Owner

Ah, no, the ok member can only contain a single value. So, this is not even the list of returned value, but apparently the first returned value.

f(a, b, c, &d, &e) -> err

is then represented in the witx object as

(a, b, c, &e) -> error { .err: err, .ok: d }

head -> | wall

@thunderseethe
Copy link
Author

Yeah my understanding at the moment is the expected variant is basically a translation of rust's Result to witx. Past that witx-bindgen only admits a single return value, and requires that return value be an expected when handling an InterfaceFunc.

So I'd imagine we could do the same for now without too much trouble. It sounds like the restriction to a single return value is temporary and eventually it will support full multi value (as wasm does) but we can burn that bridge when we get there.

That being said should union's still translate to usize, would it make sense to add a WasmResult type builtin that we translate to given we can expect every function to return a value of this kind?

@jedisct1
Copy link
Owner

Oh no, this is worse. From the 0.8 to 0.9 changes in the witx file (the ABI remains the same):

     (param $symmetric_key $symmetric_key)
     (param $symmetric_key_id (@witx pointer u8))
     (param $symmetric_key_id_max_len $size)
-    (result $error $crypto_errno)
-    (result $symmetric_key_id_len $size)
-    (result $version $version)
+    (result $error (expected (tuple $size $version) (error $crypto_errno)))

If there is more than 1 result, the list must be wrapped in a tuple.

@jedisct1
Copy link
Owner

That being said should union's still translate to usize, would it make sense to add a WasmResult type builtin that we translate to given we can expect every function to return a value of this kind?

The union should be decomposed into its err and ok components.

I'm currently trying to do something along these lines

        // func.results doesn't contain the actual results; it's an array with only one element named "error";
        // this is a union that contains both the error type and the first returned value.
        // Unlike other return values, the first returned value (which is not the value the function returns -- that is the error)
        // is not encoded as a pointer, but must be transformed into one.
        let results = &func.results;
        assert_eq!(results.len(), 1);
        let results = &results[0];
        assert_eq!(results.name, "error");
        let v = match results.tref.type_().as_ref() {
            witx::Type::Variant(x) => &x.cases,
            _ => panic!("Unexpected type for the result wrapper"),
        };
        assert_eq!(v.len(), 2);
        let mut error_ref = None;
        let mut results_ref = None;
        for vv in v {
            match vv.name.as_str() {
                "err" => {
                    error_ref = vv.tref.as_ref();
                }
                "ok" => {
                    results_ref= vv.tref.as_ref();
                }
                _ => panic!("Unexpected member in the result wrapper"),
            }
        }
        let error_ref = match error_ref {
            None => panic!("Incomplete result wrapper"),
            Some(error_ref) => error_ref,
        };
        let return_value = &Self::param_to_as("err", error_ref);
        // The returned value must be representable as a single word
        assert_eq!(return_value.len(), 1);
        let return_value = &return_value[0];
        ...
       // and now explode `results_ref` into an actual array if we have a tuple, or a single element if we don't

@jedisct1
Copy link
Owner

The union for the result and returned values doesn't make sense; it doesn't represent the actual binary interface.

@jedisct1
Copy link
Owner

Tuple seems to be something new.

@thunderseethe
Copy link
Author

tuple is one of the new syntax sugars. It desugars to a record with numerical fields (record ("𝒊" <intertype>)*) for 𝒊=0,1,...

@jedisct1
Copy link
Owner

jedisct1 commented Mar 31, 2021

Got it. So there must be something like a is_tuple() method to check if a record is a structure or a tuple.

These two structures are actually very different. I don't expect the generated code to be the same in both cases, even less so in documentation. So maybe we should introduce a proper Tuple AS type?

In fact all the types with is_*() functions are not really sugars, they hide multiple types (as needed for code generation) behind the same name, and should probably map to different AS types.

@jedisct1
Copy link
Owner

Honestly, this is getting so confusing that I'd rather delete everything and start over from scratch based on 0.10 (so that we don't have to do it again when that one will be released). Too bad it worked perfectly with 0.8 :(

@thunderseethe
Copy link
Author

I'm not sure I follow, in my mind this is simpler than 0.8.

we only need to support the base set of types

intertype ::= f32 | f64
            | s8 | u8 | s16 | u16 | s32 | u32 | s64 | u64
            | char
            | (list <intertype>)
            | (record (field <name> <id>? <intertype>)*)
            | (variant (case <name> <id>? <intertype>?)*)

and generate code for them, and then if we want we can handle special case behavior such as Tuple, Enum, etc. But we don't have to as long as we handle the core types laid out properly

@jedisct1
Copy link
Owner

But taking the example of a tuple. The emitted code would be... a tuple. Not a structure with field names.

A Variant also now represents one of:

  • an enum
  • a set of integers supporting boolean operations
  • a tagged union
  • an optional type

The code to generate is completely different for each of them.

@thunderseethe
Copy link
Author

thunderseethe commented Mar 31, 2021

Right I'm saying we don't need to emit all those special cases at outset.

Like with variant, as long as we emit a valid assemblyscript object that represents a witx::Variant that same object can be re-used for each of enum, union, optional, expected, or what have you.

Similarly for records, as long as we emit reasonable assemblyscript for witx::RecordDatatype, that inherently handles Tuple, Flags, or any of the other record syntax sugars. Specifically for tuple we will have to emit a structure with field names of some fashion, assemblyscript does not currently offer a tuple type.

@jedisct1
Copy link
Owner

RecordDatatype is internal machinery, not something we can expose to users.

That would be horrible to use, especially with things such as tuples and flags that are completely different (how would you OR tuples?)

as-witx also supports Zig and text outputs (this contains absolutely all the information from the WITX files, except that it's readable and easy to parse - and this is generated code for Zig).

It would be nice to keep it generic and eventually useful for other languages.

@thunderseethe
Copy link
Author

I think I'm not making my point clearly so let me try to tackle it from a different angle. RecordDatatype has to be exposed to users regardless of tuple or flags existing. When you generating bindings for something like

(record $foo 
   (field $a i32)
   (field $b i32))

whether or not we have tuples or flags we'd still want something like

@unmanaged
export class Foo {
   public a: i32;
   public b: i32;
   
   constructor(a: i32, b: i32) { ... }
   
   // the rest of the class stuff we generate
}

and we'd want to generate something similar in text or zig respectively.

Given that is the case, tuple and flags both desugar into an instance of (record ...) and so can use the same code that generates the above. This gives us an easy jumping off point to get something working and tested. From there we can incrementally specialize the code we generate based on whether it's a tuple a flag or whatever, but as long as we generate correct code for record we are good.

@jedisct1
Copy link
Owner

jedisct1 commented Apr 1, 2021

Taking the example of returned values. If the returned value is a struct, it will be a single pointer. If the returned value is a tuple, it will expand to one pointer per member. But both are RecordDataType.

@thunderseethe
Copy link
Author

thunderseethe commented Apr 1, 2021

That is not the case, both a tuple and a struct would be converted to a record and then we would return a pointer to that record.

Atleast for an MVP, as I said we can always layer specialized handling on top of the general base codegen. But the point is we don't have to, we can ignore the specialized variants of thing and still generate correct code.

@jedisct1
Copy link
Owner

jedisct1 commented Apr 1, 2021

Was there an ABI change?

Previously, when the WITX file had multiple results, the first one was the return value of the function, and other ones became additional parameters, as individual pointers.

This change is assumed to not change the ABI, but we now have a Tuple.

@thunderseethe
Copy link
Author

The change you linked is a change in the original witx file. What does it have to do with what code gets generated?

I believe they just changed that function to return a tuple instead of using output parameters.

@jedisct1
Copy link
Owner

jedisct1 commented Apr 1, 2021

Ouch, good to know. I didn't expect that change to also change the function signature.

@jedisct1 jedisct1 merged commit 6625b89 into jedisct1:master Apr 3, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants