Re-proposal: WebAssembly needs forwards-compatibility for extensions #1322

Closed
@rcombs

This was discussed previously in #1161, but I don't think the conclusion reached there is tenable, so I'm reopening further discussion.

Currently, if a wasm file uses any opcode that the implementation doesn't support, validation and compilation fail. This means that a library that wants to use new extensions optionally has only the following options:

  • be compiled 2^(number of optional extension dependencies) times and select a variant in JS by running validate() on small modules
  • be compiled (number of optional extension dependencies) times and do the same, with the assumption that all implementations gain features in the same order
  • drop support for implementations that don't support the theoretically-optional dependencies
  • never introduce code using the theoretically available enhancements

The first two options impose substantial additional friction on library developers and require the entire dependency chain to play along; it's also not entirely clear how to do this kind of dispatch with e.g. emscripten-built code. The third is untenable for many users (myself included) for compatibility reasons (I have to support a variety of embedded systems running browser engines that are often varying numbers of years old), and the fourth is what I've seen in practice to date with e.g. JavascriptSubtitlesOctopus.

As a library developer, I find this situation very surprising. The standard mechanism to support extension features on a given platform is to do a runtime availability test (CPUID or the like), then set up a vtable of function pointers using either the C implementations, or ones using whichever extensions are available.
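As a rough illustration of the dispatch pattern described above, here is a minimal C sketch. All names in it (`DispatchTable`, `FEAT_SIMD`, `add_bytes_c`, `add_bytes_simd`, `dispatch_init`) are hypothetical and not taken from FFmpeg, libass, or any other project, and the "SIMD" routine is only an unrolled scalar stand-in so the sketch stays portable.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical feature flag; a real project would derive it from CPUID
 * or an equivalent platform query at startup. */
enum { FEAT_SIMD = 1u << 0 };

/* The "vtable": one function pointer per optimizable routine. */
typedef struct {
    void (*add_bytes)(uint8_t *dst, const uint8_t *src, size_t n);
} DispatchTable;

/* Portable C fallback, always available. */
static void add_bytes_c(uint8_t *dst, const uint8_t *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = (uint8_t)(dst[i] + src[i]);
}

/* Stand-in for an extension-specific routine. It is plain C unrolled
 * by four; a real one would use intrinsics or assembly. */
static void add_bytes_simd(uint8_t *dst, const uint8_t *src, size_t n)
{
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        dst[i + 0] = (uint8_t)(dst[i + 0] + src[i + 0]);
        dst[i + 1] = (uint8_t)(dst[i + 1] + src[i + 1]);
        dst[i + 2] = (uint8_t)(dst[i + 2] + src[i + 2]);
        dst[i + 3] = (uint8_t)(dst[i + 3] + src[i + 3]);
    }
    for (; i < n; i++)
        dst[i] = (uint8_t)(dst[i] + src[i]);
}

/* Start from the C implementations, then override each entry whose
 * required extension was detected at runtime. */
static void dispatch_init(DispatchTable *t, unsigned features)
{
    t->add_bytes = add_bytes_c;
    if (features & FEAT_SIMD)
        t->add_bytes = add_bytes_simd;
}
```

Callers only ever go through `t->add_bytes(...)`, so the choice is made once at init time and per function, not per program. This is the granularity that whole-module validation failure rules out.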

One line in the previous discussion that surprised me, assuming I'm understanding it correctly:

> Most new features are pretty huge, such as SIMD and threads, and can't really be picked at runtime based on fine-grained feature detection.

Deciding whether or not to use SIMD (or wider vector sizes, or FMA, or hardware crypto, or particular small groups of instructions added in particular extensions) at runtime is extremely common; we do it in both FFmpeg and libass. FFmpeg also makes runtime decisions about whether or not to use threading; these aren't based on availability (as all currently-supported platforms have thread availability determined at compile-time), but could trivially be made to be for a platform that needed it.

I also saw this:

> No, embedder fail validation on unknown opcodes. The way opcodes are encoded we don't know how many immediate they have, so an unknown opcode simply cannot be skipped, though you could skip the entire function and trap on entry.

This is a legitimate concern: because of the variable-length opcode structure, once an unknown op is encountered, the parser loses sync. On physical variable-length-opcode platforms like x86, this isn't usually an issue (since you can just branch over the unavailable instruction and the processor will have correct sync after it), but it is for wasm, as it needs to parse ahead-of-time. However, the concept suggested here (simply trap on entry to any function with unsupported opcodes) is entirely valid, and would address this issue for the vast majority of real use-cases. The only cases it wouldn't cover would be things like branching over inline ASM, but as no current implementation supports inline wasm in C (afaik) and those kinds of cases are quite uncommon anyway (it's usually more efficient to do vtable dispatch), I don't think it'd be an issue in practice.
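To make the sync problem and the trap-on-entry idea concrete, here is a toy C sketch of a function-body scanner. It is not how any real engine works: it assumes the locals declarations have already been stripped, knows only a handful of core opcodes (`nop`, `local.get`, `i32.const`, `i32.add`, and so on), and the names `scan_body`, `immediate_count`, and `BodyStatus` are hypothetical. Because each function body in the binary format is size-prefixed, giving up on one body does not prevent the decoder from finding the next one.

```c
#include <stddef.h>
#include <stdint.h>

typedef enum {
    BODY_OK,             /* every opcode was recognized */
    BODY_TRAP_ON_ENTRY,  /* unknown opcode: cannot resync within this body */
    BODY_MALFORMED       /* a known opcode's immediate ran off the end */
} BodyStatus;

/* Skip one LEB128 integer. Returns bytes consumed, or 0 on truncation. */
static size_t skip_leb128(const uint8_t *p, size_t len)
{
    for (size_t i = 0; i < len; i++)
        if (!(p[i] & 0x80))
            return i + 1;
    return 0;
}

/* Number of LEB128 immediates for the few opcodes this toy knows,
 * or -1 when the opcode (and so its immediate layout) is unknown. */
static int immediate_count(uint8_t op)
{
    switch (op) {
    case 0x00: case 0x01: case 0x0B: case 0x0F: case 0x1A: /* unreachable, nop, end, return, drop */
    case 0x6A: case 0x6B: case 0x6C:                       /* i32.add, i32.sub, i32.mul */
        return 0;
    case 0x20: case 0x21: case 0x22:                       /* local.get, local.set, local.tee */
    case 0x41:                                             /* i32.const */
        return 1;
    default:
        return -1;
    }
}

/* Scan one function body. On the first unknown opcode, stop decoding
 * and mark the whole function as trapping when it is called. */
static BodyStatus scan_body(const uint8_t *code, size_t len)
{
    size_t pos = 0;
    while (pos < len) {
        int n = immediate_count(code[pos++]);
        if (n < 0)
            return BODY_TRAP_ON_ENTRY;
        while (n-- > 0) {
            size_t used = skip_leb128(code + pos, len - pos);
            if (used == 0)
                return BODY_MALFORMED;
            pos += used;
        }
    }
    return BODY_OK;
}
```

With this toy table, a body such as `local.get 0; i32.const 42; i32.add; end` scans as `BODY_OK`, while one containing the `0xFD` SIMD prefix is reported as `BODY_TRAP_ON_ENTRY` instead of failing the whole module.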

In the absence of some sort of runtime dispatch system, I wouldn't expect to see much adoption of extension features from existing native library developers. I'd been planning on writing some wasm SIMD routines for libass and perhaps for ffmpeg, but running into this issue, it seems like I wouldn't have a clear path for that work to actually be used by anyone, at least not for the next few years. Whole-program dispatch just isn't how any project I've ever worked on operates, and I don't think the will exists to adapt to it.
