This repository was archived by the owner on Apr 25, 2025. It is now read-only.

Resource limits for GC'd memory allocations #35

Closed
bvibber opened this issue May 17, 2018 · 11 comments

Comments

@bvibber

bvibber commented May 17, 2018

I'm researching using WebAssembly to implement sandboxed plugins to run inside a web app. This means running untrusted or less-trusted Wasm code inside an HTML/JavaScript environment, which itself is running in a browser engine.

Denial of service concerns are critical in this scenario: if the Wasm code can cause the browser engine to crash or become unresponsive, it may be difficult for a user to recover -- especially if the plugin is auto-loaded into the web app environment, there may be no user-visible way to disable it through the app's user interface.

A host environment can easily limit the amount of linear memory that a module can allocate with compile-time validation of the memory's maximum size. (Overflowing the stack with deep recursion, and long loops that hang the CPU, are separate issues which are at least partially mitigated by the browser halting execution at X stack depth or interrupting loops after a timeout.)

GC'd memory in the current proposal doesn't seem to have any limit, unless there's a global limit applied by the embedding engine.

Note that SpiderMonkey does not appear to place limits on GC'd JavaScript memory as of Firefox 60 and you can allocate tens of gigabytes in a few seconds, slowing a machine or even hanging it -- see #34 (comment)

[Screenshot attached: "screen shot 2018-05-17 at 11 14 41 am"]

Proposal: add a notion of per-origin resource limits, where the origin can be narrowed from "belongs to the embedder, use default limit behavior if any" to "belongs specifically to this Wasm module, with this maximum allocation limit".

The limit could be embedded in the binary similarly to the max size of a memory definition, able to be statically detected and verified by an embedder before compilation and instantiation.

Alternately, it could be specified in the embedding API, for the host JavaScript/whatever app to apply its own limit, but I'm not sure what's the best place to stick an option flag.
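One hypothetical shape for such an embedder-side option (this API does not exist; both the option name and its placement are invented here purely for illustration):

```js
// HYPOTHETICAL -- no engine implements this; sketch only.
const { instance } = await WebAssembly.instantiate(pluginBytes, imports, {
  maxGCHeapBytes: 16 * 1024 * 1024, // invented option: cap GC'd allocations at 16 MiB
});
```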

@kmiller68

This sounds like a more fundamental problem than just Wasm. If the embedding JS provides a function to the plugin's instance that incidentally allocates memory, then that memory presumably would not count against the instance. Thus an attacker could likely mount a denial of service by repeatedly calling that function and ensuring the result gets retained by the GC.

Also, running untrusted code from another origin inside your realm is super dangerous in general. There is no way in JS to stop that code from accessing any object in your realm once you give it access to any other object. With GC in Wasm, that's also likely to be pretty hard to prevent for any useful plugin.

@bvibber
Author

bvibber commented May 17, 2018

This plugin model is 100% dependent on the provided module imports actually being a safe interface, yes. E.g., don't expose an emscripten_eval ;)

@bvibber
Author

bvibber commented May 17, 2018

In the HTML/JS embedding it should be possible to get an added safety layer by loading a Wasm module in a locked-down iframe, which would keep the JS realms distinct. That still leaves me with the issue of allocations within the iframe, though.

This would also complicate data passing; async window.postMessage() should work but I'm not sure if one can make a synchronous call.
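The async path can be sketched as a small promise-based call bridge (the message shape is an assumption, not a standard protocol; in a real page `target` would be the sandboxed iframe's contentWindow, with origin checks added):

```javascript
// Sketch of an async host<->plugin call bridge over postMessage-style
// messaging. Each call gets a unique id; the matching reply message
// resolves that call's promise.
function makeBridge(target) {
  let nextId = 0;
  const pending = new Map();
  target.onmessage = (event) => {
    const { id, result } = event.data;
    const resolve = pending.get(id);
    if (resolve) {
      pending.delete(id);
      resolve(result);
    }
  };
  return (method, args) => new Promise((resolve) => {
    const id = nextId++;
    pending.set(id, resolve);
    target.postMessage({ id, method, args });
  });
}
```

A fully synchronous call isn't possible with `postMessage` alone; a commonly discussed workaround is blocking a Worker on `Atomics.wait` over a `SharedArrayBuffer`, with its own caveats.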

@titzer
Contributor

titzer commented May 18, 2018

This seems more like an embedder concern to me, not something that needs to be specified in the binary of a module. If the embedder (either as the creator of the engine, or as it "loads" a module) sets a maximum heap size for GC'd objects, the engine will produce an OOM when the application approaches or exceeds that limit. As for long-running loops, engines usually have a way to interrupt them. This can be done through the V8 API, but IMO it'd generally be dangerous to offer this as a Web API (see the historical problems with java.lang.Thread.stop()).

I'm not sure if your use case means that you are an embedder (i.e. you embed a JS/Wasm engine) or you are simply some JS/Wasm code running on an unmodified browser.

@bvibber
Author

bvibber commented May 18, 2018

@titzer my envisioned 'plugin host' is a web app running in an unmodified browser and the 'plugin' is a Wasm module conforming to a custom import/export ABI. This means two things:

  1. I'm limited in control over the engine to whatever the JavaScript embedding API provides.
  2. The JavaScript engine probably thinks the module is part of my webapp for security purposes (eek!)

If I use an iframe intermediary to separate the host app from the plugin's JavaScript context that should give me added scripting safety inside the plugin/host API bridge, but as far as I know doesn't protect me from denial of service attacks allocating too much memory inside the iframe (whether from the Wasm module or the JS code surrounding it).

(The long-loop case I'm not as worried about, as the browser already detects and halts such loops after a few seconds; it could be further protected against by isolating the Wasm module in a Worker thread, which would keep it from hanging the web app's UI thread.)

This might be too esoteric a use case for explicit support, or it might be a case of "that's SpiderMonkey's fault for not rejecting allocations earlier, fix it there". :)

@kripken
Member

kripken commented May 19, 2018

Maybe another option here is to do this in "userspace": the code loading the untrusted wasm can instrument it to note each allocation and deallocation at runtime, and enforce a limit itself. (That requires finalizers to observe deallocation; without them it could still enforce a total-allocation limit that doesn't account for freed memory.)
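A minimal sketch of that accounting, assuming the rewritten module imports hooks that it calls on each allocation and (given finalizers) each free; the hook names and the byte-based budget are invented here:

```javascript
// Sketch of "userspace" allocation accounting (assumed design). The
// instrumented plugin would import onAlloc/onFree and call them around
// each allocation; the guard enforces the host's chosen budget.
function makeAllocationGuard(maxBytes) {
  let used = 0;
  return {
    onAlloc(bytes) {
      used += bytes;
      if (used > maxBytes) {
        throw new Error("plugin exceeded its allocation budget");
      }
    },
    // Only meaningful if finalizers (or explicit frees) let the host
    // observe deallocation; without them `used` only ever grows.
    onFree(bytes) {
      used = Math.max(0, used - bytes);
    },
    usedBytes: () => used,
  };
}
```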

@bvibber
Author

bvibber commented May 19, 2018

@kripken I like that idea! Reminds me of the way we limit resources in user-coded templates on Wikipedia, by counting expanded parse tree nodes and such. Rewriting the binary to enforce a callback on allocation is something I hadn't thought of, but fits great with the pre-load validation model I'm envisioning.

It would definitely help to have finalizers if we end up doing long-running plugins that need to allocate, free, and reallocate memory over their lifetime.

@lars-t-hansen
Contributor

@kripken, surely only if there's also a way to trigger / wait for gc? Otherwise there seems to be a significant risk that the budget is blown by uncollected garbage.

@kripken
Member

kripken commented May 22, 2018

@lars-t-hansen yeah, good point, that's a worrying risk.

@bvibber
Author

bvibber commented Aug 22, 2020

I was about to file this issue but realized I filed it two years ago. :) I'm still concerned that a Wasm program can allocate unbounded memory, potentially DoSing its host.

rossberg pushed a commit that referenced this issue Feb 24, 2021
This commit fixes #34 by specifying that the flags field (which
indicates if a segment is passive) is a `varuint32` instead of a
`uint8`. It was discovered in #34 that the memory index located at that
position today is a `varuint32`, which can be validly encoded as `0x80
0x00` as well as `0x00` (among a number of other encodings). This
means that if the first field were repurposed as a
single byte of flags, it would break these existing modules that work
today.

It's not currently known how many modules in the wild actually take
advantage of such an encoding, but it's probably better to be safe than
sorry!

Closes #34
rossberg added a commit that referenced this issue Feb 24, 2021
… op proposal (#35)

Adds table.size, table.grow, table.fill to overview, spec, interpreter, and tests, as decided at recent CG meeting. Also adds a few more tests for table.get and table.set.

Also, change interpreter's segment encoding to match bulk ops proposal, addressing #18. (Not updated in spec yet, since corresponding spec text is still missing from bulk ops proposal.)
@tlively
Member

tlively commented Nov 1, 2022

Closing this as outside the scope of the proposal, but if anyone feels this is still worth discussing, please add it to a subgroup meeting agenda.

@tlively tlively closed this as completed Nov 1, 2022