i64.atomic.wait has no matching futex syscall on Linux #135
Comments
I believe, based on the discussion in #7, that it's possible to implement |
That doesn't actually work; there can be a lost wakeup. Assuming little endian:
|
Ah, true. Sorry for the rookie mistake. |
I don't think you can use raw futexes even for the 32-bit wait. The wait/notify mechanism has some guarantees, such as ordered wakeup, that raw futexes do not seem to provide. |
Right, I don't think it was ever the intention that you could directly replace these instructions with raw futex syscalls. |
Another modest plea to drop
Additionally, allowing spurious wakeups eases low-level implementations of
I believe the ordering issue can be resolved by placing manual seq_cst fences around the futex calls, though syscalls already provide a memory barrier, so that would be redundant. @lars-t-hansen could you elaborate on what you mean by "ordered wakeup"? |
I'm also curious about why we decided to disallow spurious wakeups. WebAssembly/design#1019 (comment) suggests it is because JS doesn't allow them, but I don't know why that is. |
JS supports 64-bit TypedArray in wait and notify: https://tc39.es/ecma262/#sec-atomics.wait handles BigInt64Array, and notify is size-agnostic but allows that type of view.
It was never (except maybe the very early days, when we were starting with something similar to what I believe NaCl had) intended for futexes to be able to directly express the semantics of wait and notify. The prohibition on spurious wakeups and other design artifacts are there to push complexity into the implementation and away from the user (after all this is/was JS). Sticking to seq_cst for the first version is similarly motivated. Simply adding shared memory and atomics to JS was highly controversial, and compromises that reduced nondeterminism at the cost of some performance were necessary.
The intent of the ordered wakeup rule is fairness and (again) managing nondeterminism: if an observer can determine that one thread blocked in a wait before another, then waking one thread on that location will for sure wake the first one. |
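To make the ordered wakeup rule concrete, here is a minimal sketch of a FIFO waiter queue for a single location, built on a mutex and per-waiter condition variables. `WaiterQueue` and `wake_order_demo` are hypothetical names chosen for illustration, not taken from any engine, and the sketch leaves out the expected-value comparison and the timeout that the real wait instructions also have.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <memory>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical helper: a FIFO queue of waiters for one memory location.
class WaiterQueue {
  struct Waiter {
    std::condition_variable cv;
    bool notified = false;
  };
  std::mutex mu_;
  std::deque<std::shared_ptr<Waiter>> queue_;

 public:
  // Blocks until notify() selects this waiter. The loop absorbs spurious
  // condvar wakeups, so the caller only returns after a real notification.
  void wait() {
    auto self = std::make_shared<Waiter>();
    std::unique_lock<std::mutex> lock(mu_);
    queue_.push_back(self);
    while (!self->notified) self->cv.wait(lock);
  }

  // Wakes up to `count` waiters, oldest first. Returns how many were woken.
  uint32_t notify(uint32_t count) {
    std::lock_guard<std::mutex> lock(mu_);
    uint32_t woken = 0;
    while (woken < count && !queue_.empty()) {
      std::shared_ptr<Waiter> w = queue_.front();
      queue_.pop_front();
      w->notified = true;
      w->cv.notify_one();
      ++woken;
    }
    return woken;
  }

  // Number of threads currently queued.
  std::size_t waiting() {
    std::lock_guard<std::mutex> lock(mu_);
    return queue_.size();
  }
};

// Thread 1 is observed to block before thread 2, so waking one waiter at a
// time must wake thread 1 first.
std::vector<int> wake_order_demo() {
  WaiterQueue q;
  std::mutex order_mu;
  std::vector<int> order;
  auto waiter = [&](int id) {
    q.wait();
    std::lock_guard<std::mutex> lock(order_mu);
    order.push_back(id);
  };
  auto recorded = [&] {
    std::lock_guard<std::mutex> lock(order_mu);
    return order.size();
  };
  std::thread t1(waiter, 1);
  while (q.waiting() < 1) std::this_thread::yield();  // t1 is queued
  std::thread t2(waiter, 2);
  while (q.waiting() < 2) std::this_thread::yield();  // t2 is queued behind t1
  q.notify(1);
  while (recorded() < 1) std::this_thread::yield();   // first wake finished
  q.notify(1);
  t1.join();
  t2.join();
  return order;
}
```

Because each waiter sleeps on its own `notified` flag, the no-spurious-wakeup guarantee and the ordering guarantee both fall out of the same queue, which is the kind of complexity the comment above describes as being pushed into the implementation.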
I should add: Compatibility with JS is a big deal for wasm on the web and since JS is still much more important than wasm, JS controls the agenda to some extent. But that fact does not prevent there from being additional instructions that work better in other environments. IMO we should be very open to proposals that add instructions that allow for better performance at the expense of nondeterminism across systems and implementations. But I do think those should be framed as new proposals.
The situation is similar to SIMD: At the moment, wasm SIMD has no nondeterminism and SIMD instructions work the same way (with some perf compromises) on different hardware platforms. But it is expected that the next version of wasm SIMD will have instructions that work (and are known to work) well on some hardware and less well on other hardware, or instructions that have a variable amount of imprecision and therefore add nondeterminism, all in the interest of performance. |
@lars-t-hansen your prompt response is appreciated and well understood. While it's unfortunate that the semantics of JS atomics wait/notify diverge from kernel-level wait/notify primitives for the sake of eliminating non-determinism, I understand the design constraint of maintaining semantic compatibility with JS.
@tlively Here's one example (of which you may or may not be aware) where disallowing spurious wakeups in the primitive comes in handy:
This avoids a recheck of |
@rianhunter I also depend on the absence of spurious wakes in the code that initializes memory when threads are enabled: https://github.com/llvm/llvm-project/blob/adcd02683856c30ba6f349279509acecd90063df/lld/wasm/Writer.cpp#L728-L768. That's not user-level code, but it would be expressible in C. However, that code could be trivially made to handle spurious wakeups by just waiting in a loop. |
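As a rough illustration of "waiting in a loop", the sketch below re-checks a flag after every return from a wait primitive that is allowed to wake spuriously. All names here (`wait_may_wake_spuriously`, `wait_until_changed`, `init_flag_demo`) and the flag values are assumptions made for the example; this is not the code lld emits.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>

// Hypothetical stand-in for a wait primitive that may return spuriously.
// This version never blocks at all, which is the worst case a caller has
// to tolerate; a real primitive would sleep until notified or interrupted.
void wait_may_wake_spuriously(const std::atomic<uint32_t>& addr,
                              uint32_t expected) {
  if (addr.load() == expected) std::this_thread::yield();
}

// Returns once `flag` no longer holds `pending`, re-checking the value
// after every return from the wait primitive.
uint32_t wait_until_changed(const std::atomic<uint32_t>& flag,
                            uint32_t pending) {
  uint32_t seen;
  while ((seen = flag.load()) == pending) {
    wait_may_wake_spuriously(flag, pending);
  }
  return seen;
}

// One thread publishes "initialized" (2 is an arbitrary value for this
// sketch) while the caller waits for the flag to leave its initial state.
uint32_t init_flag_demo() {
  std::atomic<uint32_t> flag{0};
  std::thread initializer([&] {
    flag.store(2);  // a notify on the flag's address would follow here
  });
  uint32_t seen = wait_until_changed(flag, 0);
  initializer.join();
  return seen;
}
```

The loop costs one extra load per wakeup, so the caller stays correct whether or not the underlying primitive guarantees the absence of spurious wakeups.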
In general I think that there are some warts in the current JS design. Disallowing spurious wakeups at the JS API level just pushes that checking into the implementation, and it's not obvious that it really simplifies user code all that much -- user code has to be aware of the complicated semantics of waking up in any case. (Though it's probable that fairness is a useful property and it's possible fairness and no-spurious-wakeup are connected.) It's been pointed out that the return value from |
Just for clarity's sake, the reason to allow spurious wakeups in the primitive is that not all systems have an efficient way of preventing spurious wakeups. This is less applicable to big systems and more applicable to small low-level / embedded systems. The impact consideration on user code isn't that allowing spurious wakeups simplifies user code but rather that it doesn't significantly complicate user code (e.g. @tlively's example). |
I am the implementer of libstdc++'s atomic wait. libstdc++ currently supports using futex() on platforms which have it, and I anticipate supporting platforms that have __ulock_wait/wake in a future release (possibly GCC 13). For platforms which do not have such a mechanism, there is a fallback implementation built on mutex/condvar. For types which fit into the underlying platform's wait/notify mechanism, futex() is used directly. If the atomic type T does not fit, it is proxied through another waited address. There are a limited number of such addresses and they are shared, so there is additional potential for spurious wakeups in this case. The standard allows for spurious wakeups and there are no guarantees on ordering.
|
Hey, is waking up waiters in the order they waited still an invariant? I couldn't find this in the spec. Also, how are people approaching this, since 64-bit wait (and timeouts) are not available on all platforms? I'm thinking of just using the address to index into a table with a condvar.
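One way the address-indexed table could look is sketched below, assuming a fixed number of buckets that each hold a mutex and a condition variable. `Bucket`, `kBuckets`, `wait64`, `notify64`, and `table_wait_demo` are hypothetical names. Because unrelated addresses can hash to the same bucket and `notify_all` is used, this version permits spurious wakeups and gives no wake ordering, so on its own it does not provide the wasm guarantees discussed in this thread; timeouts are also omitted.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <mutex>
#include <thread>

struct Bucket {
  std::mutex mu;
  std::condition_variable cv;
};

// Hypothetical fixed-size table; unrelated addresses may share a bucket.
constexpr std::size_t kBuckets = 64;
std::array<Bucket, kBuckets> g_table;

Bucket& bucket_for(const void* addr) {
  return g_table[std::hash<const void*>{}(addr) % kBuckets];
}

// Returns false if the value already differs from `expected`. Otherwise
// sleeps once and returns true; that wakeup may be spurious or meant for
// another address in the same bucket, so callers must re-check the value.
bool wait64(const std::atomic<int64_t>& addr, int64_t expected) {
  Bucket& b = bucket_for(&addr);
  std::unique_lock<std::mutex> lock(b.mu);
  if (addr.load() != expected) return false;
  b.cv.wait(lock);
  return true;
}

// Wakes every thread sleeping in the bucket that `addr` maps to.
// Taking the bucket lock closes the window between a waiter's value
// check and its sleep, so a store followed by notify64 is never lost.
void notify64(const std::atomic<int64_t>& addr) {
  Bucket& b = bucket_for(&addr);
  std::lock_guard<std::mutex> lock(b.mu);
  b.cv.notify_all();
}

// A writer publishes a value that needs all 64 bits; the caller waits
// in a loop until it observes the change.
int64_t table_wait_demo() {
  std::atomic<int64_t> cell{0};
  std::thread writer([&] {
    cell.store(int64_t{1} << 40);
    notify64(cell);
  });
  while (cell.load() == 0) wait64(cell, 0);
  writer.join();
  return cell.load();
}
```

Getting ordered, non-spurious wakeups out of this design would need a per-address FIFO of waiters inside each bucket rather than a single shared condition variable.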
Yes - see the explanatory note here.
|
I noticed that the i64.atomic.wait instruction can't be translated directly to the futex syscall on Linux, since Linux only supports 32-bit futex operations. I'm concerned that non-web embeddings on Linux will need to implement their own wait/notify functionality instead of being able to use Linux's directly.
Web embeddings will have to implement their own wait/notify functionality anyway to support primitives like Atomics.waitAsync
Related:
#7
StackOverflow: Linux futex for 64-bits