Skip to content

Warn before busy waiting on futex #15957

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

mmarczell-graphisoft
Copy link
Contributor

Other code paths print this warning before busy waiting, but not this one. This will help users keep main thread code "clean" from blocking calls.

@@ -25,6 +25,7 @@ int emscripten_futex_wait(volatile void *addr, uint32_t val, double max_wait_ms)
// __builtin_wasm_memory_atomic_wait32 so we call out the JS function that
// will busy wait.
if (!_emscripten_thread_supports_atomics_wait()) {
emscripten_check_blocking_allowed();
ret = _emscripten_futex_wait_non_blocking(addr, val, max_wait_ms);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm confused. This line does a non-blocking wait, so why check for blocking before it?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The job of emscripten_check_blocking_allowed is to report the warning about busy waiting on main (e.g. Blocking on the main thread is very dangerous..).

However, this is normally called before emscripten_futex_wait is called. For example in pthread_cond_timedwait.c and pthread_join.c. I guess there is no harm in also calling it from here.. but generally I think it should have already been called before we get here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sbc100 Here is the stack trace that displays the warning with this change, but hasn't done it before.
I'm not familiar enough with the architecture to know where the better place for this call would be...?
image

@sbc100
Copy link
Collaborator

sbc100 commented Jan 13, 2022

Looking at the original PR that introduced this change (#9579) it looks like this omission was deliberate:

    Warn on blocking on the main thread (#9579)
    
    Add a new ALLOW_BLOCKING_ON_MAIN_THREAD option, set to
    1 by default. If unset, it will throw when blocking on the main thread.
    If 1 as in the default case, we warn in the console about possible
    issues and link to the docs about that.
    
    This ignores things like mutexes which are usually brief, and checks
    for pthread_join or pthread_cond_wait operations, which tend to be
    longer and run the risk of Web-specific proxying-related deadlocks.
    
    Add pthread_tryjoin_np which does a non-blocking join, which always
    works on the main thread.

@mmarczell-graphisoft
Copy link
Contributor Author

@sbc100 Thanks for digging that up! It then boils down to: are mutexes really usually brief?

My motivation for adding this warning is that we are currently porting a quite large C++ library to the web, and I wanted some builtin help in avoiding absolutely all cases that could potentially cause a freeze on the main thread.

But based on what you just quoted I can accept the closing of this PR.

@sbc100
Copy link
Collaborator

sbc100 commented Jan 13, 2022

Lets see what @kripken thinks? maybe thinks have changed since that PR was written?

@kripken
Copy link
Member

kripken commented Jan 13, 2022

Yes, I think it was intentional - we wanted to not show this in all paths. Waiting on a condition is potentially a long operation, while a mutex might be fine in the normal case with low contention. So I don't think the goal was to warn about all possible "stalls" on the main thread, just the riskiest ones, to avoid lots of false positives.

I don't know how useful the tool is in practice, though, I haven't used it myself recently.

A way to find any blocking on the main thread sounds useful too though! One way to do it might be to modify _emscripten_futex_wait_non_blocking which is called by the code changed in this PR, that is, changing _emscripten_futex_wait_non_blocking will have the same effect as this PR, but also help more code paths. Basically in the JS for that function we could check if we wait for more than 1ms, say (or if we wait at all?) and if so, log something in PTHREAD_DEBUG mode? That's just one way to do it though - thoughts?

@sbc100
Copy link
Collaborator

sbc100 commented Jan 13, 2022

Yes, I think it was intentional - we wanted to not show this in all paths. Waiting on a condition is potentially a long operation, while a mutex might be fine in the normal case with low contention. So I don't think the goal was to warn about all possible "stalls" on the main thread, just the riskiest ones, to avoid lots of false positives.

I don't know how useful the tool is in practice, though, I haven't used it myself recently.

A way to find any blocking on the main thread sounds useful too though! One way to do it might be to modify _emscripten_futex_wait_non_blocking which is called by the code changed in this PR, that is, changing _emscripten_futex_wait_non_blocking will have the same effect as this PR, but also help more code paths. Basically in the JS for that function we could check if we wait for more than 1ms, say (or if we wait at all?) and if so, log something in PTHREAD_DEBUG mode? That's just one way to do it though - thoughts?> Yes, I think it was intentional - we wanted to not show this in all paths. Waiting on a condition is potentially a long operation, while a mutex might be fine in the normal case with low contention. So I don't think the goal was to warn about all possible "stalls" on the main thread, just the riskiest ones, to avoid lots of false positives.

I don't know how useful the tool is in practice, though, I haven't used it myself recently.

A way to find any blocking on the main thread sounds useful too though! One way to do it might be to modify _emscripten_futex_wait_non_blocking which is called by the code changed in this PR, that is, changing _emscripten_futex_wait_non_blocking will have the same effect as this PR, but also help more code paths. Basically in the JS for that function we could check if we wait for more than 1ms, say (or if we wait at all?) and if so, log something in PTHREAD_DEBUG mode? That's just one way to do it though - thoughts?

That sounds like good idea. I wonder if we should replace the existing check_blocking_allowed with this new mechanism too.. to keep things simple.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants