AssertionError: daemonic processes are not allowed to have children #76
Hello,

It is very common practice to integrate pools within long-running services. One of the desires in such a scenario is that, if something goes wrong, the service can quickly terminate so that its supervisor (supervisord, systemd, Docker or anything else) can collect its state and restart it properly. This guarantees continuity of service. Not setting the worker processes as daemonic would prevent the service from exiting while workers are still running.

You can easily verify this issue with:

```python
import time
import concurrent.futures

def process_function():
    time.sleep(5)

pool = concurrent.futures.ProcessPoolExecutor()

# Simulate the scheduling of many long running tasks
for _ in range(1000):
    pool.submit(process_function)

raise Exception("Boom Baby!")
print("You waited the completion of 1000 tasks but, as your main program gave up, you wasted 'em all!")
```

You might argue that it's the responsibility of the developer to correctly handle exceptions within their main loop and terminate the pool accordingly if they cannot use a context manager. Nevertheless, reality is more complex than that, and you definitely don't want to come in on a Monday morning just to notice your service has been hanging since you drank your first Friday night beer because of something as trivial as an uncaught exception (been there, done that). This is the reason why I did not expose the daemon flag.

Could you better elaborate the use case you have in mind? Usually piling up nested processes is a recipe for disaster. Is there something which forces you to do so?
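The hang described above can be avoided even with non-daemonic workers if the pool is shut down on every exit path. A minimal sketch of the `try`/`finally` pattern (assuming Python 3.9+ for the `cancel_futures` parameter; `run_tasks` is an illustrative name, not part of any library):

```python
import concurrent.futures

def work(x):
    return x * x

def run_tasks():
    pool = concurrent.futures.ProcessPoolExecutor(max_workers=2)
    try:
        return pool.submit(work, 4).result()
    finally:
        # Even if an exception escapes the try block, the pool is
        # shut down, so non-daemonic workers cannot keep the
        # service alive forever. cancel_futures drops queued work.
        pool.shutdown(wait=False, cancel_futures=True)
```

This is exactly the bookkeeping the maintainer notes developers often forget, which is why daemonic workers are the safer default for a library.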
To elaborate on the use case: I'm running "plugin" code provided by an external party and verifying the results for correctness and performance. Imagine that the external code is required to solve a difficult equation (a simplification, but the right idea), and you want to verify that it handled various inputs and edge cases correctly. The external code is only required to supply a callable entrypoint; there is no restriction on how it should implement a solve (in particular, it should be able to spawn worker threads or processes if it needs to). I want to have a strict timeout on the external code, which is where pebble comes in: the timeout actually works even if the code in the subprocess misbehaves. So, the issue is that using pebble's pool forbids the plugin code from spawning processes of its own.
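For reference, a minimal sketch of the nested-pool pattern being described. It relies on the fact that `concurrent.futures.ProcessPoolExecutor` does not mark its workers daemonic (unlike `multiprocessing.Pool`, which sets `daemon = True` and therefore raises the error in the issue title); `solve` and `run_plugin` are illustrative names:

```python
import concurrent.futures

def inner(x):
    return x * x

def solve(xs):
    # The "plugin" entrypoint spawns its own worker pool. This is
    # allowed because the outer executor's workers are not daemonic.
    with concurrent.futures.ProcessPoolExecutor(max_workers=2) as pool:
        return list(pool.map(inner, xs))

def run_plugin(xs):
    # Outer pool: runs the untrusted entrypoint in a subprocess.
    with concurrent.futures.ProcessPoolExecutor(max_workers=1) as outer:
        return outer.submit(solve, xs).result()
```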
Have you considered the issue deriving from the timeout killing the worker processes and leaving their children orphaned? This might prove especially problematic in your case, where you are running third-party code which might hang or burn CPU for a very long time while the main loop timed out long ago and continued spawning new test cases. In the long run, you would find your environment full of runaway processes. How would you handle those?

Example reproducing the issue:

```python
import subprocess

import pebble

def function():
    subprocess.run(['ping', '192.168.1.1', '-c', '10'])

with pebble.ProcessPool() as pool:
    future = pool.schedule(function, timeout=3)
    future.result()
```
I have considered it; it's what I called the "zombie grandchildren" in the original post. The possibility may need to be handled, perhaps by the top-level process auditing the grandchildren, although I would not really expect that to be the responsibility of pebble itself. And it's sort of an outlier case on the unhappy path. For pebble's API, I would think it's not a convincing enough reason to disallow well-behaved workers which are correctly managing their own children?
I think it's more an issue of how many footguns a library should provide to its users. I will try to see whether to expose the process daemon flag. I will also look into that other issue of yours as soon as I have some spare time.
Signed-off-by: Matteo Cafasso <[email protected]>
On Linux, the child process inherits the parent file descriptors. If the parent end of the pipes is not closed, the child will be unable to detect whether the parent exited or not. As a consequence, workers cannot detect when the pool process has terminated abruptly. Signed-off-by: Matteo Cafasso <[email protected]>
Signed-off-by: Matteo Cafasso <[email protected]>
Daemon processes do not allow spawning children, because their termination might leave the children orphaned. This limits some use cases in which the pool workers need to run functions from modules which internally use `multiprocessing`. Instead of setting the worker processes as daemons, we rely on `atexit` to terminate them on exit. The drawback of this implementation is that, if the main process exits abruptly, grandchildren might end up orphaned. Signed-off-by: Matteo Cafasso <[email protected]>
I just pushed to master the fix which should allow your use case. I still need to test it on Windows and document the drawbacks of this implementation. I am also planning to tackle a couple more issues in this release. In the meantime, you can clone this repo to get yourself going.
I've just tried it out; it works fine. Thank you Matteo
Signed-off-by: Matteo Cafasso <[email protected]>
Fix released in
First of all, I am aware of #31, but it was closed by the submitter without a proper resolution.

`concurrent.futures.ProcessPoolExecutor` doesn't have this limitation on pools within pools: it can have a nested process pool with no problem. I use the example from the docs verbatim, in a file `eg.py`. But with pebble's process pool, which I was usually able to use as a drop-in replacement/improvement, pools spawning pools is disallowed.

Why is that? Is it only a preventative measure against the possibility of "zombie grandchildren" 🧟 processes? Is there any public-facing way to lift the restriction from the pebble `ProcessPool`? (Pebble 4.6.0 on macOS)

Thanks for such a helpful and easy to use library!