Description
Mac CI runners sometimes see worker crashes, e.g. worker 'gw2' crashed.... This seems to trace back to matplotlib's gca() and gcf(). The culprit is usually test_binarygrid_util.py::test_mfgrddisv_modelgrid, though I'm not yet sure why.
Examples:
- https://github.com/modflowpy/flopy/runs/7748574375?check_suite_focus=true#step:9:1731
- https://github.com/modflowpy/flopy/runs/7734831141?check_suite_focus=true#step:9:1730
This may be related to a known pytest-xdist issue where tests almost always run in the main thread, but are not guaranteed to. Related discussions:
- Question: Is it expected, that the tests run in the main thread? pytest-dev/pytest-xdist#469
- pytest-xdist breaks asyncio-based code expecting to be run in the main thread pytest-dev/pytest-xdist#620
- Intermittent parallel test crash on macOS pytest-dev/pytest-xdist#739
- (via xdist) Intermittently ending up on non-main thread pytest-dev/execnet#96
Matplotlib is not thread-safe, but it does not require the caller to be on the main thread. Perhaps there is a weird limitation on the Mac backends.
If the above is the root cause, a possible workaround is to check at the start of each plotting test whether it is running on the main thread, and skip it if not, e.g.:

    import threading
    import pytest
    if threading.current_thread() is not threading.main_thread():
        pytest.skip(reason="not on main thread")
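
To sanity-check that the main-thread test itself behaves as expected, here is a minimal standard-library sketch. The helper names `on_main_thread` and `run_in_worker_thread` are hypothetical and are not part of flopy, pytest or pytest-xdist.

```python
import threading


def on_main_thread() -> bool:
    """Return True if the caller is running on the interpreter's main thread."""
    return threading.current_thread() is threading.main_thread()


def run_in_worker_thread(func):
    """Call func on a separate thread and return its result."""
    result = []
    worker = threading.Thread(target=lambda: result.append(func()))
    worker.start()
    worker.join()
    return result[0]


# Inside a plotting test the guard would then read (requires pytest):
#     if not on_main_thread():
#         pytest.skip(reason="not on main thread")
```

Calling `run_in_worker_thread(on_main_thread)` returns `False`, which is the situation the skip is meant to catch when xdist ends up running a test off the main thread.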
An alternative could be to add a pytest marker for tests that call plot functions, plus a separate CI job that runs only those tests serially. There aren't that many of them, so it shouldn't increase CI runtimes much.
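
A rough sketch of how the marker approach might look in a `conftest.py`. The marker name `plot` is an assumption chosen for illustration, and the commands in the comments assume `-n` is not already set through `addopts` in the pytest configuration.

```python
# conftest.py (sketch)

PLOT_MARKER = "plot"  # hypothetical marker name


def pytest_configure(config):
    """Register the marker so `pytest --strict-markers` accepts it."""
    config.addinivalue_line(
        "markers", f"{PLOT_MARKER}: test calls matplotlib; run without xdist"
    )


# In a test module (requires `import pytest`):
#     @pytest.mark.plot
#     def test_plot_something(): ...
#
# CI could then split the run into two steps:
#     pytest -m "not plot" -n auto   # parallel, via pytest-xdist
#     pytest -m plot                 # serial, no xdist workers
```

Keeping the plotting tests in their own serial step means they run in the pytest process itself rather than in an xdist worker, at the cost of one extra CI step.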
Update: trying the non-interactive agg backend for Mac CI in #1495
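
For reference, one way to select the agg backend is through matplotlib's `MPLBACKEND` environment variable, set before matplotlib is first imported. The sketch below is an assumption about how this could be done in a `conftest.py`, not necessarily what #1495 does, and the function name `request_agg_backend` is hypothetical.

```python
import os


def request_agg_backend(environ=os.environ):
    """Ask matplotlib for the non-interactive Agg backend.

    Must run before the first `import matplotlib`. Uses setdefault so an
    explicit MPLBACKEND already set by the user or the CI job is respected.
    """
    environ.setdefault("MPLBACKEND", "agg")
    return environ["MPLBACKEND"]


# In conftest.py this would be called at import time:
#     request_agg_backend()
```

The same effect can be had by exporting `MPLBACKEND=agg` in the CI job's environment, which avoids touching the test code at all.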