Closed
Description
Required prerequisites
- Make sure you've read the documentation. Your issue may be addressed there.
- Search the issue tracker and Discussions to verify that this hasn't already been reported. +1 or comment there if it has.
- Consider asking first in the Gitter chat room or in a Discussion.
Problem description
I have a segmentation fault on macOS that only appears when using the conda builds of Python. I haven't been able to solve this one myself, sorry.
In short: when using the package I've built with pybind11, I cannot import the libraries from Python without a segfault. I've verified this with Python 3.6, 3.9, and 3.10, using the latest version of pybind11. I have a standalone repository that reproduces this bug.
Here is the stack trace when running with lldb; it appears to be related to `take_gil`:
>>> import larcv
Process 24818 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x10)
frame #0: 0x00000001041afc17 libpython3.10.dylib`take_gil + 71
libpython3.10.dylib`take_gil:
-> 0x1041afc17 <+71>: movq 0x10(%rax), %r13
0x1041afc1b <+75>: leaq 0x1b0(%r13), %r12
0x1041afc22 <+82>: movq %r12, %rdi
0x1041afc25 <+85>: callq 0x1042e1212 ; symbol stub for: pthread_mutex_lock
Target 0: (python) stopped.
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x10)
* frame #0: 0x00000001041afc17 libpython3.10.dylib`take_gil + 71
frame #1: 0x0000000104226230 libpython3.10.dylib`PyGILState_Ensure + 48
frame #2: 0x0000000101e495df pylarcv.cpython-310-darwin.so`___lldb_unnamed_symbol1$$pylarcv.cpython-310-darwin.so + 63
frame #3: 0x0000000101e490a6 pylarcv.cpython-310-darwin.so`PyInit_pylarcv + 118
frame #4: 0x00000001001fd17e python`_imp_create_dynamic + 1486
frame #5: 0x00000001000e75a5 python`cfunction_vectorcall_FASTCALL + 85
frame #6: 0x00000001001b2b9a python`_PyEval_EvalFrameDefault + 2986
frame #7: 0x00000001001b0588 python`_PyEval_Vector + 376
frame #8: 0x00000001001c16ee python`call_function + 174
frame #9: 0x00000001001b8fec python`_PyEval_EvalFrameDefault + 28668
frame #10: 0x00000001001b0588 python`_PyEval_Vector + 376
frame #11: 0x00000001001c16ee python`call_function + 174
frame #12: 0x00000001001b79b2 python`_PyEval_EvalFrameDefault + 22978
frame #13: 0x00000001001b0588 python`_PyEval_Vector + 376
frame #14: 0x00000001001c16ee python`call_function + 174
frame #15: 0x00000001001b6cd3 python`_PyEval_EvalFrameDefault + 19683
frame #16: 0x00000001001b0588 python`_PyEval_Vector + 376
frame #17: 0x00000001001c16ee python`call_function + 174
frame #18: 0x00000001001b6cd3 python`_PyEval_EvalFrameDefault + 19683
frame #19: 0x00000001001b0588 python`_PyEval_Vector + 376
frame #20: 0x00000001001c16ee python`call_function + 174
frame #21: 0x00000001001b6cd3 python`_PyEval_EvalFrameDefault + 19683
frame #22: 0x00000001001b0588 python`_PyEval_Vector + 376
frame #23: 0x000000010008577b python`object_vacall + 427
frame #24: 0x0000000100085a29 python`_PyObject_CallMethodIdObjArgs + 249
frame #25: 0x00000001001f8a64 python`PyImport_ImportModuleLevelObject + 3076
frame #26: 0x00000001001b8410 python`_PyEval_EvalFrameDefault + 25632
frame #27: 0x00000001001b0588 python`_PyEval_Vector + 376
frame #28: 0x00000001001aa979 python`builtin_exec + 345
frame #29: 0x00000001000e75a5 python`cfunction_vectorcall_FASTCALL + 85
frame #30: 0x00000001001b2b9a python`_PyEval_EvalFrameDefault + 2986
frame #31: 0x00000001001b0588 python`_PyEval_Vector + 376
frame #32: 0x00000001001c16ee python`call_function + 174
frame #33: 0x00000001001b8fec python`_PyEval_EvalFrameDefault + 28668
frame #34: 0x00000001001b0588 python`_PyEval_Vector + 376
frame #35: 0x00000001001c16ee python`call_function + 174
frame #36: 0x00000001001b79b2 python`_PyEval_EvalFrameDefault + 22978
frame #37: 0x00000001001b0588 python`_PyEval_Vector + 376
frame #38: 0x00000001001c16ee python`call_function + 174
frame #39: 0x00000001001b6cd3 python`_PyEval_EvalFrameDefault + 19683
frame #40: 0x00000001001b0588 python`_PyEval_Vector + 376
frame #41: 0x00000001001c16ee python`call_function + 174
frame #42: 0x00000001001b6cd3 python`_PyEval_EvalFrameDefault + 19683
frame #43: 0x00000001001b0588 python`_PyEval_Vector + 376
frame #44: 0x000000010008577b python`object_vacall + 427
frame #45: 0x0000000100085a29 python`_PyObject_CallMethodIdObjArgs + 249
frame #46: 0x00000001001f8a64 python`PyImport_ImportModuleLevelObject + 3076
frame #47: 0x00000001001b8410 python`_PyEval_EvalFrameDefault + 25632
frame #48: 0x00000001001b0588 python`_PyEval_Vector + 376
frame #49: 0x00000001002277a9 python`PyRun_InteractiveOneObjectEx + 1049
frame #50: 0x000000010022640a python`_PyRun_InteractiveLoopObject + 122
frame #51: 0x0000000100225cbf python`_PyRun_AnyFileObject + 63
frame #52: 0x000000010022a106 python`PyRun_AnyFileExFlags + 118
frame #53: 0x0000000100250f2f python`pymain_run_stdin + 175
frame #54: 0x000000010025057d python`pymain_run_python + 509
frame #55: 0x0000000100250335 python`Py_RunMain + 37
frame #56: 0x0000000100251910 python`pymain_main + 64
frame #57: 0x00000001000026d8 python`main + 56
frame #58: 0x000000010049a51e dyld`start + 462
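One thing the trace above shows is that frames #0 and #1 resolve into `libpython3.10.dylib`, while the interpreter frames (`_PyEval_EvalFrameDefault` and friends) resolve into the `python` executable itself. That would be consistent with two copies of the interpreter runtime being present in the process, though the trace alone does not prove it. Below is a minimal sketch for summarising which binary image each frame belongs to; the helper name `images_in_backtrace` and the regular expression are illustrative assumptions, not part of lldb or pybind11.

```python
import re

# Matches lldb backtrace lines such as:
#   frame #1: 0x0000000104226230 libpython3.10.dylib`PyGILState_Ensure + 48
# Group 1 is the frame number, group 2 is the binary image name.
FRAME_RE = re.compile(r"frame #(\d+): 0x[0-9a-fA-F]+ ([^`]+)`")


def images_in_backtrace(backtrace: str) -> dict:
    """Map each binary image to the frame numbers that resolve into it."""
    images = {}
    for line in backtrace.splitlines():
        match = FRAME_RE.search(line)
        if match:
            images.setdefault(match.group(2), []).append(int(match.group(1)))
    return images


# The first five frames of the backtrace reported above.
sample = """\
  * frame #0: 0x00000001041afc17 libpython3.10.dylib`take_gil + 71
    frame #1: 0x0000000104226230 libpython3.10.dylib`PyGILState_Ensure + 48
    frame #2: 0x0000000101e495df pylarcv.cpython-310-darwin.so`___lldb_unnamed_symbol1$$pylarcv.cpython-310-darwin.so + 63
    frame #3: 0x0000000101e490a6 pylarcv.cpython-310-darwin.so`PyInit_pylarcv + 118
    frame #4: 0x00000001001fd17e python`_imp_create_dynamic + 1486
"""

for image, frames in images_in_backtrace(sample).items():
    print(image, frames)
```

Pasting the full `bt` output into `sample` gives the same kind of summary for all 59 frames.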
Reproducible example code
This repository can reproduce the bug. Sorry if you wanted something smaller; this is about as small as I can make it, and it is nearly standalone (obviously, you need conda to run it):
git@github.com:coreyjadams/larcv3-pybind11-example.git
To replicate the bug, you need to be on macOS (I am on Monterey, the latest) and using miniconda. I created an environment for each test I did:
conda create -n test-env-python-3.10 # Accept any questions, etc
conda activate test-env-python-3.10 # Activate the environment
conda install python=3.10 cmake scikit-build # The dependencies are just build systems.
Then, after cloning the repository I linked above, one can do:
```bash
git submodule update --init  # pybind11 is a submodule here
python setup.py build        # Trigger scikit-build to run cmake
python setup.py install
```
From a different directory (otherwise, it tries to import the `larcv` folder in the repo), do:
>>> import larcv
And it ought to reproduce the crash.
henryiii commented on May 11, 2022
Conda doesn't support building from python, only from conda-build. You are likely mixing the system compilers and the conda compilers, causing the crash. Try `conda install compilers`; that might get it to use the conda compilers (make sure you remove any caching, like `_skbuild`).
wolfv commented on Jun 29, 2022
I do see this issue as well on macOS x64 -- but I am pretty sure I am using the conda compilers :)
I tried to add `-undefined dynamic_lookup`, which helped in the past, and I tried to remove the `CMAKE_STRIP` step, but none of that has helped so far. Will investigate further.
It's failing for us for `rclpy`, which is a dependency of `ROS`, the Robot Operating System. Same exact error.
wolfv commented on Jun 29, 2022
Hm, I managed to replicate the issue with your example `larcv` code.
The solution seems to boil down to not explicitly linking `Python` in the lower-level libraries (or anywhere) and trusting `-undefined dynamic_lookup` instead.
I've added
and removed any instances of linking to `${Python_LIBRARIES}`, and things then seem to work. I think `pybind11_add_module` automatically sets that linker flag already.
coreyjadams commented on Jun 29, 2022
@wolfv thanks for this tip! I will test it out tomorrow and get back to you, that'd be awesome to have this resolved.
wolfv commented on Jun 29, 2022
In my case, `pybind11_add_module(blabla SHARED ...)` did not work; however, `pybind11_add_module(blabla MODULE ...)` works.
`python` executable to `libpython` and disable the shared library astral-sh/python-build-standalone#540
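To check whether a built extension still links `libpython` explicitly (the situation wolfv describes removing above), one option on macOS is to inspect the output of `otool -L`. The following is a rough sketch, not a definitive diagnostic: the helper names `linked_libpython` and `check_extension` are made up for illustration, the sample outputs are hand-written examples rather than output captured from the `larcv` build, and the pattern only looks for `libpython*.dylib` entries (it would not catch a framework build of Python).

```python
import re
import subprocess

# Matches an indented dependency line from `otool -L` that names a libpython
# dylib, e.g. "\t@rpath/libpython3.10.dylib (compatibility version ...)".
LIBPYTHON_RE = re.compile(r"^[ \t]+(\S*libpython\S*\.dylib)\b", re.MULTILINE)


def linked_libpython(otool_output: str) -> list:
    """Return the libpython dylibs listed as load-time dependencies."""
    return LIBPYTHON_RE.findall(otool_output)


def check_extension(path: str) -> list:
    """Run `otool -L` on an extension module (macOS only) and inspect it."""
    result = subprocess.run(
        ["otool", "-L", path], capture_output=True, text=True, check=True
    )
    return linked_libpython(result.stdout)


# Hand-written sample outputs (assumptions, for illustration only).
explicitly_linked = (
    "pylarcv.cpython-310-darwin.so:\n"
    "\t@rpath/libpython3.10.dylib (compatibility version 3.10.0, current version 3.10.0)\n"
    "\t/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1311.100.3)\n"
)
dynamic_lookup_only = (
    "pylarcv.cpython-310-darwin.so:\n"
    "\t/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1311.100.3)\n"
)

print(linked_libpython(explicitly_linked))
print(linked_libpython(dynamic_lookup_only))
```

An empty list suggests the module leaves the Python symbols to be resolved at load time from whatever interpreter imports it, which is the setup that reportedly worked in this thread. On a Mac, `check_extension("/path/to/pylarcv.cpython-310-darwin.so")` would run the same check against a real build (the path here is a placeholder).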