Out-of-bounds access in pthreadpool when reducing threadpool size on macos #14321

@GregoryComer

🐛 Describe the bug

When reducing the threadpool size using _unsafe_reset_threadpool (or by changing the default threadpool size in code), there is a reproducible out-of-bounds read. It sometimes manifests as a native crash and is caught 100% of the time by ASan. I found this while troubleshooting occasional segfaults after landing #14090, but it is still reproducible on the latest master (post-revert).

To reproduce, update backends/xnnpack/test/ops/test_bilinear2d.py to import and call _unsafe_reset_threadpool(6):

from executorch.extension.pybindings.portable_lib import _unsafe_reset_threadpool
...
    def test_fp32_static_resize_bilinear2d(self):
        _unsafe_reset_threadpool(6) # <-- Add this line

pytest -c /dev/nul backends/xnnpack/test/ops/test_bilinear2d.py::TestUpsampleBilinear2d::test_fp32_static_resize_bilinear2d

This will sometimes cause a crash on an M1 Mac. To catch it with ASan, add the following lines to the top-level CMakeLists.txt, then re-run install_executorch.py. You'll also need to set DYLD_INSERT_LIBRARIES; running without it set prints the correct value to pass.

add_compile_options(-fsanitize=address -Wno-deprecated-declarations)
add_link_options(-fsanitize=address)
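
For setting DYLD_INSERT_LIBRARIES, a small wrapper along these lines might save a round trip. It is only a sketch: it assumes clang is on PATH and that `clang -print-file-name=...` resolves the same ASan runtime the build was linked against, which may not hold if several toolchains are installed. The helper names (`find_asan_runtime`, `asan_env`, `run_repro`) are made up for this example.

```python
import os
import subprocess

# Runtime library name as it appears in the ASan trace in this issue.
ASAN_DYLIB = "libclang_rt.asan_osx_dynamic.dylib"

# Same pytest invocation as in the repro steps above.
REPRO_CMD = [
    "python", "-m", "pytest", "-c", "/dev/nul",
    "backends/xnnpack/test/ops/test_bilinear2d.py"
    "::TestUpsampleBilinear2d::test_fp32_static_resize_bilinear2d",
]


def find_asan_runtime(clang="clang"):
    """Ask clang for the path of its ASan runtime (assumes clang is on PATH)."""
    result = subprocess.run(
        [clang, f"-print-file-name={ASAN_DYLIB}"],
        check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()


def asan_env(dylib_path, base_env=None):
    """Return a copy of the environment with DYLD_INSERT_LIBRARIES set."""
    env = dict(os.environ if base_env is None else base_env)
    env["DYLD_INSERT_LIBRARIES"] = dylib_path
    return env


def run_repro():
    """Run the failing test under ASan; call this from the executorch checkout."""
    return subprocess.run(REPRO_CMD, env=asan_env(find_asan_runtime()))
```

If the path clang reports differs from the one ASan prints when DYLD_INSERT_LIBRARIES is unset, prefer the value ASan prints.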

From the trace included below, it's interesting to note that the threadpool is created by pthreadpool_create from libtorch_cpu (?), but invoked through the pthreadpool linked into ET/XNNPACK (pthreadpool_parallelize_3d_tile_2d_dynamic in _portable_lib). I'm guessing this mismatch might be the issue, as XNNPACK is using a forked version of pthreadpool.
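
To make the guess concrete, here is a toy model of how two builds that disagree on the pool's memory layout could produce a read past the end of the allocation. Every size in it is invented for illustration (picked so the total happens to equal the 640-byte region in the report) and is not taken from either pthreadpool source. It shows the general failure mode, not a confirmed root cause.

```python
# Toy model only: every size below is invented for illustration and is NOT
# taken from either pthreadpool implementation.


def pool_bytes(header, per_thread, threads):
    """Bytes a creator would allocate: one header plus one record per thread."""
    return header + per_thread * threads


def record_offset(header, per_thread, index):
    """Byte offset at which a caller expects thread `index`'s record to start."""
    return header + per_thread * index


THREADS = 6
CREATOR_LAYOUT = {"header": 256, "per_thread": 64}   # hypothetical creator build
CALLER_LAYOUT = {"header": 256, "per_thread": 128}   # hypothetical caller build
READ_SIZE = 8  # the report shows a READ of size 8

# The creator sizes the allocation with its own layout...
allocated = pool_bytes(threads=THREADS, **CREATOR_LAYOUT)

# ...but the caller indexes into it with a different per-thread stride.
out_of_bounds = [
    i for i in range(THREADS)
    if record_offset(index=i, **CALLER_LAYOUT) + READ_SIZE > allocated
]

print(f"allocated={allocated} bytes, out-of-bounds thread records: {out_of_bounds}")
```

In this model the first few thread records still fall inside the allocation, which would be consistent with a bug that only crashes occasionally without a sanitizer.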

ASan Trace

(executorch) gjcomer@gjcomer-mbp executorch % DYLD_INSERT_LIBRARIES=/Applications/Xcode_15.1.0_15C65_fb.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib/clang/15.0.0/lib/darwin/libclang_rt.asan_osx_dynamic.dylib python -m pytest -c /dev/nul backends/xnnpack/test/ops/test_bilinear2d.py::TestUpsampleBilinear2d::test_fp32_static_resize_bilinear2d
================================================================================================================ test session starts ================================================================================================================
platform darwin -- Python 3.10.13, pytest-8.4.1, pluggy-1.5.0
rootdir: /dev
configfile: nul
plugins: repeat-0.9.4, xdist-3.4.0, cov-4.1.0, rerunfailures-15.1, kgb-7.2, anyio-4.4.0, hydra-core-1.3.2, hypothesis-6.84.2
collecting ... W0915 16:13:50.875000 83987 site-packages/torch/distributed/elastic/multiprocessing/redirects.py:29] NOTE: Redirects are currently not supported in Windows or MacOs.
collected 1 item

../../../../dev =================================================================
==83987==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x00010c6e45e8 at pc 0x000152084380 bp 0x00016d966530 sp 0x00016d966528
READ of size 8 at 0x00010c6e45e8 thread T0
    #0 0x15208437c in pthreadpool_parallelize_3d_tile_2d_dynamic+0x2b4 (_portable_lib.cpython-310-darwin.so:arm64+0x120437c)
    #1 0x151013640  (_portable_lib.cpython-310-darwin.so:arm64+0x193640)
    #2 0x151040ff4 in xnn_invoke_runtime+0xe4 (_portable_lib.cpython-310-darwin.so:arm64+0x1c0ff4)
    #3 0x150f61c40 in executorch::backends::xnnpack::delegate::XNNExecutor::forward(executorch::runtime::BackendExecutionContext&)+0x170 (_portable_lib.cpython-310-darwin.so:arm64+0xe1c40)
    #4 0x150f63e5c  (_portable_lib.cpython-310-darwin.so:arm64+0xe3e5c)
    #5 0x15206330c in executorch::runtime::Method::execute_instruction()+0x1064 (_portable_lib.cpython-310-darwin.so:arm64+0x11e330c)
    #6 0x152064a4c in executorch::runtime::Method::execute()+0x36c (_portable_lib.cpython-310-darwin.so:arm64+0x11e4a4c)
    #7 0x151179418 in executorch::extension::module::Module::execute(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, std::__1::vector<executorch::runtime::EValue, std::__1::allocator<executorch::runtime::EValue>> const&)+0x1dc (_portable_lib.cpython-310-darwin.so:arm64+0x2f9418)
    #8 0x1520f33d8  (_portable_lib.cpython-310-darwin.so:arm64+0x12733d8)
    #9 0x152151224  (_portable_lib.cpython-310-darwin.so:arm64+0x12d1224)
    #10 0x1520f6174  (_portable_lib.cpython-310-darwin.so:arm64+0x1276174)
    #11 0x1520ad808  (_portable_lib.cpython-310-darwin.so:arm64+0x122d808)
    #12 0x102548b98 in cfunction_call+0x38 (python3.10:arm64+0x1000c0b98)
    #13 0x1024e9f30 in _PyObject_MakeTpCall+0x14c (python3.10:arm64+0x100061f30)
    #14 0x1024efdec in method_vectorcall+0x258 (python3.10:arm64+0x100067dec)
    #15 0x102602e48 in _PyEval_EvalFrameDefault+0x9cfc (python3.10:arm64+0x10017ae48)
    #16 0x1024eb544 in _PyFunction_Vectorcall+0x220 (python3.10:arm64+0x100063544)
    #17 0x102601668 in _PyEval_EvalFrameDefault+0x851c (python3.10:arm64+0x100179668)
    #18 0x1024eb544 in _PyFunction_Vectorcall+0x220 (python3.10:arm64+0x100063544)
    #19 0x102601668 in _PyEval_EvalFrameDefault+0x851c (python3.10:arm64+0x100179668)
    #20 0x1024eb544 in _PyFunction_Vectorcall+0x220 (python3.10:arm64+0x100063544)
    #21 0x1024efc0c in method_vectorcall+0x78 (python3.10:arm64+0x100067c0c)
    #22 0x102621fac in call_function+0x90 (python3.10:arm64+0x100199fac)
    #23 0x1025fafe4 in _PyEval_EvalFrameDefault+0x1e98 (python3.10:arm64+0x100172fe4)
    #24 0x1024eb544 in _PyFunction_Vectorcall+0x220 (python3.10:arm64+0x100063544)
    #25 0x102601668 in _PyEval_EvalFrameDefault+0x851c (python3.10:arm64+0x100179668)
    #26 0x1024eb544 in _PyFunction_Vectorcall+0x220 (python3.10:arm64+0x100063544)
    #27 0x1024efc0c in method_vectorcall+0x78 (python3.10:arm64+0x100067c0c)
    #28 0x1026012d4 in _PyEval_EvalFrameDefault+0x8188 (python3.10:arm64+0x1001792d4)
    #29 0x1024eb544 in _PyFunction_Vectorcall+0x220 (python3.10:arm64+0x100063544)
    #30 0x1024ecfd8 in _PyObject_Call_Prepend+0x134 (python3.10:arm64+0x100064fd8)
    #31 0x102575a88 in slot_tp_call+0xe4 (python3.10:arm64+0x1000eda88)
    #32 0x1026221a8 in call_function+0x28c (python3.10:arm64+0x10019a1a8)
    #33 0x1025fb050 in _PyEval_EvalFrameDefault+0x1f04 (python3.10:arm64+0x100173050)
    #34 0x1024eb544 in _PyFunction_Vectorcall+0x220 (python3.10:arm64+0x100063544)
    #35 0x102601668 in _PyEval_EvalFrameDefault+0x851c (python3.10:arm64+0x100179668)
    #36 0x1024eb544 in _PyFunction_Vectorcall+0x220 (python3.10:arm64+0x100063544)
    #37 0x10260116c in _PyEval_EvalFrameDefault+0x8020 (python3.10:arm64+0x10017916c)
    #38 0x1024eb544 in _PyFunction_Vectorcall+0x220 (python3.10:arm64+0x100063544)
    #39 0x102602e48 in _PyEval_EvalFrameDefault+0x9cfc (python3.10:arm64+0x10017ae48)
    #40 0x1024eb544 in _PyFunction_Vectorcall+0x220 (python3.10:arm64+0x100063544)
    #41 0x1024efc0c in method_vectorcall+0x78 (python3.10:arm64+0x100067c0c)
    #42 0x102602e48 in _PyEval_EvalFrameDefault+0x9cfc (python3.10:arm64+0x10017ae48)
    #43 0x1024eb544 in _PyFunction_Vectorcall+0x220 (python3.10:arm64+0x100063544)
    #44 0x1024ecfd8 in _PyObject_Call_Prepend+0x134 (python3.10:arm64+0x100064fd8)
    #45 0x102575a88 in slot_tp_call+0xe4 (python3.10:arm64+0x1000eda88)
    #46 0x102601294 in _PyEval_EvalFrameDefault+0x8148 (python3.10:arm64+0x100179294)
    #47 0x1024eb544 in _PyFunction_Vectorcall+0x220 (python3.10:arm64+0x100063544)
    #48 0x102621fac in call_function+0x90 (python3.10:arm64+0x100199fac)
    #49 0x1025fafe4 in _PyEval_EvalFrameDefault+0x1e98 (python3.10:arm64+0x100172fe4)
    #50 0x1024eb544 in _PyFunction_Vectorcall+0x220 (python3.10:arm64+0x100063544)
    #51 0x1024efc0c in method_vectorcall+0x78 (python3.10:arm64+0x100067c0c)
    #52 0x102621fac in call_function+0x90 (python3.10:arm64+0x100199fac)
    #53 0x1025fb050 in _PyEval_EvalFrameDefault+0x1f04 (python3.10:arm64+0x100173050)
    #54 0x1024eb544 in _PyFunction_Vectorcall+0x220 (python3.10:arm64+0x100063544)
    #55 0x102621fac in call_function+0x90 (python3.10:arm64+0x100199fac)
    #56 0x1025fafe4 in _PyEval_EvalFrameDefault+0x1e98 (python3.10:arm64+0x100172fe4)
    #57 0x1024eb544 in _PyFunction_Vectorcall+0x220 (python3.10:arm64+0x100063544)
    #58 0x102621fac in call_function+0x90 (python3.10:arm64+0x100199fac)
    #59 0x1025fb050 in _PyEval_EvalFrameDefault+0x1f04 (python3.10:arm64+0x100173050)
    #60 0x1024eb544 in _PyFunction_Vectorcall+0x220 (python3.10:arm64+0x100063544)
    #61 0x10260116c in _PyEval_EvalFrameDefault+0x8020 (python3.10:arm64+0x10017916c)
    #62 0x1024eb544 in _PyFunction_Vectorcall+0x220 (python3.10:arm64+0x100063544)
    #63 0x102602e48 in _PyEval_EvalFrameDefault+0x9cfc (python3.10:arm64+0x10017ae48)
    #64 0x1024eb544 in _PyFunction_Vectorcall+0x220 (python3.10:arm64+0x100063544)
    #65 0x1024efc0c in method_vectorcall+0x78 (python3.10:arm64+0x100067c0c)
    #66 0x102602e48 in _PyEval_EvalFrameDefault+0x9cfc (python3.10:arm64+0x10017ae48)
    #67 0x1024eb544 in _PyFunction_Vectorcall+0x220 (python3.10:arm64+0x100063544)
    #68 0x1024ecfd8 in _PyObject_Call_Prepend+0x134 (python3.10:arm64+0x100064fd8)
    #69 0x102575a88 in slot_tp_call+0xe4 (python3.10:arm64+0x1000eda88)
    #70 0x1026221a8 in call_function+0x28c (python3.10:arm64+0x10019a1a8)
    #71 0x1025fb050 in _PyEval_EvalFrameDefault+0x1f04 (python3.10:arm64+0x100173050)
    #72 0x1024eb544 in _PyFunction_Vectorcall+0x220 (python3.10:arm64+0x100063544)
    #73 0x10260116c in _PyEval_EvalFrameDefault+0x8020 (python3.10:arm64+0x10017916c)
    #74 0x1024eb544 in _PyFunction_Vectorcall+0x220 (python3.10:arm64+0x100063544)
    #75 0x102602e48 in _PyEval_EvalFrameDefault+0x9cfc (python3.10:arm64+0x10017ae48)
    #76 0x1024eb544 in _PyFunction_Vectorcall+0x220 (python3.10:arm64+0x100063544)
    #77 0x1024efc0c in method_vectorcall+0x78 (python3.10:arm64+0x100067c0c)
    #78 0x102602e48 in _PyEval_EvalFrameDefault+0x9cfc (python3.10:arm64+0x10017ae48)
    #79 0x1024eb544 in _PyFunction_Vectorcall+0x220 (python3.10:arm64+0x100063544)
    #80 0x1024ecfd8 in _PyObject_Call_Prepend+0x134 (python3.10:arm64+0x100064fd8)
    #81 0x102575a88 in slot_tp_call+0xe4 (python3.10:arm64+0x1000eda88)
    #82 0x1026221a8 in call_function+0x28c (python3.10:arm64+0x10019a1a8)
    #83 0x1025fb050 in _PyEval_EvalFrameDefault+0x1f04 (python3.10:arm64+0x100173050)
    #84 0x1024eb544 in _PyFunction_Vectorcall+0x220 (python3.10:arm64+0x100063544)
    #85 0x102621fac in call_function+0x90 (python3.10:arm64+0x100199fac)
    #86 0x1025fafe4 in _PyEval_EvalFrameDefault+0x1e98 (python3.10:arm64+0x100172fe4)
    #87 0x1024eb544 in _PyFunction_Vectorcall+0x220 (python3.10:arm64+0x100063544)
    #88 0x102621fac in call_function+0x90 (python3.10:arm64+0x100199fac)
    #89 0x1025fafe4 in _PyEval_EvalFrameDefault+0x1e98 (python3.10:arm64+0x100172fe4)
    #90 0x1024eb544 in _PyFunction_Vectorcall+0x220 (python3.10:arm64+0x100063544)
    #91 0x10260116c in _PyEval_EvalFrameDefault+0x8020 (python3.10:arm64+0x10017916c)
    #92 0x1024eb544 in _PyFunction_Vectorcall+0x220 (python3.10:arm64+0x100063544)
    #93 0x102602e48 in _PyEval_EvalFrameDefault+0x9cfc (python3.10:arm64+0x10017ae48)
    #94 0x1024eb544 in _PyFunction_Vectorcall+0x220 (python3.10:arm64+0x100063544)
    #95 0x1024efc0c in method_vectorcall+0x78 (python3.10:arm64+0x100067c0c)
    #96 0x102602e48 in _PyEval_EvalFrameDefault+0x9cfc (python3.10:arm64+0x10017ae48)
    #97 0x1024eb544 in _PyFunction_Vectorcall+0x220 (python3.10:arm64+0x100063544)
    #98 0x1024ecfd8 in _PyObject_Call_Prepend+0x134 (python3.10:arm64+0x100064fd8)
    #99 0x102575a88 in slot_tp_call+0xe4 (python3.10:arm64+0x1000eda88)
    #100 0x1026221a8 in call_function+0x28c (python3.10:arm64+0x10019a1a8)
    #101 0x1025fb050 in _PyEval_EvalFrameDefault+0x1f04 (python3.10:arm64+0x100173050)
    #102 0x1024eb544 in _PyFunction_Vectorcall+0x220 (python3.10:arm64+0x100063544)
    #103 0x102621fac in call_function+0x90 (python3.10:arm64+0x100199fac)
    #104 0x1025fafe4 in _PyEval_EvalFrameDefault+0x1e98 (python3.10:arm64+0x100172fe4)
    #105 0x1024eb544 in _PyFunction_Vectorcall+0x220 (python3.10:arm64+0x100063544)
    #106 0x102602e48 in _PyEval_EvalFrameDefault+0x9cfc (python3.10:arm64+0x10017ae48)
    #107 0x1025f7440 in _PyEval_Vector+0x210 (python3.10:arm64+0x10016f440)
    #108 0x1025f1e88 in builtin_exec+0x130 (python3.10:arm64+0x100169e88)
    #109 0x10254999c in cfunction_vectorcall_FASTCALL+0x54 (python3.10:arm64+0x1000c199c)
    #110 0x102621fac in call_function+0x90 (python3.10:arm64+0x100199fac)
    #111 0x1025fafe4 in _PyEval_EvalFrameDefault+0x1e98 (python3.10:arm64+0x100172fe4)
    #112 0x1024eb544 in _PyFunction_Vectorcall+0x220 (python3.10:arm64+0x100063544)
    #113 0x102621fac in call_function+0x90 (python3.10:arm64+0x100199fac)
    #114 0x1025fafe4 in _PyEval_EvalFrameDefault+0x1e98 (python3.10:arm64+0x100172fe4)
    #115 0x1024eb544 in _PyFunction_Vectorcall+0x220 (python3.10:arm64+0x100063544)
    #116 0x1026930bc in pymain_run_module+0xd0 (python3.10:arm64+0x10020b0bc)
    #117 0x102692aa4 in pymain_run_python+0xb8 (python3.10:arm64+0x10020aaa4)
    #118 0x102692990 in Py_RunMain+0x24 (python3.10:arm64+0x10020a990)
    #119 0x10248e64c in main+0x34 (python3.10:arm64+0x10000664c)
    #120 0x19e18eb94 in start+0x17b8 (dyld:arm64e+0xfffffffffff3ab94)

0x00010c6e45e8 is located 104 bytes after 640-byte region [0x00010c6e4300,0x00010c6e4580)
allocated by thread T0 here:
    #0 0x1030539a8 in wrap_posix_memalign+0xa4 (libclang_rt.asan_osx_dynamic.dylib:arm64e+0x539a8)
    #1 0x122d0d578 in pthreadpool_allocate+0x28 (libtorch_cpu.dylib:arm64+0x53b9578)
    #2 0x122d0d654 in pthreadpool_create+0x6c (libtorch_cpu.dylib:arm64+0x53b9654)
    #3 0x152051b04 in executorch::extension::threadpool::ThreadPool::_unsafe_reset_threadpool(unsigned int)+0x44 (_portable_lib.cpython-310-darwin.so:arm64+0x11d1b04)
    #4 0x1520ea280  (_portable_lib.cpython-310-darwin.so:arm64+0x126a280)
    #5 0x1520ad808  (_portable_lib.cpython-310-darwin.so:arm64+0x122d808)
    #6 0x102548b98 in cfunction_call+0x38 (python3.10:arm64+0x1000c0b98)
    #7 0x1026221a8 in call_function+0x28c (python3.10:arm64+0x10019a1a8)
    #8 0x1025fafe4 in _PyEval_EvalFrameDefault+0x1e98 (python3.10:arm64+0x100172fe4)
    #9 0x1024eb544 in _PyFunction_Vectorcall+0x220 (python3.10:arm64+0x100063544)
    #10 0x1024efc0c in method_vectorcall+0x78 (python3.10:arm64+0x100067c0c)
    #11 0x102621fac in call_function+0x90 (python3.10:arm64+0x100199fac)
    #12 0x1025fafe4 in _PyEval_EvalFrameDefault+0x1e98 (python3.10:arm64+0x100172fe4)
    #13 0x1024eb544 in _PyFunction_Vectorcall+0x220 (python3.10:arm64+0x100063544)
    #14 0x102601668 in _PyEval_EvalFrameDefault+0x851c (python3.10:arm64+0x100179668)
    #15 0x1024eb544 in _PyFunction_Vectorcall+0x220 (python3.10:arm64+0x100063544)
    #16 0x1024efc0c in method_vectorcall+0x78 (python3.10:arm64+0x100067c0c)
    #17 0x1026012d4 in _PyEval_EvalFrameDefault+0x8188 (python3.10:arm64+0x1001792d4)
    #18 0x1024eb544 in _PyFunction_Vectorcall+0x220 (python3.10:arm64+0x100063544)
    #19 0x1024ecfd8 in _PyObject_Call_Prepend+0x134 (python3.10:arm64+0x100064fd8)
    #20 0x102575a88 in slot_tp_call+0xe4 (python3.10:arm64+0x1000eda88)
    #21 0x1026221a8 in call_function+0x28c (python3.10:arm64+0x10019a1a8)
    #22 0x1025fb050 in _PyEval_EvalFrameDefault+0x1f04 (python3.10:arm64+0x100173050)
    #23 0x1024eb544 in _PyFunction_Vectorcall+0x220 (python3.10:arm64+0x100063544)
    #24 0x102601668 in _PyEval_EvalFrameDefault+0x851c (python3.10:arm64+0x100179668)
    #25 0x1024eb544 in _PyFunction_Vectorcall+0x220 (python3.10:arm64+0x100063544)
    #26 0x10260116c in _PyEval_EvalFrameDefault+0x8020 (python3.10:arm64+0x10017916c)
    #27 0x1024eb544 in _PyFunction_Vectorcall+0x220 (python3.10:arm64+0x100063544)
    #28 0x102602e48 in _PyEval_EvalFrameDefault+0x9cfc (python3.10:arm64+0x10017ae48)
    #29 0x1024eb544 in _PyFunction_Vectorcall+0x220 (python3.10:arm64+0x100063544)

SUMMARY: AddressSanitizer: heap-buffer-overflow (_portable_lib.cpython-310-darwin.so:arm64+0x120437c) in pthreadpool_parallelize_3d_tile_2d_dynamic+0x2b4
Shadow bytes around the buggy address:
  0x00010c6e4300: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x00010c6e4380: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x00010c6e4400: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x00010c6e4480: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x00010c6e4500: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x00010c6e4580: fa fa fa fa fa fa fa fa fa fa fa fa fa[fa]fa fa
  0x00010c6e4600: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x00010c6e4680: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x00010c6e4700: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x00010c6e4780: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x00010c6e4800: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==83987==ABORTING
Fatal Python error: Aborted
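
As a sanity check on the report, the numbers in the "is located ... bytes after" line and the bracketed shadow byte can be recomputed from the addresses in the trace. This sketch uses only values copied from the output above.

```python
# Addresses copied from the ASan report above.
region_start = 0x00010C6E4300
region_end = 0x00010C6E4580    # exclusive upper bound of the heap region
fault_addr = 0x00010C6E45E8    # address of the faulting 8-byte read
shadow_row_base = 0x00010C6E4580  # row marked "=>" in the shadow dump

region_size = region_end - region_start     # size of the allocation
bytes_past_end = fault_addr - region_end    # how far past the end the read landed

# One shadow byte represents 8 application bytes (per the legend).
shadow_index = (fault_addr - shadow_row_base) // 8

print(region_size, bytes_past_end, shadow_index)
```

The results (640, 104, 13) agree with the report: a 640-byte region, a read 104 bytes past its end, and the bracketed [fa] at zero-based position 13 in the marked shadow row.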

Versions

Reproduced on 30a904b on M1 Pro.
