-
Notifications
You must be signed in to change notification settings - Fork 3.4k
Update libcxx & libcxxabi to LLVM 17.0.4 #20707
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update libcxx & libcxxabi to LLVM 17.0.4 #20707
Conversation
PSTL was integrated in libc++ recently and we have to choose one out of three PSTL modes in many places, one of them being https://github.com/llvm/llvm-project/blob/373c343a77a7afaa07179db1754a52b620dfaf2e/libcxx/include/__algorithm/pstl_backends/cpu_backends/backend.h#L15-L23. This is supposed to set in `__config_site`, which is supposed to be generated from https://github.com/llvm/llvm-project/blob/main/libcxx/include/__config_site.in#L33-L36 from https://github.com/llvm/llvm-project/blob/main/libcxx/CMakeLists.txt. Given that we don't use CMake system in Emscripten, we need to hard-code this in this file. Note that this cannot be solved by just adding `-D_LIBCPP_PSTL_CPU_BACKEND_SERIAL` in `libcxx`'s `system_lib.py` flags, because this is needed in header files that are included in user code as well.
`libdispatch.cpp` is related to PSTL's `LIBDISPATCH` mode, whatever that is, which doesn't seem to be necessary for us.
`_LIBCPP_NO_EXCEPTIONS` has changed to `_LIBCPP_HAS_NO_EXCEPTIONS` in LLVM 17, so this fixes it in Emscripten-downstream code.
The wasm code size in `other.test_minimal_runtime_code_Size_hello_embind_val` increased by 50% in LLVM 17 lib update, and the reason was because `__throw_***` functions in `libcxx/include/stdexcept` started using `_LIBCPP_VERBOSE_ABORT` in non-exception mode: https://github.com/llvm/llvm-project/blob/beb121f6322be466384ffaa68b913f58862ed14c/libcxx/include/stdexcept#L227-L305 Note that they used to just call abort() until LLVM 16: https://github.com/llvm/llvm-project/blob/7cbf1a2591520c2491aa35339f227775f4d3adf6/libcxx/include/stdexcept#L220-L306 And that `_LIBCPP_VERBOSE_ABORT` calls `__libcpp_verbose_abort`, in case `_LIBCPP_AVAILABILITY_HAS_NO_VERBOSE_ABORT` is NOT defined: https://github.com/llvm/llvm-project/blob/beb121f6322be466384ffaa68b913f58862ed14c/libcxx/include/__verbose_abort#L53 `__libcpp_verbose_abort` is this: https://github.com/llvm/llvm-project/blob/beb121f6322be466384ffaa68b913f58862ed14c/libcxx/src/verbose_abort.cpp#L31-L75 which calls `vfprintf` and all that jazz, increasing code size a lot. This `__libcpp_verbose_abort` apparently also existed in LLVM 16, but it didn't seem to be used then. So at the moment we define `_LIBCPP_AVAILABILITY_HAS_NO_VERBOSE_ABORT` so that we don't use the new verbose version of printing.
7e99df2
to
5524101
Compare
Awesome! That makes it really easy to see what changed. Thanks for doing that. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work!
tools/system_libs.py
Outdated
@@ -1599,7 +1599,8 @@ class libcxx(NoExceptLibrary, MTLibrary): | |||
'locale_win32.cpp', | |||
'thread_win32.cpp', | |||
'support.cpp', | |||
'int128_builtins.cpp' | |||
'int128_builtins.cpp', | |||
'libdispatch.cpp' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a tailing comma so next time we add a single file its just a one line diff?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
@@ -1 +1 @@ | |||
51616 | |||
60220 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems like a pretty big bump that might be worth investigating. Is print being pulled in where it wasn't before for some reason?
There are a few useful linking command line tricks for tracking down why symbols get included. For example --trace-symbol
and maybe -Wl,-Map=
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah this also turned out to be the __libcpp_verbose_abort
issue. We thought adding -D_LIBCPP_AVAILABILITY_HAS_NO_VERBOSE_ABORT
in libcxx's system_libs.py
would solve the issue, but those routines are in the headers. So these routines can be called whenever a user gives errorneous arguments to string
or vector
allocations, so basically any user code can pull these __libcpp_verbose_abort
in. In this case libwasmfs code pulled in __libcpp_verbose_abort
and thus the printf
family. We can add -D_LIBCPP_AVAILABILITY_HAS_NO_VERBOSE_ABORT
to libwasmfs's cflags
in system_libs.py
but that's just a band-aid for libwasmfs because we can't do that for all user code.
I guess we should just define it in https://github.com/emscripten-core/emscripten/blob/main/system/lib/libcxx/include/__verbose_abort. WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ping 😀
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about defining it system/lib/libcxx/include/__config_site
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
tools/system_libs.py
Outdated
@@ -1581,6 +1581,8 @@ class libcxx(NoExceptLibrary, MTLibrary): | |||
'-DLIBCXX_BUILDING_LIBCXXABI=1', | |||
'-D_LIBCPP_BUILDING_LIBRARY', | |||
'-D_LIBCPP_DISABLE_VISIBILITY_ANNOTATIONS', | |||
# This increases too much code size | |||
'-D_LIBCPP_AVAILABILITY_HAS_NO_VERBOSE_ABORT', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this be removed now that its in the config_site?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah right, forgot to do that.
In emscripten-core#20707 we decided to disable `__libcpp_verbose_abort` not to incur the code size increase that brings (it brings in `vfprintf` and its family). We disabled it in adding `#define _LIBCPP_AVAILABILITY_HAS_NO_VERBOSE_ABORT` in `__config_site`. That `_NO_` macros are gone in LLVM 18.1.2, so I changed it to `#define _LIBCPP_AVAILABILITY_HAS_VERBOSE_ABORT 0` emscripten-core@2bd7c67 But apparently, when including `__availability`, that setting is overridden by this line: https://github.com/llvm/llvm-project/blob/38f5596feda3276a8aa64fc14e074334017088ca/libcxx/include/__availability#L138 because we don't define our own vendor availability annotations https://github.com/emscripten-core/emscripten/blob/193b2bff980529dbf5d067f01761ed8a6e41189f/system/lib/libcxx/include/__config_site#L4 I asked about this in llvm/llvm-project#87012 and llvm/llvm-project#71002 (comment) (and more comments below) but didn't get an actionable answer yet. This disables `__libcpp_verbose_abort` in the code directly for now, hopefully temporarily.
This updates libcxx and libcxxabi to LLVM 18.1.2. Due to llvm/llvm-project#65534 we need to update both at the same time. Things to note: - Disable hardening mode: LLVM 18.1.2 adds support for "Hardening Modes" to libcxx (https://libcxx.llvm.org/Hardening.html). There are four modes: none, fast, extensive and debug. Different hardening modes make different trade-offs between the amount of checking and runtime performance. We for now disable it (i.e., set it to 'none') so that we don't have any effects on the performance. We can consider enabling it when we ever get to enable the debug version of libcxx. - Disable C++20 time zone support: LLVM 18.1.2 adds C++20 time zone support (https://libcxx.llvm.org/DesignDocs/TimeZone.html) to libcxx, which requires access IANA Time Zone Database. Currently it seems it only supports Linux: https://github.com/llvm/llvm-project/blob/26a1d6601d727a96f4301d0d8647b5a42760ae0c/libcxx/src/tz.cpp#L45-L49 So this excludes the two source files from build (which is done via `CMakeLists.txt` in the upstream LLVM) and sets `_LIBCPP_HAS_NO_TIME_ZONE_DATABASE` macro in `__config_site`. In future maybe we can consider implementing this in JS. - `__cxa_init_primary_exception` support: llvm/llvm-project#65534 introduces `__cxa_init_primary_exception` and uses this in libcxx. Like several other methods like the below in `cxa_exception.cpp`, https://github.com/emscripten-core/emscripten/blob/fbdd9249e939660cb0d20f00d6bc1897c2f3905e/system/lib/libcxxabi/src/cxa_exception.cpp#L264-L269 this new function takes a pointer to a destructor, and in Wasm destructor returns a `this` pointer, which requires `ifdef __USING_WASM_EXCEPTION__` on its signature. And that it is also used in libcxx means we need to guard some code in libcxx with `ifdef __USING_WASM_EXCEPTION__` for the signature difference, and we need to build libcxx with `-D__USING_WASM_EXCEPTIONS__` when Wasm EH is used. Also we add Emscripte EH's counterpart in `cxa_exception_emscripten.cpp` too. - This version fixes long-running misbehaviors of `operator new` llvm/llvm-project#69498 seems to fix some long-running misbhaviors of `operator new`, which in emscripten we tried to fix in #11079 and #20154. So while resolving the conflicts, I effectively reverted #11079 and #20154 in `libcxx/src/stdlib_new_delete.cpp` and `libcxx/src/new.cpp`. - Add default `__assertion_handler` to `libcxx/include`: llvm/llvm-project#77883 adds a way for vendors to override how we handle assertions. If a vendor doesn't want to override it, CMake will copy the provided [default template assertion handler file](https://github.com/llvm/llvm-project/blob/26a1d6601d727a96f4301d0d8647b5a42760ae0c/libcxx/vendor/llvm/default_assertion_handler.in) to libcxx's generated include dir, which defaults to `SYSROOT/include/c++/v1`: https://github.com/llvm/llvm-project/blob/26a1d6601d727a96f4301d0d8647b5a42760ae0c/libcxx/CMakeLists.txt#L72-L73 https://github.com/llvm/llvm-project/blob/26a1d6601d727a96f4301d0d8647b5a42760ae0c/libcxx/CMakeLists.txt#L439 https://github.com/llvm/llvm-project/blob/26a1d6601d727a96f4301d0d8647b5a42760ae0c/libcxx/include/CMakeLists.txt#L1024 We don't use CMake, so this renames the provided `vendor/llvm/default_assertion_handler.in` to `__assert_handler` in our `libcxx/include` directory, which will be copied to `SYSROOT/include/c++/v1`. - Disable `__libcpp_verbose_abort directly` in code: In #20707 we decided to disable `__libcpp_verbose_abort` not to incur the code size increase that brings (it brings in `vfprintf` and its family). We disabled it in adding `#define _LIBCPP_AVAILABILITY_HAS_NO_VERBOSE_ABORT` in `__config_site`. But that `_NO_` macros are gone in LLVM 18.1.2, and changing it to `#define _LIBCPP_AVAILABILITY_HAS_VERBOSE_ABORT 0` does not work because it is overridden by this line, unless we provide our own vendor annotations: https://github.com/llvm/llvm-project/blob/38f5596feda3276a8aa64fc14e074334017088ca/libcxx/include/__availability#L138 I asked about this in llvm/llvm-project#87012 and llvm/llvm-project#71002 (comment) (and more comments below) but didn't get an actionable answer yet. So this disables `__libcpp_verbose_abort` in the code directly for now, hopefully temporarily. Each fix is described in its own commit, except for the fixes required to resolve conflicts when merging our changes, which I wasn't able to commit separately. Fixes #21603.
This updates libcxx and libcxxabi to LLVM 18.1.2. Due to llvm/llvm-project#65534 we need to update both at the same time. Things to note: - Disable hardening mode: LLVM 18.1.2 adds support for "Hardening Modes" to libcxx (https://libcxx.llvm.org/Hardening.html). There are four modes: none, fast, extensive and debug. Different hardening modes make different trade-offs between the amount of checking and runtime performance. We for now disable it (i.e., set it to 'none') so that we don't have any effects on the performance. We can consider enabling it when we ever get to enable the debug version of libcxx. - Disable C++20 time zone support: LLVM 18.1.2 adds C++20 time zone support (https://libcxx.llvm.org/DesignDocs/TimeZone.html) to libcxx, which requires access IANA Time Zone Database. Currently it seems it only supports Linux: https://github.com/llvm/llvm-project/blob/26a1d6601d727a96f4301d0d8647b5a42760ae0c/libcxx/src/tz.cpp#L45-L49 So this excludes the two source files from build (which is done via `CMakeLists.txt` in the upstream LLVM) and sets `_LIBCPP_HAS_NO_TIME_ZONE_DATABASE` macro in `__config_site`. In future maybe we can consider implementing this in JS. - `__cxa_init_primary_exception` support: llvm/llvm-project#65534 introduces `__cxa_init_primary_exception` and uses this in libcxx. Like several other methods like the below in `cxa_exception.cpp`, https://github.com/emscripten-core/emscripten/blob/fbdd9249e939660cb0d20f00d6bc1897c2f3905e/system/lib/libcxxabi/src/cxa_exception.cpp#L264-L269 this new function takes a pointer to a destructor, and in Wasm destructor returns a `this` pointer, which requires `ifdef __USING_WASM_EXCEPTION__` on its signature. And that it is also used in libcxx means we need to guard some code in libcxx with `ifdef __USING_WASM_EXCEPTION__` for the signature difference, and we need to build libcxx with `-D__USING_WASM_EXCEPTIONS__` when Wasm EH is used. Also we add Emscripte EH's counterpart in `cxa_exception_emscripten.cpp` too. - This version fixes long-running misbehaviors of `operator new` llvm/llvm-project#69498 seems to fix some long-running misbhaviors of `operator new`, which in emscripten we tried to fix in emscripten-core#11079 and emscripten-core#20154. So while resolving the conflicts, I effectively reverted emscripten-core#11079 and emscripten-core#20154 in `libcxx/src/stdlib_new_delete.cpp` and `libcxx/src/new.cpp`. - Add default `__assertion_handler` to `libcxx/include`: llvm/llvm-project#77883 adds a way for vendors to override how we handle assertions. If a vendor doesn't want to override it, CMake will copy the provided [default template assertion handler file](https://github.com/llvm/llvm-project/blob/26a1d6601d727a96f4301d0d8647b5a42760ae0c/libcxx/vendor/llvm/default_assertion_handler.in) to libcxx's generated include dir, which defaults to `SYSROOT/include/c++/v1`: https://github.com/llvm/llvm-project/blob/26a1d6601d727a96f4301d0d8647b5a42760ae0c/libcxx/CMakeLists.txt#L72-L73 https://github.com/llvm/llvm-project/blob/26a1d6601d727a96f4301d0d8647b5a42760ae0c/libcxx/CMakeLists.txt#L439 https://github.com/llvm/llvm-project/blob/26a1d6601d727a96f4301d0d8647b5a42760ae0c/libcxx/include/CMakeLists.txt#L1024 We don't use CMake, so this renames the provided `vendor/llvm/default_assertion_handler.in` to `__assert_handler` in our `libcxx/include` directory, which will be copied to `SYSROOT/include/c++/v1`. - Disable `__libcpp_verbose_abort directly` in code: In emscripten-core#20707 we decided to disable `__libcpp_verbose_abort` not to incur the code size increase that brings (it brings in `vfprintf` and its family). We disabled it in adding `#define _LIBCPP_AVAILABILITY_HAS_NO_VERBOSE_ABORT` in `__config_site`. But that `_NO_` macros are gone in LLVM 18.1.2, and changing it to `#define _LIBCPP_AVAILABILITY_HAS_VERBOSE_ABORT 0` does not work because it is overridden by this line, unless we provide our own vendor annotations: https://github.com/llvm/llvm-project/blob/38f5596feda3276a8aa64fc14e074334017088ca/libcxx/include/__availability#L138 I asked about this in llvm/llvm-project#87012 and llvm/llvm-project#71002 (comment) (and more comments below) but didn't get an actionable answer yet. So this disables `__libcpp_verbose_abort` in the code directly for now, hopefully temporarily. Each fix is described in its own commit, except for the fixes required to resolve conflicts when merging our changes, which I wasn't able to commit separately. Fixes emscripten-core#21603.
On top of the main library code changes, each fix that was necessary for Emscripten was committed separately with its own commit message.