Skip to content

Update libcxx & libcxxabi to LLVM 17.0.4 #20707

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 17 commits into from
Nov 17, 2023

Conversation

aheejin
Copy link
Member

@aheejin aheejin commented Nov 14, 2023

On top of the main library code changes, each fix that was necessary for Emscripten was committed separately with its own commit message.

@aheejin aheejin requested a review from sbc100 November 14, 2023 05:06
PSTL was integrated in libc++ recently and we have to choose one out of
three PSTL modes in many places, one of them being
https://github.com/llvm/llvm-project/blob/373c343a77a7afaa07179db1754a52b620dfaf2e/libcxx/include/__algorithm/pstl_backends/cpu_backends/backend.h#L15-L23.

This is supposed to set in `__config_site`, which is supposed to be
generated from
https://github.com/llvm/llvm-project/blob/main/libcxx/include/__config_site.in#L33-L36
from
https://github.com/llvm/llvm-project/blob/main/libcxx/CMakeLists.txt.
Given that we don't use CMake system in Emscripten, we need to hard-code
this in this file. Note that this cannot be solved by just adding
`-D_LIBCPP_PSTL_CPU_BACKEND_SERIAL` in `libcxx`'s `system_lib.py` flags,
because this is needed in header files that are included in user code
as well.
`libdispatch.cpp` is related to PSTL's `LIBDISPATCH` mode, whatever that
is, which doesn't seem to be necessary for us.
`_LIBCPP_NO_EXCEPTIONS` has changed to `_LIBCPP_HAS_NO_EXCEPTIONS` in
LLVM 17, so this fixes it in Emscripten-downstream code.
The wasm code size in
`other.test_minimal_runtime_code_Size_hello_embind_val` increased by 50%
in LLVM 17 lib update, and the reason was because `__throw_***`
functions in `libcxx/include/stdexcept` started using
`_LIBCPP_VERBOSE_ABORT` in non-exception mode:
https://github.com/llvm/llvm-project/blob/beb121f6322be466384ffaa68b913f58862ed14c/libcxx/include/stdexcept#L227-L305

Note that they used to just call abort() until LLVM 16:
https://github.com/llvm/llvm-project/blob/7cbf1a2591520c2491aa35339f227775f4d3adf6/libcxx/include/stdexcept#L220-L306

And that `_LIBCPP_VERBOSE_ABORT` calls `__libcpp_verbose_abort`, in case
`_LIBCPP_AVAILABILITY_HAS_NO_VERBOSE_ABORT` is NOT defined:
https://github.com/llvm/llvm-project/blob/beb121f6322be466384ffaa68b913f58862ed14c/libcxx/include/__verbose_abort#L53

`__libcpp_verbose_abort` is this:
https://github.com/llvm/llvm-project/blob/beb121f6322be466384ffaa68b913f58862ed14c/libcxx/src/verbose_abort.cpp#L31-L75
which calls `vfprintf` and all that jazz, increasing code size a lot.
This `__libcpp_verbose_abort` apparently also existed in LLVM 16, but it
didn't seem to be used then.

So at the moment we define `_LIBCPP_AVAILABILITY_HAS_NO_VERBOSE_ABORT`
so that we don't use the new verbose version of printing.
@aheejin aheejin force-pushed the llvm_17_libcxx_libcxxabi branch from 7e99df2 to 5524101 Compare November 14, 2023 05:25
@sbc100
Copy link
Collaborator

sbc100 commented Nov 14, 2023

On top of the main library code changes, each fix that was necessary for Emscripten was committed separately with its own commit message.

Awesome! That makes it really easy to see what changed. Thanks for doing that.

Copy link
Collaborator

@sbc100 sbc100 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work!

@@ -1599,7 +1599,8 @@ class libcxx(NoExceptLibrary, MTLibrary):
'locale_win32.cpp',
'thread_win32.cpp',
'support.cpp',
'int128_builtins.cpp'
'int128_builtins.cpp',
'libdispatch.cpp'
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a tailing comma so next time we add a single file its just a one line diff?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@@ -1 +1 @@
51616
60220
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like a pretty big bump that might be worth investigating. Is print being pulled in where it wasn't before for some reason?

There are a few useful linking command line tricks for tracking down why symbols get included. For example --trace-symbol and maybe -Wl,-Map=.

Copy link
Member Author

@aheejin aheejin Nov 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah this also turned out to be the __libcpp_verbose_abort issue. We thought adding -D_LIBCPP_AVAILABILITY_HAS_NO_VERBOSE_ABORT in libcxx's system_libs.py would solve the issue, but those routines are in the headers. So these routines can be called whenever a user gives errorneous arguments to string or vector allocations, so basically any user code can pull these __libcpp_verbose_abort in. In this case libwasmfs code pulled in __libcpp_verbose_abort and thus the printf family. We can add -D_LIBCPP_AVAILABILITY_HAS_NO_VERBOSE_ABORT to libwasmfs's cflags in system_libs.py but that's just a band-aid for libwasmfs because we can't do that for all user code.

I guess we should just define it in https://github.com/emscripten-core/emscripten/blob/main/system/lib/libcxx/include/__verbose_abort. WDYT?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ping 😀

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about defining it system/lib/libcxx/include/__config_site

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@@ -1581,6 +1581,8 @@ class libcxx(NoExceptLibrary, MTLibrary):
'-DLIBCXX_BUILDING_LIBCXXABI=1',
'-D_LIBCPP_BUILDING_LIBRARY',
'-D_LIBCPP_DISABLE_VISIBILITY_ANNOTATIONS',
# This increases too much code size
'-D_LIBCPP_AVAILABILITY_HAS_NO_VERBOSE_ABORT',
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this be removed now that its in the config_site?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah right, forgot to do that.

@aheejin aheejin merged commit 97f37e1 into emscripten-core:main Nov 17, 2023
@aheejin aheejin deleted the llvm_17_libcxx_libcxxabi branch November 17, 2023 04:13
aheejin added a commit to aheejin/emscripten that referenced this pull request Mar 29, 2024
In emscripten-core#20707 we decided to disable `__libcpp_verbose_abort` not to incur
the code size increase that brings (it brings in `vfprintf` and its
family). We disabled it in adding
`#define _LIBCPP_AVAILABILITY_HAS_NO_VERBOSE_ABORT` in `__config_site`.

That `_NO_` macros are gone in LLVM 18.1.2, so I changed it to
`#define _LIBCPP_AVAILABILITY_HAS_VERBOSE_ABORT 0`
emscripten-core@2bd7c67
But apparently, when including `__availability`, that setting is
overridden by this line:
https://github.com/llvm/llvm-project/blob/38f5596feda3276a8aa64fc14e074334017088ca/libcxx/include/__availability#L138
because we don't define our own vendor availability annotations
https://github.com/emscripten-core/emscripten/blob/193b2bff980529dbf5d067f01761ed8a6e41189f/system/lib/libcxx/include/__config_site#L4

I asked about this in
llvm/llvm-project#87012 and
llvm/llvm-project#71002 (comment)
(and more comments below)
but didn't get an actionable answer yet.

This disables `__libcpp_verbose_abort` in the code directly for now,
hopefully temporarily.
aheejin added a commit that referenced this pull request Apr 3, 2024
This updates libcxx and libcxxabi to LLVM 18.1.2. Due to
llvm/llvm-project#65534 we need to update both
at the same time.

Things to note:

- Disable hardening mode:
  LLVM 18.1.2 adds support for "Hardening Modes" to libcxx
  (https://libcxx.llvm.org/Hardening.html). There are four modes: none,
  fast, extensive and debug. Different hardening modes make different
  trade-offs between the amount of checking and runtime performance. We
  for now disable it (i.e., set it to 'none') so that we don't have any
  effects on the performance. We can consider enabling it when we ever
  get to enable the debug version of libcxx.

- Disable C++20 time zone support:
  LLVM 18.1.2 adds C++20 time zone support
  (https://libcxx.llvm.org/DesignDocs/TimeZone.html) to libcxx, which
  requires access IANA Time Zone Database. Currently it seems it only
  supports Linux:
  https://github.com/llvm/llvm-project/blob/26a1d6601d727a96f4301d0d8647b5a42760ae0c/libcxx/src/tz.cpp#L45-L49
  So this excludes the two source files from build (which is done via
  `CMakeLists.txt` in the upstream LLVM) and sets
  `_LIBCPP_HAS_NO_TIME_ZONE_DATABASE` macro in `__config_site`. In
  future maybe we can consider implementing this in JS.

- `__cxa_init_primary_exception` support:
  llvm/llvm-project#65534 introduces
  `__cxa_init_primary_exception` and uses this in libcxx. Like several
  other methods like the below in `cxa_exception.cpp`,
  https://github.com/emscripten-core/emscripten/blob/fbdd9249e939660cb0d20f00d6bc1897c2f3905e/system/lib/libcxxabi/src/cxa_exception.cpp#L264-L269
  this new function takes a pointer to a destructor, and in Wasm
  destructor returns a `this` pointer, which requires `ifdef
  __USING_WASM_EXCEPTION__` on its signature. And that it is also used
  in libcxx means we need to guard some code in libcxx with `ifdef
  __USING_WASM_EXCEPTION__` for the signature difference, and we need to
  build libcxx with `-D__USING_WASM_EXCEPTIONS__` when Wasm EH is used.
  Also we add Emscripte EH's counterpart in
  `cxa_exception_emscripten.cpp` too.

- This version fixes long-running misbehaviors of `operator new`
  llvm/llvm-project#69498 seems to fix some
  long-running misbhaviors of `operator new`, which in emscripten we
  tried to fix in #11079 and #20154. So while resolving the conflicts, I
  effectively reverted #11079 and #20154 in
  `libcxx/src/stdlib_new_delete.cpp` and `libcxx/src/new.cpp`.

- Add default `__assertion_handler` to `libcxx/include`:
  llvm/llvm-project#77883 adds a way for vendors
  to override how we handle assertions. If a vendor doesn't want to
  override it, CMake will copy the provided [default template assertion
  handler
  file](https://github.com/llvm/llvm-project/blob/26a1d6601d727a96f4301d0d8647b5a42760ae0c/libcxx/vendor/llvm/default_assertion_handler.in)
  to libcxx's generated include dir, which defaults to
  `SYSROOT/include/c++/v1`:
  https://github.com/llvm/llvm-project/blob/26a1d6601d727a96f4301d0d8647b5a42760ae0c/libcxx/CMakeLists.txt#L72-L73
  https://github.com/llvm/llvm-project/blob/26a1d6601d727a96f4301d0d8647b5a42760ae0c/libcxx/CMakeLists.txt#L439
  https://github.com/llvm/llvm-project/blob/26a1d6601d727a96f4301d0d8647b5a42760ae0c/libcxx/include/CMakeLists.txt#L1024
  We don't use CMake, so this renames the provided
  `vendor/llvm/default_assertion_handler.in` to `__assert_handler` in
  our `libcxx/include` directory, which will be copied to
  `SYSROOT/include/c++/v1`.

- Disable `__libcpp_verbose_abort directly` in code:
  In #20707 we decided to disable `__libcpp_verbose_abort` not to incur
  the code size increase that brings (it brings in `vfprintf` and its
  family). We disabled it in adding
  `#define _LIBCPP_AVAILABILITY_HAS_NO_VERBOSE_ABORT` in
  `__config_site`. But that `_NO_` macros are gone in LLVM 18.1.2, and
  changing it to `#define _LIBCPP_AVAILABILITY_HAS_VERBOSE_ABORT 0` does
  not work because it is overridden by this line, unless we provide our
  own vendor annotations:
  https://github.com/llvm/llvm-project/blob/38f5596feda3276a8aa64fc14e074334017088ca/libcxx/include/__availability#L138
  I asked about this in
  llvm/llvm-project#87012 and
  llvm/llvm-project#71002 (comment)
  (and more comments below)
  but didn't get an actionable answer yet. So this disables
  `__libcpp_verbose_abort` in the code directly for now, hopefully
  temporarily.

Each fix is described in its own commit, except for the fixes required
to resolve conflicts when merging our changes, which I wasn't able to
commit separately.

Fixes #21603.
impact-maker pushed a commit to impact-maker/emscripten that referenced this pull request Apr 11, 2024
This updates libcxx and libcxxabi to LLVM 18.1.2. Due to
llvm/llvm-project#65534 we need to update both
at the same time.

Things to note:

- Disable hardening mode:
  LLVM 18.1.2 adds support for "Hardening Modes" to libcxx
  (https://libcxx.llvm.org/Hardening.html). There are four modes: none,
  fast, extensive and debug. Different hardening modes make different
  trade-offs between the amount of checking and runtime performance. We
  for now disable it (i.e., set it to 'none') so that we don't have any
  effects on the performance. We can consider enabling it when we ever
  get to enable the debug version of libcxx.

- Disable C++20 time zone support:
  LLVM 18.1.2 adds C++20 time zone support
  (https://libcxx.llvm.org/DesignDocs/TimeZone.html) to libcxx, which
  requires access IANA Time Zone Database. Currently it seems it only
  supports Linux:
  https://github.com/llvm/llvm-project/blob/26a1d6601d727a96f4301d0d8647b5a42760ae0c/libcxx/src/tz.cpp#L45-L49
  So this excludes the two source files from build (which is done via
  `CMakeLists.txt` in the upstream LLVM) and sets
  `_LIBCPP_HAS_NO_TIME_ZONE_DATABASE` macro in `__config_site`. In
  future maybe we can consider implementing this in JS.

- `__cxa_init_primary_exception` support:
  llvm/llvm-project#65534 introduces
  `__cxa_init_primary_exception` and uses this in libcxx. Like several
  other methods like the below in `cxa_exception.cpp`,
  https://github.com/emscripten-core/emscripten/blob/fbdd9249e939660cb0d20f00d6bc1897c2f3905e/system/lib/libcxxabi/src/cxa_exception.cpp#L264-L269
  this new function takes a pointer to a destructor, and in Wasm
  destructor returns a `this` pointer, which requires `ifdef
  __USING_WASM_EXCEPTION__` on its signature. And that it is also used
  in libcxx means we need to guard some code in libcxx with `ifdef
  __USING_WASM_EXCEPTION__` for the signature difference, and we need to
  build libcxx with `-D__USING_WASM_EXCEPTIONS__` when Wasm EH is used.
  Also we add Emscripte EH's counterpart in
  `cxa_exception_emscripten.cpp` too.

- This version fixes long-running misbehaviors of `operator new`
  llvm/llvm-project#69498 seems to fix some
  long-running misbhaviors of `operator new`, which in emscripten we
  tried to fix in emscripten-core#11079 and emscripten-core#20154. So while resolving the conflicts, I
  effectively reverted emscripten-core#11079 and emscripten-core#20154 in
  `libcxx/src/stdlib_new_delete.cpp` and `libcxx/src/new.cpp`.

- Add default `__assertion_handler` to `libcxx/include`:
  llvm/llvm-project#77883 adds a way for vendors
  to override how we handle assertions. If a vendor doesn't want to
  override it, CMake will copy the provided [default template assertion
  handler
  file](https://github.com/llvm/llvm-project/blob/26a1d6601d727a96f4301d0d8647b5a42760ae0c/libcxx/vendor/llvm/default_assertion_handler.in)
  to libcxx's generated include dir, which defaults to
  `SYSROOT/include/c++/v1`:
  https://github.com/llvm/llvm-project/blob/26a1d6601d727a96f4301d0d8647b5a42760ae0c/libcxx/CMakeLists.txt#L72-L73
  https://github.com/llvm/llvm-project/blob/26a1d6601d727a96f4301d0d8647b5a42760ae0c/libcxx/CMakeLists.txt#L439
  https://github.com/llvm/llvm-project/blob/26a1d6601d727a96f4301d0d8647b5a42760ae0c/libcxx/include/CMakeLists.txt#L1024
  We don't use CMake, so this renames the provided
  `vendor/llvm/default_assertion_handler.in` to `__assert_handler` in
  our `libcxx/include` directory, which will be copied to
  `SYSROOT/include/c++/v1`.

- Disable `__libcpp_verbose_abort directly` in code:
  In emscripten-core#20707 we decided to disable `__libcpp_verbose_abort` not to incur
  the code size increase that brings (it brings in `vfprintf` and its
  family). We disabled it in adding
  `#define _LIBCPP_AVAILABILITY_HAS_NO_VERBOSE_ABORT` in
  `__config_site`. But that `_NO_` macros are gone in LLVM 18.1.2, and
  changing it to `#define _LIBCPP_AVAILABILITY_HAS_VERBOSE_ABORT 0` does
  not work because it is overridden by this line, unless we provide our
  own vendor annotations:
  https://github.com/llvm/llvm-project/blob/38f5596feda3276a8aa64fc14e074334017088ca/libcxx/include/__availability#L138
  I asked about this in
  llvm/llvm-project#87012 and
  llvm/llvm-project#71002 (comment)
  (and more comments below)
  but didn't get an actionable answer yet. So this disables
  `__libcpp_verbose_abort` in the code directly for now, hopefully
  temporarily.

Each fix is described in its own commit, except for the fixes required
to resolve conflicts when merging our changes, which I wasn't able to
commit separately.

Fixes emscripten-core#21603.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants