Skip to content

[Bazel] Make LLVM and Clang config headers configurable #126729

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

aaronmondal
Copy link
Member

@aaronmondal aaronmondal commented Feb 11, 2025

This change implements the following flags and provides the
infrastructure for future additions:

bazel query 'kind(".*_flag", @llvm-project//config:all)'
@llvm-project//config:LLVM_ENABLE_BACKTRACES
@llvm-project//config:LLVM_ENABLE_CRASH_DUMPS
@llvm-project//config:LLVM_ENABLE_CRASH_OVERRIDES
@llvm-project//config:LLVM_ENABLE_DEBUGLOC_COVERAGE_TRACKING
@llvm-project//config:LLVM_ENABLE_PLUGINS
@llvm-project//config:LLVM_ENABLE_ZLIB
@llvm-project//config:LLVM_ENABLE_ZSTD
@llvm-project//config:LLVM_HAS_LOGF128
@llvm-project//config:LLVM_WINDOWS_PREFER_FORWARD_SLASH

For instance:

bazel build ... \
    --@llvm-project//config:LLVM_ENABLE_BACKTRACES=false

The following flags have been moved to mirror CMake and simplify
migration to bzlmod:

@llvm_zlib//:llvm_enable_zlib
-> @llvm-project//config:LLVM_ENABLE_ZLIB

@llvm_zstd//:llvm_enable_zstd
-> @llvm-project//config:LLVM_ENABLE_ZSTD

The overlay now has platform definitions at @llvm-project//platform.
You can use these to cross compile the config headers to see how they'd
look like on other systems. For instance:

bazel build @llvm-project//llvm:config_h \
    --platforms=@llvm-project//platform:aarch64-unknown-linux-gnu

bazel build @llvm-project//llvm:llvm-config_h \
    --platforms=@llvm-project//platform:x86_64-pc-win32

The new implementation uses the original CMake templates as source of
truth. This makes it easier to detect and prevent drift as unhandled
substitutions tend to turn into compiler errors.

Saves 16 defines on windows and ~35 defines on other platforms which
previously leaked into the command lines of all targets that depended
transitively on the llvm configuration. For a full build of the overlay
this amounts to ~380k fewer defines.

@llvmbot llvmbot added the bazel "Peripheral" support tier build system: utils/bazel label Feb 11, 2025
Copy link

github-actions bot commented Feb 11, 2025

✅ With the latest revision this PR passed the Python code formatter.

Copy link

github-actions bot commented Feb 11, 2025

✅ With the latest revision this PR passed the undef deprecator.

Copy link
Contributor

@chsigg chsigg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks much cleaner, thank you very much!

Copy link
Contributor

@chapuni chapuni left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great at glance but let me take a look and confirm this weekend.

@aaronmondal aaronmondal changed the title [Bazel] Make LLVM config headers configurable [Bazel] Make LLVM and Clang config headers configurable Feb 14, 2025
@aaronmondal aaronmondal force-pushed the bazel-configure branch 2 times, most recently from c15640a to 74bf25d Compare February 14, 2025 12:39
@rupprecht
Copy link
Collaborator

We tried to do something like this a couple years ago and it landed as e2ee8bf, but was reverted as 1729660. I looked at the internal bug and it seems the need to revert no longer applies so much, but we never got around to relanding it.

It's honestly impressive you got this working in bazel at all, because the language is extremely limited for this kind of thing. AFAIK there isn't a way to have bazel itself open a file and process it, it can just do expand_template with a dict. That's why you have to manually specify ENABLE_BACKTRACES with the cmakedefine01 helper, but HAVE_BACKTRACE with the different cmakedefine helper; whereas with something that shells out to a script can just pass ["ENABLE_BACKTRACES=1", "HAVE_BACKTRACE=1"] and let the script take care of the difference between #cmakedefine01 and #cmakedefine in the config.h.cmake template.

By default, llvm/utils/gn/build/write_cmake_config.py is a reasonable first pick for such a script, but it doesn't need to be -- we could write a new one, or even cmake -P kinda works, if we can rely on cmake existing on the build host.

Copy link
Contributor

@chapuni chapuni left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've confirmed this works for me with a few fixes.

@MaskRay
Copy link
Member

MaskRay commented Feb 15, 2025

Making it configurable is definitely an improvement:) I am not familiar with Bazel.

I'd recommend getting explicit approval from @rupprecht.

@aaronmondal
Copy link
Member Author

whereas with something that shells out to a script can just pass ["ENABLE_BACKTRACES=1", "HAVE_BACKTRACE=1"] and let the script take care of the difference between #cmakedefine01 and #cmakedefine in the config.h.cmake template.

By default, llvm/utils/gn/build/write_cmake_config.py is a reasonable first pick for such a script, but it doesn't need to be -- we could write a new one, or even cmake -P kinda works, if we can rely on cmake existing on the build host.

I wonder whether it's feasible to automate the generation of the config.bzl file and all its selects via an offline script. I'm not sure how hard it would be to do so, but I'm thinking something like a script that runs cmake -P on a bunch of emulators. This could then be used as a diff (or should I say drift 😆) test for the config targets to keep it perfectly in sync.

For the actual config.bzl the issue with a python (or cmake) script is that it would increase the closure size of these targets by quite a bit.

At the moment the entire clang build can get by without python on a remote executor with this patch: https://github.com/eomii/rules_ll/blob/main/patches/llvm-project-bundle-with-bash.diff

If we used python (or cmake) we'd have to manage additional (potentially cross-)toolchains for all the various platforms. At some point (e.g when lit tests need to run) we'll need it, but it's nice if we can shift this to as late in the build graph as possible. This way the workers that usually only act as exec platform for cc_toolchains often get by without ever fetching python.

Another nice effect of deferring toolchain requirements in the build graph is that it reduces their "hotness" in a multi-layer cache systems. The less often we need to touch a toolchain artifact, the more space remains for the real build artifacts.

This change implements the following flags and provides the
infrastructure for future additions:

    bazel query 'kind(".*_flag", @llvm-project//config:all)'
    @llvm-project//config:LLVM_ENABLE_BACKTRACES
    @llvm-project//config:LLVM_ENABLE_CRASH_DUMPS
    @llvm-project//config:LLVM_ENABLE_CRASH_OVERRIDES
    @llvm-project//config:LLVM_ENABLE_DEBUGLOC_COVERAGE_TRACKING
    @llvm-project//config:LLVM_ENABLE_PLUGINS
    @llvm-project//config:LLVM_ENABLE_ZLIB
    @llvm-project//config:LLVM_ENABLE_ZSTD
    @llvm-project//config:LLVM_HAS_LOGF128
    @llvm-project//config:LLVM_WINDOWS_PREFER_FORWARD_SLASH

For instance:

    bazel build ... \
        --@llvm-project//config:LLVM_ENABLE_BACKTRACES=false

The following flags have been moved to mirror CMake and simplify
migration to bzlmod:

    @llvm_zlib//:llvm_enable_zlib
    -> @llvm-project//config:LLVM_ENABLE_ZLIB

    @llvm_zstd//:llvm_enable_zstd
    -> @llvm-project//config:LLVM_ENABLE_ZSTD

The overlay now has platform definitions at `@llvm-project//platform`.
You can use these to cross compile the config headers to see how they'd
look like on other systems. For instance:

    bazel build @llvm-project//llvm:config_h \
        --platforms=@llvm-project//platform:aarch64-unknown-linux-gnu

    bazel build @llvm-project//llvm:llvm-config_h \
        --platforms=@llvm-project//platform:x86_64-pc-windows

The new implementation uses the original CMake templates as source of
truth. This makes it easier to detect and prevent drift as unhandled
substitutions tend to turn into compiler errors.

Saves 16 defines on windows and ~35 defines on other platforms which
previously leaked into the command lines of all targets that depended
transitively on the llvm configuration. For a full build of the overlay
this amounts to ~380k fewer defines.
@maxbartel
Copy link
Contributor

@rupprecht I think people are waiting for another review from you, could you please to that? The bazelmod transition is also waiting on this PR

@nickdesaulniers
Copy link
Member

ping @rupprecht ; it seems that #88927 (#55924) is also hung up on this.

Also looks like it needs a rebase as there's a merge conflict.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bazel "Peripheral" support tier build system: utils/bazel
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants