-
Notifications
You must be signed in to change notification settings - Fork 13.4k
[libc++] Rewrite the transitive header checking machinery #110554
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[libc++] Rewrite the transitive header checking machinery #110554
Conversation
@llvm/pr-subscribers-libcxx Author: Louis Dionne (ldionne) ChangesFull diff: https://github.com/llvm/llvm-project/pull/110554.diff 4 Files Affected:
diff --git a/.github/workflows/libcxx-build-and-test.yaml b/.github/workflows/libcxx-build-and-test.yaml
index b5e60781e00064..831db85ca8e5c2 100644
--- a/.github/workflows/libcxx-build-and-test.yaml
+++ b/.github/workflows/libcxx-build-and-test.yaml
@@ -49,7 +49,9 @@ env:
jobs:
stage1:
if: github.repository_owner == 'llvm'
- runs-on: libcxx-runners-8-set
+ runs-on: ubuntu-latest
+ container:
+ image: ghcr.io/libcxx/actions-builder:fozzie-live
continue-on-error: false
strategy:
fail-fast: false
diff --git a/libcxx/test/libcxx/transitive_includes.gen.py b/libcxx/test/libcxx/transitive_includes.gen.py
index 22075364bf1b79..f9a868128c6c83 100644
--- a/libcxx/test/libcxx/transitive_includes.gen.py
+++ b/libcxx/test/libcxx/transitive_includes.gen.py
@@ -55,7 +55,7 @@
print(
f"""\
-// RUN: %{{python}} %{{libcxx-dir}}/test/libcxx/transitive_includes_to_csv.py {' '.join(all_traces)} > %{{libcxx-dir}}/test/libcxx/transitive_includes/%{{cxx_std}}.csv
+// RUN: %{{python}} %{{libcxx-dir}}/test/libcxx/transitive_includes/to_csv.py {' '.join(all_traces)} > %{{libcxx-dir}}/test/libcxx/transitive_includes/%{{cxx_std}}.csv
"""
)
@@ -64,9 +64,6 @@
if header.endswith(".h"): # Skip C compatibility or detail headers
continue
- # Escape slashes for the awk command below
- escaped_header = header.replace("/", "\\/")
-
print(
f"""\
//--- {header}.sh.cpp
@@ -92,8 +89,8 @@
// RUN: mkdir %t
// RUN: %{{cxx}} %s %{{flags}} %{{compile_flags}} --trace-includes -fshow-skipped-includes --preprocess > /dev/null 2> %t/trace-includes.txt
-// RUN: %{{python}} %{{libcxx-dir}}/test/libcxx/transitive_includes_to_csv.py %t/trace-includes.txt > %t/actual_transitive_includes.csv
-// RUN: cat %{{libcxx-dir}}/test/libcxx/transitive_includes/%{{cxx_std}}.csv | awk '/^{escaped_header} / {{ print }}' > %t/expected_transitive_includes.csv
+// RUN: %{{python}} %{{libcxx-dir}}/test/libcxx/transitive_includes/to_csv.py %t/trace-includes.txt > %t/actual_transitive_includes.csv
+// RUN: %{{python}} %{{libcxx-dir}}/test/libcxx/transitive_includes/expected.py %{{cxx_std}} "{header}" > %t/expected_transitive_includes.csv
// RUN: diff -w %t/expected_transitive_includes.csv %t/actual_transitive_includes.csv
#include <{header}>
"""
diff --git a/libcxx/test/libcxx/transitive_includes/expected.py b/libcxx/test/libcxx/transitive_includes/expected.py
new file mode 100755
index 00000000000000..b2519f36b3974b
--- /dev/null
+++ b/libcxx/test/libcxx/transitive_includes/expected.py
@@ -0,0 +1,35 @@
+#!/usr/bin/env python
+# ===----------------------------------------------------------------------===##
+#
+# Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+# See https://llvm.org/LICENSE.txt for license information.
+# SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+#
+# ===----------------------------------------------------------------------===##
+
+import argparse
+import os
+
+
+if __name__ == "__main__":
+ parser = argparse.ArgumentParser(
+ description="""Extract the list of expected transitive includes for the given Standard and header.""",
+ )
+ parser.add_argument(
+ "standard",
+ default=None,
+ choices=["cxx03", "cxx11", "cxx14", "cxx20", "cxx23", "cxx26"],
+ )
+ parser.add_argument(
+ "header",
+ default=None,
+ help="The header to extract the expected transitive includes for."
+ )
+ args = parser.parse_args()
+
+ CSV_ROOT = os.path.dirname(__file__)
+ filename = os.path.join(CSV_ROOT, f"{args.standard}.csv")
+ with open(filename, 'r') as f:
+ for line in f.readlines():
+ if line.startswith(args.header + ' '):
+ print(line, end='') # lines already end in newline
diff --git a/libcxx/test/libcxx/transitive_includes_to_csv.py b/libcxx/test/libcxx/transitive_includes/to_csv.py
similarity index 100%
rename from libcxx/test/libcxx/transitive_includes_to_csv.py
rename to libcxx/test/libcxx/transitive_includes/to_csv.py
|
@llvm/pr-subscribers-github-workflow Author: Louis Dionne (ldionne) ChangesFull diff: https://github.com/llvm/llvm-project/pull/110554.diff 4 Files Affected:
diff --git a/.github/workflows/libcxx-build-and-test.yaml b/.github/workflows/libcxx-build-and-test.yaml
index b5e60781e00064..831db85ca8e5c2 100644
--- a/.github/workflows/libcxx-build-and-test.yaml
+++ b/.github/workflows/libcxx-build-and-test.yaml
@@ -49,7 +49,9 @@ env:
jobs:
stage1:
if: github.repository_owner == 'llvm'
- runs-on: libcxx-runners-8-set
+ runs-on: ubuntu-latest
+ container:
+ image: ghcr.io/libcxx/actions-builder:fozzie-live
continue-on-error: false
strategy:
fail-fast: false
diff --git a/libcxx/test/libcxx/transitive_includes.gen.py b/libcxx/test/libcxx/transitive_includes.gen.py
index 22075364bf1b79..f9a868128c6c83 100644
--- a/libcxx/test/libcxx/transitive_includes.gen.py
+++ b/libcxx/test/libcxx/transitive_includes.gen.py
@@ -55,7 +55,7 @@
print(
f"""\
-// RUN: %{{python}} %{{libcxx-dir}}/test/libcxx/transitive_includes_to_csv.py {' '.join(all_traces)} > %{{libcxx-dir}}/test/libcxx/transitive_includes/%{{cxx_std}}.csv
+// RUN: %{{python}} %{{libcxx-dir}}/test/libcxx/transitive_includes/to_csv.py {' '.join(all_traces)} > %{{libcxx-dir}}/test/libcxx/transitive_includes/%{{cxx_std}}.csv
"""
)
@@ -64,9 +64,6 @@
if header.endswith(".h"): # Skip C compatibility or detail headers
continue
- # Escape slashes for the awk command below
- escaped_header = header.replace("/", "\\/")
-
print(
f"""\
//--- {header}.sh.cpp
@@ -92,8 +89,8 @@
// RUN: mkdir %t
// RUN: %{{cxx}} %s %{{flags}} %{{compile_flags}} --trace-includes -fshow-skipped-includes --preprocess > /dev/null 2> %t/trace-includes.txt
-// RUN: %{{python}} %{{libcxx-dir}}/test/libcxx/transitive_includes_to_csv.py %t/trace-includes.txt > %t/actual_transitive_includes.csv
-// RUN: cat %{{libcxx-dir}}/test/libcxx/transitive_includes/%{{cxx_std}}.csv | awk '/^{escaped_header} / {{ print }}' > %t/expected_transitive_includes.csv
+// RUN: %{{python}} %{{libcxx-dir}}/test/libcxx/transitive_includes/to_csv.py %t/trace-includes.txt > %t/actual_transitive_includes.csv
+// RUN: %{{python}} %{{libcxx-dir}}/test/libcxx/transitive_includes/expected.py %{{cxx_std}} "{header}" > %t/expected_transitive_includes.csv
// RUN: diff -w %t/expected_transitive_includes.csv %t/actual_transitive_includes.csv
#include <{header}>
"""
diff --git a/libcxx/test/libcxx/transitive_includes/expected.py b/libcxx/test/libcxx/transitive_includes/expected.py
new file mode 100755
index 00000000000000..b2519f36b3974b
--- /dev/null
+++ b/libcxx/test/libcxx/transitive_includes/expected.py
@@ -0,0 +1,35 @@
+#!/usr/bin/env python
+# ===----------------------------------------------------------------------===##
+#
+# Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+# See https://llvm.org/LICENSE.txt for license information.
+# SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+#
+# ===----------------------------------------------------------------------===##
+
+import argparse
+import os
+
+
+if __name__ == "__main__":
+ parser = argparse.ArgumentParser(
+ description="""Extract the list of expected transitive includes for the given Standard and header.""",
+ )
+ parser.add_argument(
+ "standard",
+ default=None,
+ choices=["cxx03", "cxx11", "cxx14", "cxx20", "cxx23", "cxx26"],
+ )
+ parser.add_argument(
+ "header",
+ default=None,
+ help="The header to extract the expected transitive includes for."
+ )
+ args = parser.parse_args()
+
+ CSV_ROOT = os.path.dirname(__file__)
+ filename = os.path.join(CSV_ROOT, f"{args.standard}.csv")
+ with open(filename, 'r') as f:
+ for line in f.readlines():
+ if line.startswith(args.header + ' '):
+ print(line, end='') # lines already end in newline
diff --git a/libcxx/test/libcxx/transitive_includes_to_csv.py b/libcxx/test/libcxx/transitive_includes/to_csv.py
similarity index 100%
rename from libcxx/test/libcxx/transitive_includes_to_csv.py
rename to libcxx/test/libcxx/transitive_includes/to_csv.py
|
You can test this locally with the following command:darker --check --diff -r a4367d2d136420f562f64e7731b9393fb609f3fc...c43371183ac3d0aef9f8fc66fdc1a4ec2d16deff libcxx/test/libcxx/transitive_includes/to_csv.py libcxx/test/libcxx/header_inclusions.gen.py libcxx/test/libcxx/headers_in_modulemap.sh.py libcxx/test/libcxx/transitive_includes.gen.py libcxx/utils/generate_iwyu_mapping.py libcxx/utils/generate_libcxx_cppm_in.py libcxx/utils/libcxx/header_information.py View the diff from darker here.--- test/libcxx/header_inclusions.gen.py 2024-10-16 23:54:58.000000 +0000
+++ test/libcxx/header_inclusions.gen.py 2024-10-17 00:07:22.691172 +0000
@@ -10,34 +10,43 @@
# prescribed by the Standard.
# RUN: %{python} %s %{libcxx-dir}/utils
import sys
+
sys.path.append(sys.argv[1])
-from libcxx.header_information import lit_header_restrictions, public_headers, mandatory_inclusions
+from libcxx.header_information import (
+ lit_header_restrictions,
+ public_headers,
+ mandatory_inclusions,
+)
for header in public_headers:
- header_guard = lambda h: f"_LIBCPP_{str(h).upper().replace('.', '_').replace('/', '_')}"
+ header_guard = (
+ lambda h: f"_LIBCPP_{str(h).upper().replace('.', '_').replace('/', '_')}"
+ )
- # <cassert> has no header guards
- if header == 'cassert':
- checks = ''
- else:
- checks = f'''
+ # <cassert> has no header guards
+ if header == "cassert":
+ checks = ""
+ else:
+ checks = f"""
#ifndef {header_guard(header)}
# error <{header}> was expected to define a header guard {header_guard(header)}
#endif
-'''
- for includee in mandatory_inclusions.get(header, []):
- checks += f'''
+"""
+ for includee in mandatory_inclusions.get(header, []):
+ checks += f"""
#ifndef {header_guard(includee)}
# error <{header}> was expected to include <{includee}>
#endif
-'''
+"""
- print(f"""\
+ print(
+ f"""\
//--- {header}.compile.pass.cpp
{lit_header_restrictions.get(header, '')}
#include <{header}>
{checks}
-""")
+"""
+ )
--- test/libcxx/headers_in_modulemap.sh.py 2024-10-16 23:59:35.000000 +0000
+++ test/libcxx/headers_in_modulemap.sh.py 2024-10-17 00:07:22.711775 +0000
@@ -1,8 +1,9 @@
# RUN: %{python} %s %{libcxx-dir}/utils
import sys
+
sys.path.append(sys.argv[1])
from libcxx.header_information import all_headers, libcxx_include
with open(libcxx_include / "module.modulemap") as f:
modulemap = f.read()
--- test/libcxx/transitive_includes/to_csv.py 2024-10-16 23:54:58.000000 +0000
+++ test/libcxx/transitive_includes/to_csv.py 2024-10-17 00:07:22.745144 +0000
@@ -14,13 +14,16 @@
import os
import pathlib
import re
import sys
-libcxx_root = os.path.dirname(os.path.dirname(os.path.dirname(os.path.dirname(os.path.abspath(__file__)))))
+libcxx_root = os.path.dirname(
+ os.path.dirname(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
+)
sys.path.append(os.path.join(libcxx_root, "utils"))
from libcxx.header_information import Header
+
def parse_line(line: str) -> Tuple[int, str]:
"""
Parse a single line of --trace-includes output.
@@ -31,10 +34,11 @@
raise ArgumentError(f"Line {line} contains invalid data.")
# The number of periods in front of the header name is the nesting level of
# that header.
return (len(match.group(1)), match.group(2))
+
def make_cxx_v1_relative(header: str) -> Optional[str]:
"""
Returns the path of the header as relative to <whatever>/c++/v1, or None if the path
doesn't contain c++/v1.
@@ -51,10 +55,11 @@
match = re.match(CXX_V1_REGEX, header)
if not match:
return None
else:
return match.group(1)
+
def parse_file(file: io.TextIOBase) -> List[Tuple[Header, Header]]:
"""
Parse a file containing --trace-includes output to generate a list of the
transitive includes contained in it.
@@ -79,25 +84,27 @@
else:
assert includer is not None
result.append((includer, Header(relative)))
return result
+
def print_csv(includes: List[Tuple[Header, Header]]) -> None:
"""
Print the transitive includes as space-delimited CSV.
This function only prints public libc++ headers that are not C compatibility headers.
"""
# Sort and group by includer
by_includer = lambda t: t[0]
includes = itertools.groupby(sorted(includes, key=by_includer), key=by_includer)
- for (includer, includees) in includes:
+ for includer, includees in includes:
includees = map(lambda t: t[1], includees)
for h in sorted(set(includees)):
if h.is_public() and not h.is_C_compatibility():
print(f"{includer} {h}")
+
def main(argv):
parser = argparse.ArgumentParser(
description="""
Given a list of headers produced by --trace-includes, produce a list of libc++ headers in that output.
@@ -106,15 +113,22 @@
information for this script to run.
The output of this script is provided in space-delimited CSV format where each line contains:
<header performing inclusion> <header being included>
- """)
- parser.add_argument("inputs", type=argparse.FileType("r"), nargs='+', default=None,
- help="One or more files containing the result of --trace-includes")
+ """
+ )
+ parser.add_argument(
+ "inputs",
+ type=argparse.FileType("r"),
+ nargs="+",
+ default=None,
+ help="One or more files containing the result of --trace-includes",
+ )
args = parser.parse_args(argv)
includes = [line for file in args.inputs for line in parse_file(file)]
print_csv(includes)
+
if __name__ == "__main__":
main(sys.argv[1:])
--- utils/generate_libcxx_cppm_in.py 2024-10-16 23:54:58.000000 +0000
+++ utils/generate_libcxx_cppm_in.py 2024-10-17 00:07:22.841463 +0000
@@ -7,11 +7,17 @@
# ===----------------------------------------------------------------------===##
import os.path
import sys
-from libcxx.header_information import module_c_headers, module_headers, header_restrictions, headers_not_available, libcxx_root
+from libcxx.header_information import (
+ module_c_headers,
+ module_headers,
+ header_restrictions,
+ headers_not_available,
+ libcxx_root,
+)
def write_file(module):
with open(libcxx_root / "modules" / f"{module}.cppm.in", "w") as module_cpp_in:
module_cpp_in.write(
--- utils/libcxx/header_information.py 2024-10-17 00:02:30.000000 +0000
+++ utils/libcxx/header_information.py 2024-10-17 00:07:22.925490 +0000
@@ -6,22 +6,26 @@
#
# ===----------------------------------------------------------------------===##
import os, pathlib, functools
-libcxx_root = pathlib.Path(os.path.dirname(os.path.dirname(os.path.dirname(os.path.abspath(__file__)))))
+libcxx_root = pathlib.Path(
+ os.path.dirname(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
+)
libcxx_include = libcxx_root / "include"
assert libcxx_root.exists()
+
def _is_header_file(file):
"""Returns whether the given file is a header file, i.e. not a directory or the modulemap file."""
return not file.is_dir() and not file.name in [
"module.modulemap",
"CMakeLists.txt",
"libcxx.imp",
"__config_site.in",
]
+
@functools.total_ordering
class Header:
_name: str
"""Relative path from the root of libcxx/include"""
@@ -91,11 +95,16 @@
These headers are all C++23-and-later headers, excluding C compatibility headers and
experimental headers.
"""
# These headers have been removed in C++20 so are never part of a module.
removed_in_20 = ["ccomplex", "ciso646", "cstdbool", "ctgmath"]
- return self.is_public() and not self.is_experimental() and not self.is_C_compatibility() and not self._name in removed_in_20
+ return (
+ self.is_public()
+ and not self.is_experimental()
+ and not self.is_C_compatibility()
+ and not self._name in removed_in_20
+ )
def is_in_modulemap(self) -> bool:
"""Returns whether a header should be listed in the modulemap."""
# TODO: Should `__config_site` be in the modulemap?
if self._name == "__config_site":
@@ -139,35 +148,47 @@
def __hash__(self) -> int:
return hash(self._name)
# Commonly-used sets of headers
-all_headers = [Header(p.relative_to(libcxx_include).as_posix()) for p in libcxx_include.rglob("[_a-z]*") if _is_header_file(p)]
-all_headers += [Header("__config_site"), Header("__assertion_handler")] # Headers generated during the build process
+all_headers = [
+ Header(p.relative_to(libcxx_include).as_posix())
+ for p in libcxx_include.rglob("[_a-z]*")
+ if _is_header_file(p)
+]
+all_headers += [
+ Header("__config_site"),
+ Header("__assertion_handler"),
+] # Headers generated during the build process
public_headers = [h for h in all_headers if h.is_public()]
module_headers = [h for h in all_headers if h.has_cxx20_module()]
module_c_headers = [h for h in all_headers if h.has_cxx20_module() and h.is_cstd()]
# These headers are not yet implemented in libc++
#
# These headers are required by the latest (draft) Standard but have not been
# implemented yet. They are used in the generated module input. The C++23 standard
# modules will fail to build if a header is added but this list is not updated.
-headers_not_available = list(map(Header, [
- "debugging",
- "flat_map",
- "flat_set",
- "generator",
- "hazard_pointer",
- "inplace_vector",
- "linalg",
- "rcu",
- "spanstream",
- "stacktrace",
- "stdfloat",
- "text_encoding",
-]))
+headers_not_available = list(
+ map(
+ Header,
+ [
+ "debugging",
+ "flat_map",
+ "flat_set",
+ "generator",
+ "hazard_pointer",
+ "inplace_vector",
+ "linalg",
+ "rcu",
+ "spanstream",
+ "stacktrace",
+ "stdfloat",
+ "text_encoding",
+ ],
+ )
+)
header_restrictions = {
# headers with #error directives
"atomic": "_LIBCPP_HAS_ATOMIC_HEADER",
"stdatomic.h": "_LIBCPP_HAS_ATOMIC_HEADER",
|
bbeecef
to
4e981c9
Compare
It looks like the issue has to do with Lit's builtin |
4e981c9
to
1f80ab9
Compare
So I don't think the diff tool is at fault here, because using On my local machine, I reproduced the issue by creating a EDIT: And |
Since we don't generate a full dependency graph of headers, we can greatly simplify the script that parses the result of --trace-includes. At the same time, we also unify the mechanism for detecting whether a header is a public/C compat/internal/etc header with the existing mechanism in header_information.py. As a drive-by this fixes the headers_in_modulemap.sh.py test which had been disabled by mistake because it used its own way of determining the list of libc++ headers. By consistently using header_information.py to get that information, problems like this shouldn't happen anymore.
1f80ab9
to
162bef9
Compare
I'll merge this to unblock a chain of dependencies getting longer and longer. I'm willing to take any post-review feedback and I can create a follow-up PR, but for now this fixes some concrete issues preventing us (amongst other things) to move to Docker-in-Docker and update the Docker images, which is blocking the clang-tidy tests re-enablement. |
Since we don't generate a full dependency graph of headers, we can
greatly simplify the script that parses the result of --trace-includes.
At the same time, we also unify the mechanism for detecting whether a
header is a public/C compat/internal/etc header with the existing
mechanism in header_information.py.
As a drive-by this fixes the headers_in_modulemap.sh.py test which had
been disabled by mistake because it used its own way of determining
the list of libc++ headers. By consistently using header_information.py
to get that information, problems like this shouldn't happen anymore.
This should also unblock #110303, which was blocked because of
a brittle implementation of the transitive includes check which broke
when the repository was cloned at a path like /path/__something/more.