-
Notifications
You must be signed in to change notification settings - Fork 146
The extractor sometimes picks a non-package-local .cc file's command for header files #26
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
Hey, @matta! Thanks as always for your concern and careful eye. I definitely know what's happening here; we're currently just picking a command arbitrarily among those used to compile each header.Recall that the header-command-finding mechanism is working by determining all the headers used in compiling a given source file. (This includes transitive inclusion.) Since headers aren't otherwise compiled, reversing the mapping lets us trace all the compile commands that parse each header! We're then just picking one (arbitrary) command for each header, among the many that might compile it--and emitting that that command for the header in compile_commands.json. (Once upon a time, we used to include all commands for each header, but this blows up the compile_commands.json file size, since some headers are included numerous times. See #6. This was blocking some people, so I fixed it in 084957e) Clangd just picked an arbitrary one anyway, so, faster, smaller output, no functionality changed for users. There's no logic here like currently like "prefer the command for the closest source file" -- on the theory that we just needed one representative command, used to compile the file in the targets specified. FWIW, there may indeed be no close source file. Think templates, header-only libraries, or prebuilt libraries. Assuming the right source is nearby is what gets clangd's current heuristics in trouble, requiring this whole header workaround. compile_commands.json wasn't originally intended to contain source files in the first place! I'm very much hoping that clangd will fix the underlying issue relatively soon, and then we can put this necessary but fairly yucky workaround to rest. Anyway, it sounds like the crux of your point is that picking the nearest file's command is most likely to be correct, so we should pick it instead of an arbitrary one. I get it, and it hadn't occurred to me before. Though like hopefully the point of the test suite is that they all build and run? More generally, it seems like what we really want is preferring ones where the command is in good shape, but it's not quite clear to me the best way to detect that. I should say that with our current solution, we'll at least (gracefully) fail to find headers when the command is sufficiently messed up, preventing those commands from being emitted for headers. If you feel strongly that borked sub targets are a common enough case that we should defend against them, I'm definitely open to a fix that uses a heuristic to pick among header commands, rather than just taking whichever one happens to have come first. Code would mesh in parallel to the commit listed above that picks arbitrarily. Happy to help orient you more if you'd like more than that! Also happy to accept incremental heuristics as you think of them or encounter cases. I should also say that I don't think this is currently on clangd's radar, so if you feel strongly about the importance of this, you might want to chime in on their issue. That way the behavior you like will hang around after we move off the workaround! All the best, P.S. The more we write, the more I'm getting the sense you might be a current or former Googler, too. Is that right? |
I think picking a "nearby" source file for each header is likely to make good sense in most cases, but clangd/clangd#123 (comment) makes an interesting point:
Related is Sam's admission here: clangd/clangd#45 (comment). In most code in Google even header files are "self contained" (https://google.github.io/styleguide/cppguide.html#Self_contained_Headers), and command like Hm, might it work to support that expression of compile_commands.json, for code bases that enforce this style rule? P.S. Yep, worked at Google for 13 years on all sorts of stuff, most recently the C++ library team, left 2.5 years ago. When I joined Chrome, iOS, Android, and Bazel did not exist, the Google BUILD files were just Python, and the build system spat out GNU Makefiles. :-) |
:) Yep, agreed! And cool stuff! Curious what you're up to these days--but I'll connect separately. |
Just landed Lukas' big Windows-prep refactor, so the merge-conflict coast is clear if you want to take a shot at this! |
Saw os.path.commonprefix while landing Windows support and thought back about this. Has this continued to be an issue @matta? |
@matta, I'm tempted to close this, unless it's continuing to be an problem for you and/or worth your tossing up a quick PR to fix |
Sorry I missed this the first time around. Feel free to close as I haven’t
been using Bazel recently.
…On Thu, May 19, 2022 at 8:53 PM Christopher Sauer ***@***.***> wrote:
@matta <https://github.com/matta>, I'm tempted to close this, unless it's
continuing to be an problem for you and/or worth your tossing up a quick PR
to fix
—
Reply to this email directly, view it on GitHub
<#26 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAAJDQVWXA276TGSWTS65ADVK4ECXANCNFSM5NYNOIEA>
.
You are receiving this because you were mentioned.Message ID:
***@***.***
.com>
|
Will do. Curious what you've moved on to! If anyone else reading this later has the same need--holler and we'll reopen. |
I'm digging into this now, mostly because we also use compile_commands.json to drive a clang-tidy job, and while everything builds fine, the hedron stuff is picking exemplar cpp files which are actually generated ones (e.g. cython .cxx) that we do not care are not tidy. So I'm trying to find the spot in refresh.template.py where I can add an exclusion pattern to the source files that it chooses from when picking some arbitrary source file to drive the compilation of a given header.... pointers appreciated! |
@cpsauer could you point me to the spot in the python which would be where one would put these heuristics in? I could do what I need there, I think... |
Okay I found something that works, I think - in my branch, I added another parameter which contains a list of source file extensions to treat as ineligible for header file command generation, exactly as is done hard-coded for assembly stuff here https://github.com/hedronvision/bazel-compile-commands-extractor/blob/main/refresh.template.py#L670 PS - I am happy to update my PR with these various changes but I haven't seen any traction on it so for now I'm just carrying on in our internal mirror with this stuff. |
Uh oh!
There was an error while loading. Please reload this page.
In most cases the commands extractor seems to generate compile commands that builds the "most closely related"
.cc
file with each header. E.g. a compile command for "a/thing.h
" would actually build "a/thing.cc
", and those are usually paired in a singlecc_library
rule in the//a
package. This is unsurprising and intuitive; if I wanted to check ifa/thing.h
compiled I'd probably build//a:thing
. This is also whatbazel build --compile_one_dependency a/thing.h
would do.I have noticed that sometimes the bazel-compile-commands-extractor does something else -- it will pick a
.cc
file in some other package, when a.cc
file in the same package is present and would be a better choice.This was confusing to me in one situation, where I was changing an API and knew most of my project would be temporarily broken. However, I had fixed the local code, so (in the terms of the example above)
bazel build //a/...
worked andbazel test //a/...
passed, but my clangd fora/thing.h
was in a borked state because some other package in my project was not building. I can't remember exactly, but I may have also tried to runclang-tidy thing.h
and had it fail for the same reason.Here is one example:
I have a
cc_library
rule:But my
compile_commands.json
uses a.cxx
file from some other package when building"base/test.hxx"
:Here is the rule with
wavl/tree_test.cxx
:Bazel has some idea that
base/test.cxx
is a better choice, becausebazel build --subcommands --compile_one_dependency base/test.hxx
picks the//base:test
target:The text was updated successfully, but these errors were encountered: