
Feature Request / Discussion - Allow multiple compile_commands.json / setting the target path to the generated compile_commands.json #48


Closed
molar opened this issue Apr 25, 2022 · 12 comments · Fixed by #44

Comments

@molar

molar commented Apr 25, 2022

We have a rather large repo, which means generating the compile commands takes a very long time.
We would like to split the compile_commands.json into multiple files, allowing us to re-generate only a subset of it. As far as I can tell, multiple compile_commands.json files are supported by clangd.

# packageA/BUILD
refresh_compile_commands(
    name = "refresh_compile_commands",
    targets = {
        "//packageA/...": "",
    },
)

This will currently overwrite the top-level compile_commands.json, so the suggestion is to allow specifying a different path for the generated file.
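For concreteness, something like this is the interface we have in mind. The out_path attribute is purely illustrative, just to show the idea; it doesn't exist today:

# packageA/BUILD
load("@hedron_compile_commands//:refresh_compile_commands.bzl", "refresh_compile_commands")

refresh_compile_commands(
    name = "refresh_compile_commands",
    targets = {
        "//packageA/...": "",
    },
    # Hypothetical: write the result here instead of overwriting the workspace-root file.
    out_path = "packageA/compile_commands.json",
)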

Thanks for your time on this.

@cpsauer
Contributor

cpsauer commented Apr 25, 2022

Hey, Morten! Thanks for your continued interest--and for your thoughtfulness about supporting these broader cases you're encountering.

Sure. Shouldn't be a problem. We can definitely, e.g., let people specify the path.

But before we do, I want to trace things back to the root problem in case there's a less manual way we can solve things for you.

It sounds like the root issue is speed. Is that right? That is, if it ran way faster, you wouldn't need (or want) multiple compile_commands.json? (I have some ideas about how we might be able to speed things up a whole lot by caching the include information between runs, and getting a chunk of it from Bazel. [More here, if curious.])

For my additional understanding/context, could I ask about how long it's taking? (minutes?) And what typically prompts you to rerun?

If you're down, could you also try adding return set() to the top of _get_headers in external/hedron_compile_commands/refresh.template.py and let me know whether that runtime would be fast enough for you? (That gives us an upper bound on the gains from optimizing the include finding.) Might also save a round trip :)
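Concretely, something like this at the top of that function; the real parameter list is elided here, so keep whatever signature your copy has:

def _get_headers(*args, **kwargs):  # existing function in refresh.template.py; real parameters elided
    return set()  # short-circuit: skip header finding entirely, just to measure the upper bound on speedup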

Cheers,
Chris


As self notes for later, if we do end up doing this with multiple compile_commands.json files:

  • We'll need to check that we can easily specify multiple compile_commands.json files via, e.g., --compile-commands-dir. (You need to manually specify them to properly get out-of-tree headers, like system headers.) I'm not super up to speed on the multiple compile_commands.json interface, since it was added after I started using clangd and I haven't needed it yet; my current understanding of the lookup is sketched after this list.
  • We should talk about the interface. Would overriding the path be the best option, vs, like, generating it in a subdirectory? ^ Relates to the above, and depends on whether we'd be able to use files with names besides compile_commands.json.
    • Another option would be partial regeneration and concatenation, but this is starting to look an awful lot like caching--which we should probably just do directly.
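For reference, my current understanding of how clangd finds these files (may well be out of date, so treat it as a sketch): without --compile-commands-dir, clangd searches the edited file's directory and its ancestors for a compile_commands.json, so per-package files would be picked up like this:

workspace/
├── compile_commands.json          # fallback for everything else
├── packageA/
│   ├── compile_commands.json      # used for files under packageA/
│   └── src/foo.cc
└── packageB/
    ├── compile_commands.json      # used for files under packageB/
    └── src/bar.cc

--compile-commands-dir, by contrast, points clangd at a single directory, which is why the out-of-tree-headers case above needs checking.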

@cpsauer
Contributor

cpsauer commented May 4, 2022

@molar, any more thoughts on this?

@molar
Author

molar commented May 11, 2022

Thanks for your patience.

I have collected some metrics on the two cases, one with header scanning and one without, as you suggested. With header scanning it takes 25 minutes; without, it takes 20 seconds. With header scanning there are around 22,000 entries in the compile_commands.json, and without there are around 10,000 entries.

We generate the compile_commands.json on every CI run, for use with SonarQube, and in that case 20 seconds is acceptable for us.

So as you also write, the root issue is actually the time it takes to generate the compile_commands.json.

For now I have a flag that allows me to "disable" the header scanning, and we can upstream that if you are interested.

Another solution could be to allow a pattern for the source files where include scanning is required. That would let us pay the cost only in the cases where clangd would otherwise use the wrong settings.
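Roughly what I'm picturing; the header_scanning_targets attribute is just a straw man to illustrate the idea and doesn't exist:

load("@hedron_compile_commands//:refresh_compile_commands.bzl", "refresh_compile_commands")

refresh_compile_commands(
    name = "refresh_compile_commands",
    targets = {
        "//...": "",
    },
    # Hypothetical: only run the (slow) header scanning for sources matching these patterns.
    header_scanning_targets = [
        "//packageA/...",
    ],
)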

@cpsauer
Contributor

cpsauer commented May 12, 2022

Yeah! No problem. Thanks for testing--and for a great answer. I appreciate your thoroughness.

Does SonarQube need/want headers?

If SonarQube wants headers, I think the caching described above could get times way down towards that 20s mark, if you're otherwise building or have a cache. LMK if we're in this case.

If SonarQube doesn't want headers, #44 will add an exclude_headers toggle shortly, among some other goodies! Rather than kicking off a separate upstreaming effort, I'd love it if you'd take a quick peek at that PR and weigh in if you have ideas on how we could make things even better! Or just tell Dave that you're excited to use his work :)

@cpsauer
Contributor

cpsauer commented May 20, 2022

And @alexander-born, I see your 👀, which I assume means you're watching :)
Are you having a similar need for speed?

[@molar, still curious about the above!]

@alexander-born

I'll be happy once PR #44 gets merged. Dave's PR reduces the time to regenerate the compile commands by a lot (ca. 10-15 min down to 30 s).

@cpsauer
Contributor

cpsauer commented May 20, 2022

Sweet. You've probably already read this, but heads up that you'll likely run into some (very annoying) issues with clangd when you edit headers if you turn header finding off. Are you using this for clangd or for some other tool?

[Current version of #44 doesn't speed things up, I think, but I'll make sure it gets back there.]

I'm trying to feel out whether you guys want headers and therefore the fast/cached header finding--and whether either of you might be willing to help build it.

@cpsauer
Contributor

cpsauer commented May 24, 2022

@alexander-born?
(Also, thanks for testing and weighing in on the PR!)

@alexander-born

alexander-born commented May 25, 2022

I am using clangd and haven't encountered any issues with PR #44 and

    exclude_headers = "all",
    exclude_external_sources = True,

but that's not to say there are no issues without header finding; I just didn't encounter any.
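For context, the rule I'm running looks roughly like this (the targets value is simplified; exclude_headers and exclude_external_sources are the attributes from #44):

load("@hedron_compile_commands//:refresh_compile_commands.bzl", "refresh_compile_commands")

refresh_compile_commands(
    name = "refresh_compile_commands",
    targets = {
        "//...": "",
    },
    exclude_headers = "all",
    exclude_external_sources = True,
)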

@cpsauer
Contributor

cpsauer commented May 26, 2022

Cool. Thanks for letting me know. Working on getting #44 landed.

Just so you're prepared and will recognize the cause if you see it: heads up that you may run into weirdness if/when clangd infers the wrong compile command for a header, since it doesn't yet traverse the include graph (see clangd/clangd#123). Hence all the header finding efforts here. But if it works for you, I'm delighted, and it's really good for me to know that it's usable in some cases without header finding. Could I ask you to let me know if you ever do run into issues there, just so I get a sense?

Also, I'll mark #44 as resolving this for now, since #5 tracks the speed/caching issue. LMK @molar if that seems wrong to you and we can reverse course.

@cpsauer
Contributor

cpsauer commented May 28, 2022

Closed by #44. @molar, hopefully that solves things for you. If not--or if you still need the compiler unwrapping we discussed in #33--please let me know!

@cpsauer
Contributor

cpsauer commented Jun 2, 2022

Hey all! Heads up that I just landed code that makes header finding way faster. We now have caching for header finding, and for clang/gcc we piggyback on Bazel's dependency cache--in addition to the options to control header finding merged previously!

That should be a huge improvement in speed. I don't think this can get much faster; it's now quite good.
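To give a flavor of the Bazel-piggybacking part: the .d dependency files that clang/gcc emit during a build list every header a translation unit pulled in, so header info can be read back from those instead of re-running the preprocessor. A minimal sketch of that idea (not the code that actually landed, and ignoring escaped spaces in paths):

def headers_from_depfile(depfile_path, source_path):
    """Parse a Makefile-style .d file into the set of headers it lists."""
    with open(depfile_path) as f:
        text = f.read()
    # A .d file is one Makefile rule, e.g. "foo.o: foo.cc bar.h baz.h \" with line continuations.
    _, _, deps = text.partition(':')
    entries = deps.replace('\\\n', ' ').split()
    return {e for e in entries if e != source_path}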

(Though if there's a Windows user reading this: we might be able to piggyback on Bazel's cache with MSVC, too. See #49 (comment) if you're interested in helping.)
