Skip to content

Find LLVM distributions automatically. #471

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 46 commits into from
May 11, 2025

Conversation

helly25
Copy link
Contributor

@helly25 helly25 commented Mar 12, 2025

See issue: #473

Instead of implementing a complex way of predicting the basename find them by common prefixes in the database we anyway have.

Rather then forcing maintainers to enter more complex information this keeps us at simply adding the versions as generated from the hasher script.

This should also ultimately put issue #464 and any future occurrence to rest.

The results of the new search based algorithm are generally be better. The algorithm removes a lot of the mistakes from the old implementation, is more correct in some cases and performs other new correct decisions. Yet, the new algorithm may not always be perfect, as it may have inherited some bad decisions of the old algorithm.

As the diff in toolchain/internal/llvm_distributions.golden.out.txt shows we only drop del: lines, which means we can find more distributions.

The diff in `` is harder to interpret. The diffs are best viewed side by side as both old and new file are supposed to have the same contents. At least their "keys" are in the same order. I use a custom diff program that is designed for this. It strips the actual error messages. When only looking at new errors (as shown below), then we see no difference:

bazel run //mbo/diff -- --regex_replace_lhs='/(.*)(ERROR).*/\1\2/' --regex_replace_rhs='/(.*)(ERROR).*/\1\2/' ~/llvm_distributions.golden.sel.orig.txt ~/llvm_distributions.golden.sel.find.txt --algorithm=direct | awk '{l=c;c=$0}/^+.*ERROR/{print l"\n"c}'

If we look at all differences, then we see some 782 former errors now resulting in a distribution. They now find the likely best choice.

Some of these can be improved. But imo it is not worth doing so.

The PR does not change anything regarding manually provided information. But some folks will be able to switch to automatic selection instead.

@helly25
Copy link
Contributor Author

helly25 commented Apr 17, 2025

@fmeum @jsharpe @rrbutani

Hi, what is the best way to discuss the following:

@helly25 helly25 force-pushed the feat/find_vs_predict_asename branch from 19b891c to 9229bdb Compare April 17, 2025 14:42
@fmeum
Copy link
Member

fmeum commented Apr 18, 2025

Could you open a discussion and share it in the #cc channel? I don't think I'm a heavy enough user to decide whether this is useful or not.

@helly25
Copy link
Contributor Author

helly25 commented Apr 19, 2025

Could you open a discussion and share it in the #cc channel? I don't think I'm a heavy enough user to decide whether this is useful or not.

See #485

@helly25 helly25 force-pushed the feat/find_vs_predict_asename branch 4 times, most recently from 1e79366 to 29e4f1d Compare April 27, 2025 09:18
@helly25 helly25 changed the title Find V19+ basenames automatically. Find LLVM distributions automatically. Apr 28, 2025
@helly25 helly25 force-pushed the feat/find_vs_predict_asename branch from 0dcd26e to f21e56f Compare May 10, 2025 12:53
helly25 added 14 commits May 10, 2025 14:24
… them by common prefixes in the database we anyway have.

Rather then forcing maintainers to enter more complex information this keeps us at simply adding the versions as generated from the hasher script.
Add the missing distributions starting with 18.1.8.
Exclude old style windows distributions if new style ones are present as well.
Add a test for the distribution finder.
The test computes what would be found given all versions a nd a combination of arch and os.
Then the test compares the result to what is configured and shows the difference.
If the difference is empty then the finder is correct.
There are currently only deleted entries with the last deleted enty in version 18.1.0.
In other words starting with llvm version 18.1.1 the finder is correct.
That also means as long as this is true, when someone adds information for a new version then the test does not need to be updated.
Explicitly switch between requiring to find a distribution vs predicting.
For windows we automatically deduplicate becasue some versions have both LLVM and clang+llvm distributions.
Extend finding check to versions 10+.
@helly25 helly25 force-pushed the feat/find_vs_predict_asename branch from f21e56f to 586b2f4 Compare May 10, 2025 15:37
@helly25 helly25 closed this May 10, 2025
@helly25 helly25 reopened this May 10, 2025
@helly25
Copy link
Contributor Author

helly25 commented May 11, 2025

@fmeum thanks for the approval. I committed your suggestion. Once this is merged I will attempt to simplify things a bit and maybe add or remove a few choices as appropriate.

@fmeum fmeum merged commit 2fdfdd5 into bazel-contrib:master May 11, 2025
36 checks passed
@helly25 helly25 deleted the feat/find_vs_predict_asename branch May 11, 2025 10:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants