Skip to content

Conversation

SLTozer
Copy link
Contributor

@SLTozer SLTozer commented Jul 8, 2025

For the purposes of setting up CI that makes use of debugify, this patch adds an alternative mode for the llvm-original-di-preservation.py script, which produces terminal-friendly(-ish) YAML output instead of an HTML report, and sets the return code to 1 if the input file contains errors, or 0 if the input file contains no errors or does not exist, making it simple to use it in CI.

This introduces a small change in existing usage, in that the path for the HTML report file is now passed with --report-file <path> rather than as a positional argument; I could make the argparse logic work without this change, but I believe that is simpler to understand this way, and to my knowledge debugify isn't currently being used in automated environments where changing this might cause issues. As a small change while passing by, I also changed -compress to --compress, for consistency.

As a note for reviewers, the reason that we treat a non-existent input file as a pass is that this is actually the expected state: we use clang to compile numerous files, passing a filepath for debugify errors. Any errors found by debugify will be written to this file; if none are found, the file is untouched. This is also mentioned in a code comment, but I think it useful to state upfront.

Finally, the justification for adding a new mode to this script instead of adding a separate script for the separate functionality is that this script understands debugify's output, and performs some deduplication that is useful for clarifying the resulting output. Writing a new script would require duplicating logic unnecessarily, and risks the scripts falling out-of-sync if changes are made to debugify's output.

For the purposes of setting up CI that makes use of debugify, this patch
adds an alternative flag for the llvm-original-di-preservation.py script,
which produces terminal-friendly output instead of an HTML report, and sets
the return code to 1 if the input file contains errors, or 0 if the input
file contains no errors or does not exist, making it simple to use it in
CI.

This introduces a small change in existing usage, in that the path for the
HTML report file is now passed with `--report-file <path>` rather than as a
positional argument; I could make the argparse logic work without this
change, but I believe that is simpler to understand this way, and to my
knowledge debugify isn't currently being used in automated environments
where changing this might cause issues.

The reason that we treat a non-existent input file as a pass is that this is
actually the expected state: we use clang to compile numerous files, passing
a filepath for debugify errors. Any errors found by debugify will be written
to this file; if none are found, the file is untouched.

The reason that we use this script at all, rather than using a separate
script for what is largely a separate purpose, is that this script
understands debugify's output, and performs some deduplication that is
useful for clarifying the resulting output. Writing a new script would
require duplicating logic unnecessarily.
Copy link

github-actions bot commented Jul 9, 2025

✅ With the latest revision this PR passed the Python code formatter.

Copy link
Member

@jmorse jmorse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good in general; I'm generally unfamiliar with this script though, so I'd much prefer if someone else could review it for approval.

Comment on lines +32 to +34
if self.origin:
result["origin"] = self.origin
return result
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No test coverage for this property ("origin")?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added in latest commit.

"action": self.action,
}


Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this becomes load-bearing, at some point we're going to have to switch over to using the "proper" yaml dumping and pretty-printing facilities. Does it make sense to do that now instead? (I don't think we need to make an all-singing-all-dancing implementation for all scripts, we can just focus on their narrow purpose).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I originally started using "proper" YAML, but this does require an extra package. Currently this (and most other utility scripts) can be run with just python, with no install commands necessary, and I consider preserving this to be a benefit worth some cost. My expectation is, this script has very straightforward and limited functionality, so there isn't too much to gain from using a proper YAML package - only very basic printing logic is required, and the YAML dumping segment is purely for user benefit (it has no expectation of being programmatically parsed).

parser.add_argument("html_file", type=str, help="html file to output data")
parser.add_argument(
"-compress", action="store_true", help="create reduced html report"
parser.add_argument("--compress", action="store_true", help="create reduced report")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel "short" or "reduced" or "summary" would be better for the option name, as "compress" makes me (and others?) think about file compression. YMMV, but if it's not too hard a change IMO it'd be better to not use "compress".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't wholly disagree, but this is a pre-existing flag so I'd prefer not to change it too much in this PR; the only reason for the change here is consistency with the other flags using -- instead of -.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that I think again, probably "reduce" is better match here.

Comment on lines 504 to 512
if opts.error_test:
if os.path.isdir(opts.file_name):
print(f"error: Directory passed as input file: '{opts.file_name}'")
sys.exit(1)
if not os.path.exists(opts.file_name):
# We treat an empty input file as a success, as debugify will generate an output file iff any errors are
# found, meaning we expect 0 errors to mean that the expected file does not exist.
print(f"No errors detected for: {opts.file_name}")
sys.exit(0)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this needed -- a switch enables code-paths to be tested that otherwise are never run?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The --error-test switch means that we're checking to see whether the input file contains any errors - this script will normally simply fail with 1 if we don't have a valid JSON file as input, but in this specific case we want to pass if given the path to a non-existent file because that's what we expect in the case where there are no errors (as Clang will never open the results file to write to it).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does make sense, but I am having problems with the option name. I would rather go with something more intuitive, e.g. --allow-missing-results?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree it could be more clear - it does do more than just allow missing results though, as this is the toggle for a different "mode" of the script. I'll come up with something slightly more descriptive shortly!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've gone for "acceptance test", since that more accurately describes the purpose imo.

@SLTozer
Copy link
Contributor Author

SLTozer commented Jul 16, 2025

I suspect if anyone is well-equipped to review this it would be @djtodoro - otherwise, I think this is a relatively "low-stakes" change.

Copy link
Collaborator

@djtodoro djtodoro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@SLTozer thanks for this, it does make sense!

parser.add_argument("html_file", type=str, help="html file to output data")
parser.add_argument(
"-compress", action="store_true", help="create reduced html report"
parser.add_argument("--compress", action="store_true", help="create reduced report")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that I think again, probably "reduce" is better match here.


report_type_group = parser.add_mutually_exclusive_group(required=True)
report_type_group.add_argument(
"--report-file", type=str, help="output HTML file for the generated report"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use --report-html-file instead?

report_type_group.add_argument(
"--error-test",
action="store_true",
help="if set, produce terminal-friendly output and return 0 iff the input file is empty or does not exist",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, it does make sense.

Comment on lines 504 to 512
if opts.error_test:
if os.path.isdir(opts.file_name):
print(f"error: Directory passed as input file: '{opts.file_name}'")
sys.exit(1)
if not os.path.exists(opts.file_name):
# We treat an empty input file as a success, as debugify will generate an output file iff any errors are
# found, meaning we expect 0 errors to mean that the expected file does not exist.
print(f"No errors detected for: {opts.file_name}")
sys.exit(0)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does make sense, but I am having problems with the option name. I would rather go with something more intuitive, e.g. --allow-missing-results?

Copy link
Collaborator

@djtodoro djtodoro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@SLTozer SLTozer changed the title [Debugify] Add 'error-test' mode for the debugify report script, for CI [Debugify] Add 'acceptance-test' mode for the debugify report script Jul 17, 2025
@SLTozer SLTozer merged commit b7c14b6 into llvm:main Jul 17, 2025
10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants