-
Notifications
You must be signed in to change notification settings - Fork 200
Fix reporting of bad join orders in recursive predicates #4019
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
The evaluator log summary JSON needs to be read in two passes, to avoid trying to calculate scores for a recursive layer before all the events for it have been processed. Doing it in one pass meant that, for predicates associated with an IN_LAYER event that came after the corresponding COMPUTE_RECURSIVE event, we were missing their delta sizes and were not even attempting to their join-order scores. I had to remove some of the generic machinery for registering log processors, but the JoinOrderScanner was the only processor being registered anyway.
2450cdf
to
84441f0
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR fixes the reporting of bad join orders in recursive predicates by correcting the join order metric calculation and ensuring that problems in recursive layers are correctly reported. Key changes include updating unit tests to support separate non‐recursive and recursive reporting, enhancing the predicate symbol structure with recursion summaries, and refactoring the join order scanner to perform a two‐pass analysis over log files.
Reviewed Changes
Copilot reviewed 9 out of 10 changed files in this pull request and generated no comments.
Show a summary per file
File | Description |
---|---|
extensions/ql-vscode/test/unit-tests/log-scanner.test.ts | Updates test expectations for reporting two problems and adjusting for the new API methods. |
extensions/ql-vscode/src/log-insights/summary-parser.ts | Adds a new recursionSummaries field to predicate symbols. |
extensions/ql-vscode/src/log-insights/performance-comparison.ts | Minor refactoring to align with removed interface requirements. |
extensions/ql-vscode/src/log-insights/log-scanner.ts | Changes API to support distinct reporting for non-recursive and recursive predicates. |
extensions/ql-vscode/src/log-insights/log-scanner-service.ts | Updates the service to use the new scanAndReportJoinOrderProblems function. |
extensions/ql-vscode/src/log-insights/join-order.ts | Major refactoring of join order analysis – splitting scanning into a predicate size collection pass and a join order evaluation pass. |
extensions/ql-vscode/src/extension.ts | Removes the legacy JoinOrderScannerProvider registration. |
extensions/ql-vscode/src/compare-performance/compare-performance-view.ts | Adjusts scanning logic to use the new file reading and progress reporting patterns. |
extensions/ql-vscode/CHANGELOG.md | Documents the fix for bad join orders in recursive predicates. |
Files not reviewed (1)
- extensions/ql-vscode/test/unit-tests/data/evaluator-log-summaries/bad-join-order.jsonl: Language not supported
Previously, the extension was attempting to calculate join-order scores for recursive predicates, but for some reason was not actually reporting those that exceeded the user's configured threshold. The calculation was also incorrect, since it was calculating scores before seeing all the events for a recursive layer.
This PR enables that reporting and fixes the scoring.
Note that it relies on a corresponding change to the CodeQL CLI, to emit location information for recursion summaries in the
evaluator-log.summary.symbols.json
file. If using an older CLI, this information will be missing and the extension will not report bad join orders for recursive predicates. Since this is no worse than the current behaviour, I have decided not to implement a fallback.Reviewing commits individually is mildly recommended.