-
Notifications
You must be signed in to change notification settings - Fork 13.5k
[flang] Add flags controlling whether to run the fir alias tags pass #68595
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This interface allows (HL)FIR passes to add TBAA information to fir.load and fir.store. If present, these TBAA tags take precedence over those added during CodeGen. We can't reuse mlir::LLVMIR::AliasAnalysisOpInterface because that uses the mlir::LLVMIR namespace so it tries to define methods for fir operations in the wrong namespace. But I did re-use the tbaa tag type to minimise boilerplate code. The new builders are to preserve the old interface without the tbaa tag.
For TBBAA attribute types.
See RFC at https://discourse.llvm.org/t/rfc-propagate-fir-alias-analysis-information-using-tbaa/73755 This pass adds TBAA tags to all accesses to non-pointer/target dummy arguments. These TBAA tags tell LLVM that these accesses cannot alias: allowing better dead code elimination, hoisting out of loops, and vectorization. Each function has its own TBAA tree so that accesses between funtions MayAlias after inlining. I also included code for adding tags for local allocations and for global variables. Enabling all three kinds of tag is known to produce a miscompile and so these are disabled by default. But it isn't much code and I thought it could be interesting to play with these later if one is looking at a benchmark which looks like it would benefit from more alias information. I'm open to removing this code too. TBAA tags are also added separately by TBAABuilder during CodeGen. TBAABuilder has to run during CodeGen because it adds tags to box accesses, many of which are implicit in FIR. This pass cannot (easily) run in CodeGen because fir::AliasAnalysis has difficulty tracing values between blocks, and by the time CodeGen runs, structured control flow has already been lowered. Coming in follow up patches - Change CodeGen/TBAABuilder to use TBAAForest to add tags within the same per-function trees as are used here (delayed to a later patch to make it easier to revert) - Command line argument processing to actually enable the pass
This is important to ensure that tags end up in the same trees that were created in the FIR TBAA pass. If they are in different trees then everything in one tree will be assumed to MayAlias with everything in the other tree. This leads to poor performance. @vzakhari requested that the old (not-per-function) trees are maintained so I left the old test intact.
This is fallout from getting argument names from fir.declare operations instead of fir.bindc attributes.
The ultimate intention is to have this pass enabled by default whenever we are optimizing for speed. But for now, just add the arguments so this can be more easily tested. Previous PR in this series: llvm#68437
ba18888
to
9168aa0
Compare
Does |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Not in this patch while -falias-analysis is kept separate and non-default. But that will be the intention in the future. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thank you for addressing my comments!
The ultimate intention is to have this pass enabled by default whenever we are optimizing for speed. But for now, just add the arguments so this can be more easily tested. PR: #68595
Merged manually with ac0015f |
This does not enable the pass by default
The ultimate intention is to have this pass enabled by default whenever
we are optimizing for speed. But for now, just add the arguments so this
can be more easily tested.
Previous PR in this series: #68437
This PR is only for review of ca6751e (and any later fixups)