-
Notifications
You must be signed in to change notification settings - Fork 13.5k
[LLVM][PassBuilder] Extend the function signature of callback for optimizer pipeline extension point #100953
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This stack of pull requests is managed by Graphite. Learn more about stacking. |
@llvm/pr-subscribers-clang-codegen @llvm/pr-subscribers-backend-amdgpu Author: Shilei Tian (shiltian) ChangesThese callbacks can be invoked in multiple places when building an optimization In this patch, an extra argument is added to indicate its (Thin)LTO stage such Full diff: https://github.com/llvm/llvm-project/pull/100953.diff 7 Files Affected:
diff --git a/clang/lib/CodeGen/BackendUtil.cpp b/clang/lib/CodeGen/BackendUtil.cpp
index e765bbf637a66..64f0020a170aa 100644
--- a/clang/lib/CodeGen/BackendUtil.cpp
+++ b/clang/lib/CodeGen/BackendUtil.cpp
@@ -643,7 +643,7 @@ static void addKCFIPass(const Triple &TargetTriple, const LangOptions &LangOpts,
// Ensure we lower KCFI operand bundles with -O0.
PB.registerOptimizerLastEPCallback(
- [&](ModulePassManager &MPM, OptimizationLevel Level) {
+ [&](ModulePassManager &MPM, OptimizationLevel Level, ThinOrFullLTOPhase) {
if (Level == OptimizationLevel::O0 &&
LangOpts.Sanitize.has(SanitizerKind::KCFI))
MPM.addPass(createModuleToFunctionPassAdaptor(KCFIPass()));
@@ -662,8 +662,8 @@ static void addKCFIPass(const Triple &TargetTriple, const LangOptions &LangOpts,
static void addSanitizers(const Triple &TargetTriple,
const CodeGenOptions &CodeGenOpts,
const LangOptions &LangOpts, PassBuilder &PB) {
- auto SanitizersCallback = [&](ModulePassManager &MPM,
- OptimizationLevel Level) {
+ auto SanitizersCallback = [&](ModulePassManager &MPM, OptimizationLevel Level,
+ ThinOrFullLTOPhase) {
if (CodeGenOpts.hasSanitizeCoverage()) {
auto SancovOpts = getSancovOptsFromCGOpts(CodeGenOpts);
MPM.addPass(SanitizerCoveragePass(
@@ -749,7 +749,7 @@ static void addSanitizers(const Triple &TargetTriple,
PB.registerOptimizerEarlyEPCallback(
[SanitizersCallback](ModulePassManager &MPM, OptimizationLevel Level) {
ModulePassManager NewMPM;
- SanitizersCallback(NewMPM, Level);
+ SanitizersCallback(NewMPM, Level, ThinOrFullLTOPhase::None);
if (!NewMPM.isEmpty()) {
// Sanitizers can abandon<GlobalsAA>.
NewMPM.addPass(RequireAnalysisPass<GlobalsAA, llvm::Module>());
@@ -1018,11 +1018,12 @@ void EmitAssemblyHelper::RunOptimizationPipeline(
// TODO: Consider passing the MemoryProfileOutput to the pass builder via
// the PGOOptions, and set this up there.
if (!CodeGenOpts.MemoryProfileOutput.empty()) {
- PB.registerOptimizerLastEPCallback(
- [](ModulePassManager &MPM, OptimizationLevel Level) {
- MPM.addPass(createModuleToFunctionPassAdaptor(MemProfilerPass()));
- MPM.addPass(ModuleMemProfilerPass());
- });
+ PB.registerOptimizerLastEPCallback([](ModulePassManager &MPM,
+ OptimizationLevel Level,
+ ThinOrFullLTOPhase) {
+ MPM.addPass(createModuleToFunctionPassAdaptor(MemProfilerPass()));
+ MPM.addPass(ModuleMemProfilerPass());
+ });
}
if (CodeGenOpts.FatLTO) {
diff --git a/llvm/include/llvm/Passes/PassBuilder.h b/llvm/include/llvm/Passes/PassBuilder.h
index 474a19531ff5d..4c2763404ff05 100644
--- a/llvm/include/llvm/Passes/PassBuilder.h
+++ b/llvm/include/llvm/Passes/PassBuilder.h
@@ -497,7 +497,8 @@ class PassBuilder {
/// This extension point allows adding optimizations at the very end of the
/// function optimization pipeline.
void registerOptimizerLastEPCallback(
- const std::function<void(ModulePassManager &, OptimizationLevel)> &C) {
+ const std::function<void(ModulePassManager &, OptimizationLevel,
+ ThinOrFullLTOPhase)> &C) {
OptimizerLastEPCallbacks.push_back(C);
}
@@ -630,7 +631,8 @@ class PassBuilder {
void invokeOptimizerEarlyEPCallbacks(ModulePassManager &MPM,
OptimizationLevel Level);
void invokeOptimizerLastEPCallbacks(ModulePassManager &MPM,
- OptimizationLevel Level);
+ OptimizationLevel Level,
+ ThinOrFullLTOPhase Phase);
void invokeFullLinkTimeOptimizationEarlyEPCallbacks(ModulePassManager &MPM,
OptimizationLevel Level);
void invokeFullLinkTimeOptimizationLastEPCallbacks(ModulePassManager &MPM,
@@ -755,7 +757,9 @@ class PassBuilder {
// Module callbacks
SmallVector<std::function<void(ModulePassManager &, OptimizationLevel)>, 2>
OptimizerEarlyEPCallbacks;
- SmallVector<std::function<void(ModulePassManager &, OptimizationLevel)>, 2>
+ SmallVector<std::function<void(ModulePassManager &, OptimizationLevel,
+ ThinOrFullLTOPhase)>,
+ 2>
OptimizerLastEPCallbacks;
SmallVector<std::function<void(ModulePassManager &, OptimizationLevel)>, 2>
FullLinkTimeOptimizationEarlyEPCallbacks;
diff --git a/llvm/lib/Passes/PassBuilderPipelines.cpp b/llvm/lib/Passes/PassBuilderPipelines.cpp
index 757b20dcd6693..5827a485879d3 100644
--- a/llvm/lib/Passes/PassBuilderPipelines.cpp
+++ b/llvm/lib/Passes/PassBuilderPipelines.cpp
@@ -365,9 +365,10 @@ void PassBuilder::invokeOptimizerEarlyEPCallbacks(ModulePassManager &MPM,
C(MPM, Level);
}
void PassBuilder::invokeOptimizerLastEPCallbacks(ModulePassManager &MPM,
- OptimizationLevel Level) {
+ OptimizationLevel Level,
+ ThinOrFullLTOPhase Phase) {
for (auto &C : OptimizerLastEPCallbacks)
- C(MPM, Level);
+ C(MPM, Level, Phase);
}
void PassBuilder::invokeFullLinkTimeOptimizationEarlyEPCallbacks(
ModulePassManager &MPM, OptimizationLevel Level) {
@@ -1524,7 +1525,7 @@ PassBuilder::buildModuleOptimizationPipeline(OptimizationLevel Level,
MPM.addPass(createModuleToFunctionPassAdaptor(std::move(OptimizePM),
PTO.EagerlyInvalidateAnalyses));
- invokeOptimizerLastEPCallbacks(MPM, Level);
+ invokeOptimizerLastEPCallbacks(MPM, Level, LTOPhase);
// Split out cold code. Splitting is done late to avoid hiding context from
// other optimizations and inadvertently regressing performance. The tradeoff
@@ -1671,7 +1672,8 @@ PassBuilder::buildThinLTOPreLinkDefaultPipeline(OptimizationLevel Level) {
// optimization is going to be done in PostLink stage, but clang can't add
// callbacks there in case of in-process ThinLTO called by linker.
invokeOptimizerEarlyEPCallbacks(MPM, Level);
- invokeOptimizerLastEPCallbacks(MPM, Level);
+ invokeOptimizerLastEPCallbacks(MPM, Level,
+ ThinOrFullLTOPhase::ThinLTOPreLink);
// Emit annotation remarks.
addAnnotationRemarksPass(MPM);
@@ -2159,7 +2161,7 @@ ModulePassManager PassBuilder::buildO0DefaultPipeline(OptimizationLevel Level,
CoroPM.addPass(GlobalDCEPass());
MPM.addPass(CoroConditionalWrapper(std::move(CoroPM)));
- invokeOptimizerLastEPCallbacks(MPM, Level);
+ invokeOptimizerLastEPCallbacks(MPM, Level, ThinOrFullLTOPhase::None);
if (LTOPreLink)
addRequiredLTOPreLinkPasses(MPM);
diff --git a/llvm/lib/Target/AMDGPU/AMDGPU.h b/llvm/lib/Target/AMDGPU/AMDGPU.h
index 46cc5f349555a..50aef36724f70 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPU.h
+++ b/llvm/lib/Target/AMDGPU/AMDGPU.h
@@ -287,8 +287,13 @@ class AMDGPUAttributorPass : public PassInfoMixin<AMDGPUAttributorPass> {
private:
TargetMachine &TM;
+ /// Asserts whether we can assume whole program visibility.
+ bool HasWholeProgramVisibility = false;
+
public:
- AMDGPUAttributorPass(TargetMachine &TM) : TM(TM){};
+ AMDGPUAttributorPass(TargetMachine &TM,
+ bool HasWholeProgramVisibility = false)
+ : TM(TM), HasWholeProgramVisibility(HasWholeProgramVisibility) {};
PreservedAnalyses run(Module &M, ModuleAnalysisManager &AM);
};
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUAttributor.cpp b/llvm/lib/Target/AMDGPU/AMDGPUAttributor.cpp
index 51968063e8919..ab98da31b050f 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUAttributor.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUAttributor.cpp
@@ -1023,7 +1023,8 @@ static void addPreloadKernArgHint(Function &F, TargetMachine &TM) {
}
}
-static bool runImpl(Module &M, AnalysisGetter &AG, TargetMachine &TM) {
+static bool runImpl(Module &M, AnalysisGetter &AG, TargetMachine &TM,
+ bool HasWholeProgramVisibility) {
SetVector<Function *> Functions;
for (Function &F : M) {
if (!F.isIntrinsic())
@@ -1041,6 +1042,7 @@ static bool runImpl(Module &M, AnalysisGetter &AG, TargetMachine &TM) {
&AAUnderlyingObjects::ID, &AAIndirectCallInfo::ID});
AttributorConfig AC(CGUpdater);
+ AC.IsClosedWorldModule = HasWholeProgramVisibility;
AC.Allowed = &Allowed;
AC.IsModulePass = true;
AC.DefaultInitializeLiveInternals = false;
@@ -1086,7 +1088,7 @@ class AMDGPUAttributorLegacy : public ModulePass {
bool runOnModule(Module &M) override {
AnalysisGetter AG(this);
- return runImpl(M, AG, *TM);
+ return runImpl(M, AG, *TM, /*HasWholeProgramVisibility=*/false);
}
void getAnalysisUsage(AnalysisUsage &AU) const override {
@@ -1107,8 +1109,9 @@ PreservedAnalyses llvm::AMDGPUAttributorPass::run(Module &M,
AnalysisGetter AG(FAM);
// TODO: Probably preserves CFG
- return runImpl(M, AG, TM) ? PreservedAnalyses::none()
- : PreservedAnalyses::all();
+ return runImpl(M, AG, TM, HasWholeProgramVisibility)
+ ? PreservedAnalyses::none()
+ : PreservedAnalyses::all();
}
char AMDGPUAttributorLegacy::ID = 0;
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
index c8fb68d1c0b0c..50cc2d871d4ec 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
@@ -735,12 +735,15 @@ void AMDGPUTargetMachine::registerPassBuilderCallbacks(PassBuilder &PB) {
});
// FIXME: Why is AMDGPUAttributor not in CGSCC?
- PB.registerOptimizerLastEPCallback(
- [this](ModulePassManager &MPM, OptimizationLevel Level) {
- if (Level != OptimizationLevel::O0) {
- MPM.addPass(AMDGPUAttributorPass(*this));
- }
- });
+ PB.registerOptimizerLastEPCallback([this](ModulePassManager &MPM,
+ OptimizationLevel Level,
+ ThinOrFullLTOPhase Phase) {
+ if (Level != OptimizationLevel::O0) {
+ MPM.addPass(AMDGPUAttributorPass(
+ *this, Phase == ThinOrFullLTOPhase::FullLTOPostLink ||
+ Phase == ThinOrFullLTOPhase::ThinLTOPostLink));
+ }
+ });
PB.registerFullLinkTimeOptimizationLastEPCallback(
[this](ModulePassManager &PM, OptimizationLevel Level) {
diff --git a/llvm/tools/opt/NewPMDriver.cpp b/llvm/tools/opt/NewPMDriver.cpp
index 374698083763b..522a8c06d83c0 100644
--- a/llvm/tools/opt/NewPMDriver.cpp
+++ b/llvm/tools/opt/NewPMDriver.cpp
@@ -310,7 +310,7 @@ static void registerEPCallbacks(PassBuilder &PB) {
});
if (tryParsePipelineText<ModulePassManager>(PB, OptimizerLastEPPipeline))
PB.registerOptimizerLastEPCallback(
- [&PB](ModulePassManager &PM, OptimizationLevel) {
+ [&PB](ModulePassManager &PM, OptimizationLevel, ThinOrFullLTOPhase) {
ExitOnError Err("Unable to parse OptimizerLastEP pipeline: ");
Err(PB.parsePassPipeline(PM, OptimizerLastEPPipeline));
});
|
I thought I tried this myself. Does it really work, as in, is the mode set properly for full and thin lto? I think I had problems for one of them. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems fine to me in general. The patch stack seems to be messed up though, or at least this seems to contain some unrelated AMDGPU changes.
@jdoerfert Possibly the issue you saw is that this callback just isn't used by the post-link full LTO pipeline at all? It uses invokeFullLinkTimeOptimizationLastEPCallbacks instead. Maybe with this change it would make sense to remove that one and merge it into the main OptimizerLastEPCallback with the appropriate LTOPhase?
The other thing I wonder about is whether this argument should be added to other callbacks as well for consistency.
It has some AMD changes because I'd like to demonstrate how the changes will be used.
Yeah, I was wondering that as well. I'm happy to do the changes but not sure if that's necessary. |
76c0d45
to
f4f6b93
Compare
9c94910
to
ed46483
Compare
2bbf195
to
9b12959
Compare
ed46483
to
9980c1f
Compare
ec07983
to
76e5c51
Compare
9980c1f
to
6e26b39
Compare
76e5c51
to
aa74047
Compare
6e26b39
to
8df9dec
Compare
aa74047
to
52df8e6
Compare
8df9dec
to
913f6a7
Compare
ping if it is preferred to split the AMDGPU related changes to another PR, I can do that. |
52df8e6
to
5da7378
Compare
913f6a7
to
842d222
Compare
5da7378
to
5d7487b
Compare
842d222
to
225eca0
Compare
5d7487b
to
ea3f6bb
Compare
225eca0
to
699c144
Compare
699c144
to
602fcdf
Compare
ping +1 |
This PR will not be needed if we decide to move |
…imizer pipeline extension point These callbacks can be invoked in multiple places when building an optimization pipeline, both in compile time and link time. However, there is no indicator on what pipeline it is currently building. In this patch, an extra argument is added to indicate its (Thin)LTO stage such that the callback can check it if needed. There is no test expected from this, and the benefit of this change will be demonstrated in #66488.
602fcdf
to
0110da5
Compare
Split it, and maybe add the Phase to the rest as well. One commit to just make it consistent and pass the information along.
The problem is that the OptimizerLastEPCallback is not called in FullLTO post link, IIRC. So this will alone not solve our issue. However, we still should tell the callbacks what phase we are in, consistently. |
This PR is no longer needed. We decided to move the |
These callbacks can be invoked in multiple places when building an optimization
pipeline, both in compile time and link time. However, there is no indicator on
what pipeline it is currently building.
In this patch, an extra argument is added to indicate its (Thin)LTO stage such
that the callback can check it if needed. There is no test expected from this.