-
Notifications
You must be signed in to change notification settings - Fork 15.2k
[BOLT][NFC] Register profiled functions once #150622
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[BOLT][NFC] Register profiled functions once #150622
Conversation
Created using spr 1.3.4 [skip ci]
Created using spr 1.3.4
@llvm/pr-subscribers-bolt Author: Amir Ayupov (aaupov) ChangesWhile registering profiled functions, only handle each address once. Test Plan: Full diff: https://github.com/llvm/llvm-project/pull/150622.diff 2 Files Affected:
diff --git a/bolt/include/bolt/Profile/DataAggregator.h b/bolt/include/bolt/Profile/DataAggregator.h
index db0f6903185b7..cb1b87f8d0d65 100644
--- a/bolt/include/bolt/Profile/DataAggregator.h
+++ b/bolt/include/bolt/Profile/DataAggregator.h
@@ -502,6 +502,9 @@ class DataAggregator : public DataReader {
/// entries).
void imputeFallThroughs();
+ /// Register profiled functions for lite mode.
+ void registerProfiledFunctions();
+
/// Debugging dump methods
void dump() const;
void dump(const PerfBranchSample &Sample) const;
diff --git a/bolt/lib/Profile/DataAggregator.cpp b/bolt/lib/Profile/DataAggregator.cpp
index 3604fdd3a94b4..c13fa6dbe582b 100644
--- a/bolt/lib/Profile/DataAggregator.cpp
+++ b/bolt/lib/Profile/DataAggregator.cpp
@@ -581,6 +581,26 @@ void DataAggregator::imputeFallThroughs() {
outs() << "BOLT-INFO: imputed " << InferredTraces << " traces\n";
}
+void DataAggregator::registerProfiledFunctions() {
+ DenseSet<uint64_t> Addrs;
+ for (const auto &Trace : llvm::make_first_range(Traces)) {
+ if (Trace.Branch != Trace::FT_ONLY &&
+ Trace.Branch != Trace::FT_EXTERNAL_ORIGIN)
+ Addrs.insert(Trace.Branch);
+ Addrs.insert(Trace.From);
+ }
+
+ for (const auto [PC, _] : BasicSamples)
+ Addrs.insert(PC);
+
+ for (const PerfMemSample &MemSample : MemSamples)
+ Addrs.insert(MemSample.PC);
+
+ for (const uint64_t Addr : Addrs)
+ if (BinaryFunction *Func = getBinaryFunctionContainingAddress(Addr))
+ Func->setHasProfileAvailable();
+}
+
Error DataAggregator::preprocessProfile(BinaryContext &BC) {
this->BC = &BC;
@@ -603,6 +623,7 @@ Error DataAggregator::preprocessProfile(BinaryContext &BC) {
exit(0);
}
+ registerProfiledFunctions();
return Error::success();
}
@@ -1347,10 +1368,6 @@ std::error_code DataAggregator::parseAggregatedLBREntry() {
}
const uint64_t FromOffset = Addr[0]->Offset;
- BinaryFunction *FromFunc = getBinaryFunctionContainingAddress(FromOffset);
- if (FromFunc)
- FromFunc->setHasProfileAvailable();
-
int64_t Count = Counters[0];
int64_t Mispreds = Counters[1];
@@ -1361,11 +1378,6 @@ std::error_code DataAggregator::parseAggregatedLBREntry() {
return std::error_code();
}
- const uint64_t ToOffset = Addr[1]->Offset;
- BinaryFunction *ToFunc = getBinaryFunctionContainingAddress(ToOffset);
- if (ToFunc)
- ToFunc->setHasProfileAvailable();
-
/// For fall-through types, adjust locations to match Trace container.
if (Type == FT || Type == FT_EXTERNAL_ORIGIN || Type == FT_EXTERNAL_RETURN) {
Addr[2] = Location(Addr[1]->Offset); // Trace To
@@ -1613,9 +1625,6 @@ std::error_code DataAggregator::parseBranchEvents() {
Traces.reserve(TraceMap.size());
for (const auto &[Trace, Info] : TraceMap) {
Traces.emplace_back(Trace, Info);
- for (const uint64_t Addr : {Trace.Branch, Trace.From})
- if (BinaryFunction *BF = getBinaryFunctionContainingAddress(Addr))
- BF->setHasProfileAvailable();
}
clear(TraceMap);
@@ -1676,9 +1685,6 @@ std::error_code DataAggregator::parseBasicEvents() {
continue;
++NumTotalSamples;
- if (BinaryFunction *BF = getBinaryFunctionContainingAddress(Sample->PC))
- BF->setHasProfileAvailable();
-
++BasicSamples[Sample->PC];
EventNames.insert(Sample->EventName);
}
@@ -1716,9 +1722,6 @@ std::error_code DataAggregator::parseMemEvents() {
if (std::error_code EC = Sample.getError())
return EC;
- if (BinaryFunction *BF = getBinaryFunctionContainingAddress(Sample->PC))
- BF->setHasProfileAvailable();
-
MemSamples.emplace_back(std::move(Sample.get()));
}
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, thanks!
Created using spr 1.3.4 [skip ci]
Hey @aaupov, This appears to have broken large bolt-tests: Can you please have a look? |
Summary: We are observing random crashes where addresses provided by the hardware (from LBR) randomly clashes with what LLVM uses in DenseSet as tombstones records (-2 << 12), which causes DenseSet to assert when ingesting this data while running the recently added registerProfiledFunctions().
) In perf2bolt, we are observing sporadic crashes in the recently added registerProfiledFunctions from #150622. Addresses provided by the hardware (from LBR) might be -1, which clashes with what LLVM uses in DenseSet as empty tombstones records. This causes DenseSet to assert with "can't insert empty tombstone into map" when ingesting this data. Revert this change for now to unbreak perf2bolt.
While registering profiled functions, only handle each address once.
Speeds up
DataAggregator::preprocessProfile
.Test Plan:
For intermediate size pre-aggregated profile (10MB), reduces parsing
time from ~0.41s down to ~0.16s.