Commit 8fb8ad6

[BOLT] Delta-encode function start addresses in BAT (#76902)
Further reduce the size of BAT section:
- large binary: to 12716312 bytes (0.33x original),
- medium binary: to 1649472 bytes (0.28x original),
- small binary: to 428 bytes (0.30x original).

Test Plan: Updated bolt/test/X86/bolt-address-translation.test
1 parent bbe0798 commit 8fb8ad6

3 files changed, 10 additions and 4 deletions.

bolt/docs/BAT.md

Lines changed: 3 additions & 1 deletion
@@ -64,9 +64,11 @@ Header:
 | `NumFuncs` | ULEB128 | Number of functions in the functions table |
 
 The header is followed by Functions table with `NumFuncs` entries.
+Output binary addresses are delta encoded, meaning that only the difference with
+the previous output address is stored. Addresses implicitly start at zero.
 | Entry | Encoding | Description |
 | ------ | ------| ----------- |
-| `Address` | ULEB128 | Function address in the output binary |
+| `Address` | Delta, ULEB128 | Function address in the output binary |
 | `NumEntries` | ULEB128 | Number of address translation entries for a function |
 
 Function header is followed by `NumEntries` pairs of offsets for current
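
To see why storing deltas shrinks the section, it helps to count ULEB128 bytes. ULEB128 spends one byte per 7 bits of payload, so a small difference between neighbouring function addresses takes fewer bytes than a full address. The sketch below is illustrative only: `ulebSize`, `absoluteSize`, and `deltaSize` are hypothetical helpers that are not part of BOLT, and it assumes the addresses are visited in ascending order.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Number of bytes ULEB128 needs for Value (7 payload bits per byte).
inline std::size_t ulebSize(uint64_t Value) {
  std::size_t Size = 1;
  while (Value >>= 7)
    ++Size;
  return Size;
}

// Bytes used by the Address fields when every address is stored in full.
inline std::size_t absoluteSize(const std::vector<uint64_t> &Addrs) {
  std::size_t Total = 0;
  for (uint64_t A : Addrs)
    Total += ulebSize(A);
  return Total;
}

// Bytes used when each address is stored as the difference from the
// previous one, starting from zero.
inline std::size_t deltaSize(const std::vector<uint64_t> &Addrs) {
  std::size_t Total = 0;
  uint64_t Prev = 0;
  for (uint64_t A : Addrs) {
    // Assumption of this sketch: input is sorted, so deltas stay small.
    assert(A >= Prev && "addresses must be in ascending order");
    Total += ulebSize(A - Prev);
    Prev = A;
  }
  return Total;
}
```

For the made-up addresses 0x400000, 0x400040, and 0x4000c0, `absoluteSize` gives 12 bytes and `deltaSize` gives 7: only the first entry pays for a full address.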

bolt/lib/Profile/BoltAddressTranslation.cpp

Lines changed: 6 additions & 2 deletions
@@ -106,13 +106,15 @@ void BoltAddressTranslation::write(const BinaryContext &BC, raw_ostream &OS) {
   const uint32_t NumFuncs = Maps.size();
   encodeULEB128(NumFuncs, OS);
   LLVM_DEBUG(dbgs() << "Writing " << NumFuncs << " functions for BAT.\n");
+  uint64_t PrevAddress = 0;
   for (auto &MapEntry : Maps) {
     const uint64_t Address = MapEntry.first;
     MapTy &Map = MapEntry.second;
     const uint32_t NumEntries = Map.size();
     LLVM_DEBUG(dbgs() << "Writing " << NumEntries << " entries for 0x"
                       << Twine::utohexstr(Address) << ".\n");
-    encodeULEB128(Address, OS);
+    encodeULEB128(Address - PrevAddress, OS);
+    PrevAddress = Address;
     encodeULEB128(NumEntries, OS);
     uint64_t InOffset = 0, OutOffset = 0;
     // Output and Input addresses and delta-encoded
@@ -160,8 +162,10 @@ std::error_code BoltAddressTranslation::parse(StringRef Buf) {
   Error Err(Error::success());
   const uint32_t NumFunctions = DE.getULEB128(&Offset, &Err);
   LLVM_DEBUG(dbgs() << "Parsing " << NumFunctions << " functions\n");
+  uint64_t PrevAddress = 0;
   for (uint32_t I = 0; I < NumFunctions; ++I) {
-    const uint64_t Address = DE.getULEB128(&Offset, &Err);
+    const uint64_t Address = PrevAddress + DE.getULEB128(&Offset, &Err);
+    PrevAddress = Address;
     const uint32_t NumEntries = DE.getULEB128(&Offset, &Err);
     MapTy Map;
 
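
The paired changes to `write` and `parse` can be checked in isolation with a round trip. This is a minimal sketch, not the BOLT implementation: it uses `std::vector<uint8_t>` in place of `raw_ostream` and `DataExtractor`, hand-written stand-ins (`encodeUleb`, `decodeUleb`) for the ULEB128 routines, and hypothetical helpers `writeAddresses` and `readAddresses`. It also leaves out the `NumEntries` field and the per-function entries that sit between addresses in the real section, and it assumes the addresses arrive in ascending order.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-in ULEB128 encoder: append Value to Out, 7 bits per byte.
inline void encodeUleb(uint64_t Value, std::vector<uint8_t> &Out) {
  do {
    uint8_t Byte = Value & 0x7f;
    Value >>= 7;
    if (Value != 0)
      Byte |= 0x80; // continuation bit: more bytes follow
    Out.push_back(Byte);
  } while (Value != 0);
}

// Stand-in ULEB128 decoder: read one value and advance Offset.
// No bounds or overflow checks; the input is assumed to be well formed.
inline uint64_t decodeUleb(const std::vector<uint8_t> &In,
                           std::size_t &Offset) {
  uint64_t Value = 0;
  unsigned Shift = 0;
  uint8_t Byte;
  do {
    Byte = In[Offset++];
    Value |= uint64_t(Byte & 0x7f) << Shift;
    Shift += 7;
  } while (Byte & 0x80);
  return Value;
}

// Writer side: store each address as the difference from the previous one.
inline std::vector<uint8_t> writeAddresses(const std::vector<uint64_t> &Addrs) {
  std::vector<uint8_t> Out;
  uint64_t PrevAddress = 0; // addresses implicitly start at zero
  for (uint64_t Address : Addrs) {
    assert(Address >= PrevAddress && "addresses must be in ascending order");
    encodeUleb(Address - PrevAddress, Out);
    PrevAddress = Address;
  }
  return Out;
}

// Reader side: rebuild each address by accumulating the stored deltas.
inline std::vector<uint64_t> readAddresses(const std::vector<uint8_t> &In,
                                           std::size_t Count) {
  std::vector<uint64_t> Addrs;
  std::size_t Offset = 0;
  uint64_t PrevAddress = 0;
  for (std::size_t I = 0; I < Count; ++I) {
    const uint64_t Address = PrevAddress + decodeUleb(In, Offset);
    PrevAddress = Address;
    Addrs.push_back(Address);
  }
  return Addrs;
}
```

The writer and reader must agree on the starting value of `PrevAddress`; both use zero here, matching the "addresses implicitly start at zero" rule in the updated documentation.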

bolt/test/X86/bolt-address-translation.test

Lines changed: 1 addition & 1 deletion
@@ -37,7 +37,7 @@
 # CHECK: BOLT: 3 out of 7 functions were overwritten.
 # CHECK: BOLT-INFO: Wrote 6 BAT maps
 # CHECK: BOLT-INFO: Wrote 3 BAT cold-to-hot entries
-# CHECK: BOLT-INFO: BAT section size (bytes): 436
+# CHECK: BOLT-INFO: BAT section size (bytes): 428
 #
 # usqrt mappings (hot part). We match against any key (left side containing
 # the bolted binary offsets) because BOLT may change where it puts instructions
