[docs][IRPGO]Document two binary formats for instrumentation-based profiles, with a focus on IRPGO. #76105
…when building instrumented binary
… failed when building instrumented binary" This reverts commit e3bfecf.
Please add anyone who might be interested in reviewing; feedback is very welcome, thanks!
@llvm/pr-subscribers-pgo
Author: Mingming Liu (minglotus-6)
Changes
A preview of the HTML is in this link.
Full diff: https://github.com/llvm/llvm-project/pull/76105.diff 2 Files Affected:
diff --git a/llvm/docs/PGOProfileFormat.rst b/llvm/docs/PGOProfileFormat.rst
new file mode 100644
index 00000000000000..5602172e147f00
--- /dev/null
+++ b/llvm/docs/PGOProfileFormat.rst
@@ -0,0 +1,387 @@
+=====================
+IRPGO Profile Format
+=====================
+
+.. contents::
+ :local:
+
+
+Overview
+==========
+
+IR-based instrumentation (IRPGO) and its context-sensitive variant (CS-IRPGO)
+insert `llvm.instrprof.*` `code generator intrinsics <https://llvm.org/docs/LangRef.html#code-generator-intrinsics>`_
+into LLVM IR to generate profiles. This document describes two binary profile
+formats (raw and indexed) used by IR-based instrumentation.
+
+.. note::
+
+ Both the compiler-rt profiling infrastructure and profile format are general
+ and could support other use cases (e.g., coverage and temporal profiling).
+ This document will focus on IRPGO while briefly introducing other use cases
+ with pointers.
+
+Raw PGO Profile Format
+========================
+
+The raw PGO profile is generated by running the instrumented binary. It is a
+memory dump of the profile data.
+
+Two kinds of frequently used profile information are a function's basic block
+counters and its (various flavors of) value profiles. A function's profile
+data spans several sections in the profile.
+
+General Storage Layout
+-----------------------
+
+A raw profile for an executable [1]_ consists of a profile header and several
+sections. The storage layout is illustrated below. Generally, when the raw
+profile is read into a memory buffer, the byte offset of a section is inferred
+from the section's position in the layout and the sizes of all sections
+ahead of it.
+
+::
+
+   +----+-----------------------+
+   |    |        Magic          |
+   |    +-----------------------+
+   |    |        Version        |
+   |    +-----------------------+
+   H    |   Size Info for       |
+   E    |      Section 1        |
+   A    +-----------------------+
+   D    |   Size Info for       |
+   E    |      Section 2        |
+   R    +-----------------------+
+   |    |          ...          |
+   |    +-----------------------+
+   |    |   Size Info for       |
+   |    |      Section N        |
+   +----+-----------------------+
+   P    |       Section 1       |
+   A    +-----------------------+
+   Y    |       Section 2       |
+   L    +-----------------------+
+   O    |          ...          |
+   A    +-----------------------+
+   D    |       Section N       |
+   +----+-----------------------+
+
+
+.. note::
+ Sections might be padded to meet platform-specific alignment requirements.
+ For simplicity, header fields and data sections solely for padding purpose
+ are omitted in the data layout graph above and the rest of this document.
+
+Header
+-------
+
+``Magic``
+ With the magic number, a data consumer can detect the profile format and the
+ endianness of the data, and quickly tell whether/how to continue reading.
+
+``Version``
+ The lower 32 bits specify the actual version and the most significant 32
+ bits specify the variant types of the profile. IRPGO and CS-IRPGO are two
+ variant types.
+
+``BinaryIdsSize``
+ The byte size of binary id section.
+
+``NumData``
+ The number of per-function profile data control structures. The byte size of
+ profile data section could be computed with this field.
+
+``NumCounter``
+ The number of entries in the profile counter section. The byte size of counter
+ section could be computed with this field.
+
+``NumBitmapBytes``
+ The number of bytes in the profile bitmap section.
+
+``NamesSize``
+ The number of bytes in the name section.
+
+``CountersDelta``
+ Records the in-memory address difference between the data and counter section,
+ i.e., `start(__llvm_prf_cnts) - start(__llvm_prf_data)`. It's used jointly
+ with the in-memory address difference of a profile data record and its counter
+ to find the counter of a profile data record. Check out calculation-of-counter-offset_
+ for details.
+
+``BitmapDelta``
+ Records the in-memory address difference between the data and bitmap section,
+ i.e., `start(__llvm_prf_bits) - start(__llvm_prf_data)`. It's used jointly
+ with the in-memory address difference of a profile data record and its bitmap
+ to find the bitmap of a profile data record, in a way similar to how counters
+ are referenced, as explained by calculation-of-counter-offset_ .
+
+``NamesDelta``
+ Records the in-memory address of compressed name section. Not used except for
+ raw profile reader error checking.
+
+``ValueKindLast``
+ Records the number of value kinds. As of writing, two kinds of value profiles
+ are supported. `IndirectCallTarget` is to profile the frequent callees of
+ indirect call instructions and `MemOPSize` is for memory intrinsic function
+ size profiling.
+
+ The number of value kinds affects the byte size of per function profile data
+ control structure.
+
+Payload Sections
+------------------
+
+Binary Ids
+^^^^^^^^^^^
+Stores the binary ids of the instrumented binaries to associate binaries with
+profiles for source code coverage. See `Binary Id RFC`_ for introduction.
+
+.. _`Binary Id RFC`: https://lists.llvm.org/pipermail/llvm-dev/2021-June/151154.html
+
+Profile Data
+^^^^^^^^^^^^^
+
+This section stores the per-function profile data control structures. The in-memory
+representation of the control structure is `__llvm_profile_data` and the fields
+are defined by the `INSTRPROFDATA` macro. Some fields are used to reference data
+from other sections in the profile. The fields are documented as follows:
+
+``NameRef``
+ The MD5 of the function's IRPGO name. An IRPGO name has the format
+ `[<filepath>;]<linkage-name>` where `<filepath>;` is provided for local-linkage
+ functions to tell apart possibly identical function names.
+
+``FuncHash``
+ A fingerprint of the function's control flow graph.
+
+``CounterPtr``
+ The in-memory address difference between profile data and its corresponding counters.
+
+``BitmapPtr``
+ The in-memory address difference between profile data and its bitmap.
+
+``FunctionPointer``
+ Records the function address when the instrumented binary runs. This is used to
+ map the profiled callee address of indirect calls to the `NameRef` during
+ conversion from raw to indexed profiles.
+
+``Values``
+ Represents value profiles in a two dimensional array. The number of elements
+ in the first dimension is the number of instrumented value sites across all
+ kinds. Each element in the first dimension is the head of a linked list, and
+ each element in the second dimension is a linked list element, carrying
+ `<profiled-value, count>` as payload. This is used by the compiler runtime when
+ writing out value profiles.
+
+``NumCounters``
+ The number of counters for the instrumented function.
+
+``NumValueSites``
+ This is an array of counters, and each counter represents the number of
+ instrumented sites for a kind of value in the function.
+
+``NumBitmapBytes``
+ The number of bitmap bytes for the function.
+
+Profile Counters
+^^^^^^^^^^^^^^^^^
+
+For IRPGO [2]_, the counters within an instrumented function are stored contiguously
+and in an order that is consistent with basic block selection in the instrumentation
+pass.
+
+.. _calculation-of-counter-offset:
+
+So how are function counters associated with a function?
+
+Basically, the profile reader iterates over the per-function control structures
+(from the profile data section) and makes use of the recorded relative
+distances, as illustrated below.
+
+::
+
+        + --> start(__llvm_prf_data) --> +---------------------+ ------------+
+        |                                |       Data 1        |             |
+        |                                +---------------------+  =====||    |
+        |                                |       Data 2        |       ||    |
+        |                                +---------------------+       ||    |
+        |                                |        ...          |       ||    |
+ Counter|                                +---------------------+       ||    |
+  Delta |                                |       Data N        |       ||    |
+        |                                +---------------------+       ||    |   CounterPtr1
+        |                                                              ||    |
+        |                                        CounterPtr2           ||    |
+        |                                                              ||    |
+        |                                                              ||    |
+        + --> start(__llvm_prf_cnts) --> +---------------------+       ||    |
+                                         |        ...          |       ||    |
+                                         +---------------------+  -----||----+
+                                         |    Counter 1        |       ||
+                                         +---------------------+       ||
+                                         |       ...           |       ||
+                                         +---------------------+  =====||
+                                         |    Counter 2        |
+                                         +---------------------+
+                                         |       ...           |
+                                         +---------------------+
+                                         |    Counter N        |
+                                         +---------------------+
+
+
+In the graph,
+
+* The profile header records `CountersDelta` with the value `start(__llvm_prf_cnts) - start(__llvm_prf_data)`.
+ We will call it `CountersDeltaInitVal` below for convenience.
+* For each profile data record, `CounterPtrN` is recorded as `start(Counter) - start(ProfileData)`.
+
+Each time the reader advances to the next data record, it decreases `CountersDelta` by the size of one `ProfileData`.
+
+For the counter corresponding to the first data record, the byte offset
+relative to the start of the counter section is calculated as `CounterPtr1 - CountersDeltaInitVal`.
+When the profile reader advances to the second data record, `CountersDelta` is now `CountersDeltaInitVal - sizeof(ProfileData)`.
+Thus the byte offset relative to the start of the counter section is calculated as `CounterPtr2 - (CountersDeltaInitVal - sizeof(ProfileData))`.
+
+Bitmap
+^^^^^^^
+This section is used for source-based MC/DC code coverage. Check out `Bitmap RFC`_
+if interested.
+
+.. _`Bitmap RFC`: https://discourse.llvm.org/t/rfc-source-based-mc-dc-code-coverage/59244
+
+Names
+^^^^^^
+
+This section contains the concatenated string of function IRPGO names. If
+compressed, the zlib compression algorithm is used.
+
+Function names serve as keys in the PGO data hash table when raw profiles are
+converted into indexed profiles. They are also crucial for `llvm-profdata` to
+show the profiles in a human-readable way.
+
+Value Profile Data
+^^^^^^^^^^^^^^^^^^^^
+
+This section contains the profile data for value profiling.
+
+The value profiles corresponding to a profile data record are serialized
+contiguously as one record, and value profile records are stored in the same
+order as the respective profile data, such that a raw profile reader advances
+the pointer to profile data and the pointer to value profile records
+simultaneously [3]_ to find the value profiles for each per-function,
+per-CFG-fingerprint profile data record.
+
+Indexed PGO Profile Format
+===========================
+
+General Storage Layout
+-----------------------
+
+::
+
+                 +-----------------------+---+
+                 |        Magic          |   |
+                 +-----------------------+   |
+                 |        Version        |   |
+                 +-----------------------+   |
+                 |        HashType       |   H
+                 +-----------------------+   E
+         +-------|       HashOffset      |   A
+         |       +-----------------------+   D
+     +-----------|     MemProfOffset     |   E
+     |   |       +-----------------------+   R
+     |   |       |     BinaryIdOffset    |   |
+     |   |       +-----------------------+   |
+ +---------------|      TemporalProf-    |   |
+ |   |   |       |      TracesOffset     |   |
+ |   |   |       +-----------------------+---+
+ |   |   |       |   Profile Summary     |   |
+ |   |   |       +-----------------------+   P
+ |   |   +------>|  Function PGO data    |   A
+ |   |           +-----------------------+   Y
+ |   +---------- | MemProf profile data  |   L
+ |               +-----------------------+   O
+ |               |      Binary Ids       |   A
+ |               +-----------------------+   D
+ +-------------->|  Temporal profiles    |   |
+                 +-----------------------+---+
+
+Header
+--------
+
+``Magic``
+ The purpose of the magic number is to be able to quickly tell if the profile
+ is an indexed profile.
+
+``Version``
+ Similar to the raw profile version, the lower 32 bits specify the version of the
+ indexed profile and the most significant 32 bits are reserved to specify the
+ variant types of the profile.
+
+``HashType``
+ The hashing scheme for on-disk hash table keys. Only MD5 hashing is used as of
+ writing.
+
+``HashOffset``
+ An on-disk hash table stores the per-function profile records.
+ Precisely speaking, `HashOffset` records the offset of this hash table's
+ metadata (i.e., the number of buckets and entries), which follows right after
+ the payload of the entire hash table.
+
+``MemProfOffset``
+ Records the byte offset of MemProf profiling data.
+
+``BinaryIdOffset``
+ Records the byte offset of binary id sections.
+
+``TemporalProfTracesOffset``
+ Records the byte offset of temporal profiles.
+
+Payload Sections
+------------------
+
+(CS) Profile Summary
+^^^^^^^^^^^^^^^^^^^^^
+This section comes right after the profile header. It stores the serialized profile
+summary. For context-sensitive IRPGO, this section stores an additional profile
+summary corresponding to the context-sensitive profiles.
+
+Function PGO data
+^^^^^^^^^^^^^^^^^^
+This section stores functions and their PGO profiling data as an on-disk hash
+table. The key of a hash table entry is the function's PGO name, and the in-memory
+representation of the value is a map. The key of this map is the CFG hash, and the
+value is the C++ struct `llvm::InstrProfRecord`. The C++ struct collects profiling
+information such as counters and value profiles.
+
+MemProf Profile data
+^^^^^^^^^^^^^^^^^^^^^^
+This section stores function's memory profiling data. See
+`MemProf binary serialization format RFC`_ for the design.
+
+.. _`MemProf binary serialization format RFC`: https://lists.llvm.org/pipermail/llvm-dev/2021-September/153007.html
+
+Binary Ids
+^^^^^^^^^^^^^^^^^^^^^^
+This section carries over binary-id information from raw profiles.
+
+Temporal Profile Traces
+^^^^^^^^^^^^^^^^^^^^^^^^
+This section carries over temporal profile information from raw profiles.
+See `Temporal profiling RFC`_ for an overview.
+
+.. _`Temporal profiling RFC`: https://discourse.llvm.org/t/rfc-temporal-profiling-extension-for-irpgo/68068
+
+Profile Data Usage
+=======================================
+
+`llvm-profdata` is the command line tool to display and process profile data.
+For supported usages, check out its `documentation <https://llvm.org/docs/CommandGuide/llvm-profdata.html>`_.
+
+
+.. [1] A raw profile file could contain multiple raw profiles. The raw profile
+ reader can parse all raw profiles from the file correctly.
+.. [2] The counter section is used by a few variant types (like coverage and
+ temporal profiling) and might have different semantics there.
+.. [3] The step size of the data pointer is `sizeof(ProfileData)`, and the step
+ size of the value profile pointer is calculated based on the number of
+ collected values.
diff --git a/llvm/docs/UserGuides.rst b/llvm/docs/UserGuides.rst
index 006df613bc5e7d..14a2e161ea54cb 100644
--- a/llvm/docs/UserGuides.rst
+++ b/llvm/docs/UserGuides.rst
@@ -58,6 +58,7 @@ intermediate LLVM representation.
NVPTXUsage
Phabricator
Passes
+ PGOProfileFormat
ReportingGuide
ResponseGuide
Remarks
@@ -177,6 +178,9 @@ Optimizations
referencing, to determine variable locations for debug info in the final
stages of compilation.
+:doc:`PGOProfileFormat`
+ This document explains two binary formats of IRPGO profiles.
+
Code Generation
---------------
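As a rough illustration of the `Version` header field described in the new document, the sketch below splits a 64-bit value into the version number (lower 32 bits) and the variant bits (most significant 32 bits). The function name and the sample value are hypothetical; the actual variant flag definitions live in the LLVM sources and are not reproduced here.

```python
def split_version(raw_version: int) -> tuple[int, int]:
    """Split a 64-bit Version field into (version, variant_bits).

    Follows the description in the document: the lower 32 bits carry the
    version number and the upper 32 bits carry the variant types.
    """
    version = raw_version & 0xFFFFFFFF
    variant_bits = (raw_version >> 32) & 0xFFFFFFFF
    return version, variant_bits


# Hypothetical value: version 9 with one made-up variant bit set.
sample = (1 << 56) | 9
version, variant_bits = split_version(sample)
print(version, hex(variant_bits))
```

A reader would typically check `version` against the versions it supports before interpreting the remaining header fields.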
llvm/docs/PGOProfileFormat.rst
@@ -0,0 +1,387 @@
=====================
IRPGO Profile Format
IRPGO --> Instrumentation PGO. Note that Frontend PGO uses the same format.
done.
llvm/docs/PGOProfileFormat.rst
Overview
==========

IR-based instrumentation (IRPGO) and its context-sensitive variant (CS-IRPGO)
Instrumentation PGO (both IR based and Frontend based).
done.
and removed IRPGO only terms like (LLVM IR, basic block counters) from the doc.
llvm/docs/PGOProfileFormat.rst
.. note::

Both the compiler-rt profiling infrastructure and profile format are general
Coverage test uses (frontend) PGO instrumentation and coverage mapping. The format for coverageMap is not included in this document. Similarly the temporal profiling is not covered here.
reworded based on my understanding that "frontend PGO instrumentation profiles have two use cases, PGO and source coverage" and the input that coverage mapping has its own format. PTAL.
llvm/docs/PGOProfileFormat.rst
The raw PGO profile is generated by running the instrumented binary. It is a
memory dump of the profile data.

Two kinds of frequently used profile information are function's basic block
The instrumented binary currently collects two kinds of profile data: ...
done.
llvm/docs/PGOProfileFormat.rst
memory dump of the profile data.

Two kinds of frequently used profile information are function's basic block
counters and its (various flavors of) value profiles. A function's profiled
The profile data for a function can span ..
changed to "The profile data for a function span across several sections in the profile", given the control structure and counters are in two sections.
llvm/docs/PGOProfileFormat.rst
General Storage Layout
-----------------------

A raw profile for an executable [1]_ consists of a profile header and several
also shared library.
done.
llvm/docs/PGOProfileFormat.rst
-----------------------

A raw profile for an executable [1]_ consists of a profile header and several
sections. The storage layout is illustrated below. Generally, when raw profile
when the raw profile ..
done.
llvm/docs/PGOProfileFormat.rst
``Magic``
With the magic number, data consumer could detect profile format and
endianness of the data, and quickly tells whether/how to continue reading.
remove 'quickly'
done.
llvm/docs/PGOProfileFormat.rst
referenced as explained by calculation-of-counter-offset_ .

``NamesDelta``
Records the in-memory address of compressed name section. Not used except for
May be uncompressed too.
Removed "compressed" as whether compressed or not is not very important for the documentation of this field.
The file name should probably be updated to InstrumentationPGOProfileFormat. Will wait and do this later as a one-off to minimize the diff.
llvm/docs/PGOProfileFormat.rst
@@ -0,0 +1,395 @@
===================================
Perhaps name this ProfileFormat.rst, since so much is shared with pure code coverage applications. It's fine that the doc currently focuses on PGO.
> Perhaps name this ProfileFormat.rst, since so much is shared with pure code coverage applications.

Ack. I wonder if we want to use InstrumentationProfileFormat.rst since SamplePGO uses a different format.
InstrProfileFormat.rst sounds good to me.
I updated the filename to InstrProfileFormat.rst in a standalone local commit.
The actual changes should be visible in the commit right before it.
llvm/docs/PGOProfileFormat.rst
::

+-----------------------+---+
Should we also add a comment in the code to update this documentation when the format changes?
Added a comment close to the header definition for both raw and indexed profiles.
It uses the link https://llvm.org/docs/InstrProfileFormat.html, assuming the file name will be InstrProfileFormat.rst.
llvm/docs/PGOProfileFormat.rst
We will call it `CounterDeltaInitVal` below for convenience.
* For each profile data record, `CounterPtrN` is recorded as `start(Counter) - start(ProfileData)`.

Each time the reader advances to the next data record, it updates `CounterDelta`
Add a link to the code (at a certain commit)?
done.
llvm/docs/PGOProfileFormat.rst
.. _`documentation`: https://llvm.org/docs/CoverageMappingFormat.html

Raw Profile Format
Highlight compatibility guarantees of Raw Profile Format.
Also mention endianness of raw profile data?
Mentioned version compatibility guarantees for the raw and indexed formats.
Also mentioned endianness where the Magic field of the raw profile header is documented, since the Magic field is used by the raw profile reader to decide whether to swap bytes.
Relatedly, created #76312 to fix one issue related to endianness.
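To make the byte-swap decision concrete, here is a minimal sketch of how a reader might infer the byte order from the first eight bytes of a raw profile. The constant is an assumption written from memory of the 64-bit raw magic and should be verified against `InstrProfData.inc`; the function name is hypothetical and this is not the reader's actual implementation.

```python
import struct

# Assumed 64-bit raw profile magic (illustrative; verify against InstrProfData.inc).
RAW_MAGIC_64 = (
    (0xFF << 56) | (ord("l") << 48) | (ord("p") << 40) | (ord("r") << 32)
    | (ord("o") << 24) | (ord("f") << 16) | (ord("r") << 8) | 0x81
)


def detect_byte_order(first8: bytes):
    """Return '<' (little-endian) or '>' (big-endian) if the magic matches, else None."""
    if len(first8) < 8:
        return None
    for order in ("<", ">"):
        (value,) = struct.unpack(order + "Q", first8[:8])
        if value == RAW_MAGIC_64:
            return order
    return None


# A profile written on a little-endian host stores the magic little-endian.
header = struct.pack("<Q", RAW_MAGIC_64)
print(detect_byte_order(header))
```

The returned format character can be reused as the `struct` byte-order prefix when decoding the remaining header fields.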
llvm/docs/PGOProfileFormat.rst
===================

The raw profile is generated by running the instrumented binary. It is a memory
dump of the profile data.
nit: s/profile counters/profile data/
I wonder if the comment means to say 's/profile data/profile counters'?
Nevertheless, I revised this to "The raw profile data from an executable or a shared library consists of a header and multiple sections, with each section as a memory dump. The profile raw data needs to be reasonably compact and fast to generate." PTAL.
llvm/docs/PGOProfileFormat.rst
identical functions.

``FuncHash``
A fingerprint of the function's control flow graph.
This includes CFG plus some more stuff (memory ops I think). Can you put in a link to the code?
slightly reworded and added a link to computeCFGHash
llvm/docs/PGOProfileFormat.rst
Bitmap
^^^^^^^
This section is used for source-based MC/DC code coverage. Check out `Bitmap RFC`_
Expand MC/DC?
done.
llvm/docs/PGOProfileFormat.rst
Overview
=========

Instrumentation PGO inserts `llvm.instrprof.*` `code generator intrinsics`_
Probably referencing https://clang.llvm.org/docs/UsersManual.html#profiling-with-instrumentation for background.
done.
llvm/docs/PGOProfileFormat.rst
``FuncHash``
A fingerprint of the function's control flow graph.

``CounterPtr``
Is it the relative distance (offset) in bytes between the function counter and the start of the counter section?
The description is correct. (My comment was based on old implementation before recent changes).
With the recent binary profile correlation effort from @ZequanWu, CounterPtr records the address of counters if I'm reading correctly.
I updated the documentation to point out fields that might have different ways of interpretation. PTAL.
`CounterPtr` is still a relative address (`__profc_foo - __profd_foo`) in default mode. Under binary profile correlation mode, it will just be the absolute address of the counter `__profc_foo`.
llvm/docs/PGOProfileFormat.rst
Outdated
|
||
Basically, the profile reader iterates per-function control structure (from the | ||
profile data section) and makes use of the recorded relative distances, as | ||
illustrated below. |
It is clearer to use an equation: CounterOffset(Func) = Data(Func).CounterPtr + Counter_Delta.
I saw some equation below.
llvm/docs/PGOProfileFormat.rst
Outdated
|
||
Function PGO data | ||
^^^^^^^^^^^^^^^^^^ | ||
This section stores functions and their PGO profiling data as an on-disk hash |
Profile data for functions with the same name are grouped together and share one hash table entry (the functions may come from different shared libraries, for instance). The profile data for them are organized as a sequence of key-value pairs, where the key is the FuncHash (CFG-based for IR PGO) and the value is the profile counters for the function.
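To make the grouping concrete, here is a minimal in-memory sketch of the lookup described above. It is only an illustration: the real indexed profile is an on-disk hash table, and `FunctionRecord`, `ProfileTable`, `addRecord`, and `lookupCounters` are hypothetical names, not LLVM APIs.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical in-memory model of one (FuncHash, counters) pair.
struct FunctionRecord {
  uint64_t FuncHash;              // CFG-based fingerprint for IR PGO
  std::vector<uint64_t> Counters; // profile counters for this function body
};

// One table entry per function name; same-named functions share the entry.
using ProfileTable =
    std::unordered_map<std::string, std::vector<FunctionRecord>>;

// Append a record to the entry for Name, creating the entry if needed.
void addRecord(ProfileTable &Table, const std::string &Name,
               FunctionRecord Rec) {
  assert(!Name.empty() && "function name must not be empty");
  Table[Name].push_back(std::move(Rec));
}

// Find the counters for (Name, FuncHash): first by name, then by hash
// within the sequence of records stored under that name.
std::optional<std::vector<uint64_t>>
lookupCounters(const ProfileTable &Table, const std::string &Name,
               uint64_t FuncHash) {
  auto It = Table.find(Name);
  if (It == Table.end())
    return std::nullopt;
  for (const FunctionRecord &Rec : It->second)
    if (Rec.FuncHash == FuncHash)
      return Rec.Counters;
  return std::nullopt;
}
```

With this model, two functions named `foo` from different shared libraries occupy a single table entry and are told apart only by their `FuncHash`.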
Added this.
* Some rewording as suggested. * Add link to code at a specific commit at a few places. * Mention it explicitly when fields might have different semantics in non-IRPGO case. * Mention version compatibility guarantees explicitly for both formats, and add more details on endianness handling for raw profiles. * Add code comment to ask for doc update if appropriate.
llvm/docs/InstrProfileFormat.rst
Outdated
Some fields are used to reference data from other sections in the profile. | ||
The fields are documented as follows: | ||
|
||
.. _`__llvm_profile_data`: https://github.com/llvm/llvm-project/blob/7c3b67d2038cfb48a80299089f6a1308eee1df7f/compiler-rt/lib/profile/InstrProfiling.h#L25 |
llvm-project/compiler-rt/include/profile/InstrProfData.inc
Lines 65 to 95 in 7c3b67d
/* INSTR_PROF_DATA start. */
/* Definition of member fields of the per-function control structure. */
#ifndef INSTR_PROF_DATA
#define INSTR_PROF_DATA(Type, LLVMType, Name, Initializer)
#else
#define INSTR_PROF_DATA_DEFINED
#endif
INSTR_PROF_DATA(const uint64_t, llvm::Type::getInt64Ty(Ctx), NameRef, \
                ConstantInt::get(llvm::Type::getInt64Ty(Ctx), \
                IndexedInstrProf::ComputeHash(getPGOFuncNameVarInitializer(Inc->getName()))))
INSTR_PROF_DATA(const uint64_t, llvm::Type::getInt64Ty(Ctx), FuncHash, \
                ConstantInt::get(llvm::Type::getInt64Ty(Ctx), \
                Inc->getHash()->getZExtValue()))
INSTR_PROF_DATA(const IntPtrT, IntPtrTy, CounterPtr, RelativeCounterPtr)
INSTR_PROF_DATA(const IntPtrT, IntPtrTy, BitmapPtr, RelativeBitmapPtr)
/* This is used to map function pointers for the indirect call targets to
 * function name hashes during the conversion from raw to merged profile
 * data.
 */
INSTR_PROF_DATA(const IntPtrT, llvm::PointerType::getUnqual(Ctx), FunctionPointer, \
                FunctionAddr)
INSTR_PROF_DATA(IntPtrT, llvm::PointerType::getUnqual(Ctx), Values, \
                ValuesPtrExpr)
INSTR_PROF_DATA(const uint32_t, llvm::Type::getInt32Ty(Ctx), NumCounters, \
                ConstantInt::get(llvm::Type::getInt32Ty(Ctx), NumCounters))
INSTR_PROF_DATA(const uint16_t, Int16ArrayTy, NumValueSites[IPVK_Last+1], \
                ConstantArray::get(Int16ArrayTy, Int16ArrayVals)) \
INSTR_PROF_DATA(const uint32_t, llvm::Type::getInt32Ty(Ctx), NumBitmapBytes, \
                ConstantInt::get(llvm::Type::getInt32Ty(Ctx), NumBitmapBytes))
#undef INSTR_PROF_DATA
/* INSTR_PROF_DATA end. */
updated the link, thanks!
llvm/docs/InstrProfileFormat.rst
Outdated
+ --> start(__llvm_prf_cnts) --> +---------------------+ || | | ||
| ... | || | | ||
+---------------------+ -----||----+ | ||
| Counter 1 | || |
For "Counter 1", I think you mean the array of counters for data 1. Maybe rename to "Counters for Data 1" for clarity.
done.
llvm/docs/InstrProfileFormat.rst
Outdated
.. _calculation-of-counter-offset: | ||
|
||
As mentioned above, the recorded counter offset is relative to the profile metadata. | ||
So how are function counters associated with the profiled function? |
Perhaps change the question to : "how are function counters located in the raw profile data"?
done.
llvm/docs/InstrProfileFormat.rst
Outdated
|
||
* The profile header records `CounterDelta` with the value as `start(__llvm_prf_cnts) - start(__llvm_prf_data)`. | ||
We will call it `CounterDeltaInitVal` below for convenience. | ||
* For each profile data record, `CounterPtrN` is recorded as `start(Counter) - start(ProfileData)`. |
Counter --> CounterN, ProfileData --> ProfileDataN.
Also describe that ProfileDataN is the N-th entry in `__llvm_prf_data`, and CounterN is the corresponding array of profile counters.
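For readers following along, the two definitions quoted above can be combined into a small worked sketch for the default (relative `CounterPtr`) mode. It uses only the stated relations, `CounterDeltaInitVal = start(__llvm_prf_cnts) - start(__llvm_prf_data)` and `CounterPtr = start(CounterN) - start(ProfileDataN)`, plus the assumption that data records are laid out contiguously with a fixed size. `counterOffsetInSection` and `DataRecordSize` are hypothetical names for illustration, not the reader's actual code.

```cpp
#include <cassert>
#include <cstdint>

// Byte offset of CounterN measured from start(__llvm_prf_cnts).
// Assumption: ProfileDataN sits N * DataRecordSize bytes after
// start(__llvm_prf_data), i.e. the records are contiguous and fixed-size.
int64_t counterOffsetInSection(int64_t CounterPtrN,
                               int64_t CounterDeltaInitVal, uint64_t N,
                               uint64_t DataRecordSize) {
  // start(__llvm_prf_cnts) - start(ProfileDataN): the initial delta shrinks
  // by one record size for every data record that precedes ProfileDataN.
  int64_t DeltaForRecordN =
      CounterDeltaInitVal - static_cast<int64_t>(N * DataRecordSize);
  // start(CounterN) - start(__llvm_prf_cnts)
  //   = (start(CounterN) - start(ProfileDataN))
  //   - (start(__llvm_prf_cnts) - start(ProfileDataN))
  int64_t Offset = CounterPtrN - DeltaForRecordN;
  assert(Offset >= 0 && "counter must not precede the counter section");
  return Offset;
}
```

As a made-up example: with `__llvm_prf_data` at 0x1000, `__llvm_prf_cnts` at 0x2000, 64-byte records, and the counters of record 2 starting 40 bytes into the counter section, `CounterPtr` is `0x2028 - 0x1080` and the function returns 40.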
done.
Thanks for all the reviews! I'll wait a little longer for review comments, and plan to submit it early on Wednesday if the pull request looks good. P.S. I'm likely late to the party, but I just realized the GitHub UI displays a rich diff (like https://github.com/llvm/llvm-project/pull/76105/files?short_path=b1805ae#diff-b1805ae3bd5b5cf0c69249b7df329ef8dd7cbfe322ae004c61f4bd507d2a87e6) with one click for review, which means the workaround is not necessary.
lgtm with some minor nits.
llvm/docs/InstrProfileFormat.rst
Outdated
|
||
* The profile header records `CounterDelta` with the value as `start(__llvm_prf_cnts) - start(__llvm_prf_data)`. | ||
We will call it `CounterDeltaInitVal` below for convenience. | ||
* For each profile data record `ProileDataN`, `CounterPtr` is recorded as `start(CounterN) - start(ProfileDataN)`, |
typo: ProfileDataN
thanks for the catch. Fixed it.
llvm/docs/InstrProfileFormat.rst
Outdated
=========================== | ||
|
||
Indexed profiles are generated from `llvm-profdata`. In the indexed profiles, | ||
function PGO data are organized as on-disk hash table such that compilers could |
nit: s/could/can
done.
llvm/docs/InstrProfileFormat.rst
Outdated
function PGO data are organized as on-disk hash table such that compilers could | ||
look up PGO data for functions in an IR module. | ||
|
||
Compilers and tools must retain backward compatibility with indexed PGO profiles. |
Spell it out? i.e. "older profiles must be readable by newer tools" or something like that...
done.
llvm/docs/InstrProfileFormat.rst
Outdated
|
||
Binary Ids | ||
^^^^^^^^^^^^^^^^^^^^^^ | ||
The section is used to carry on binary-id information from raw profiles. |
Add a link to the RFC: https://discourse.llvm.org/t/rfc-adding-binary-id-into-llvm-profiles/58465
done.
1. Use <mangled-name> in IRPGO name format given the recent fix in pull request 76994. 2. In UserGuides.html, use 'instrumentation-based profiles' (not IRPGO profiles) to keep consistent with filename.
@@ -123,6 +123,8 @@ INSTR_PROF_VALUE_NODE(PtrToNodeT, llvm::PointerType::getUnqual(Ctx), Next, \ | |||
|
|||
/* INSTR_PROF_RAW_HEADER start */ | |||
/* Definition of member fields of the raw profile header data structure. */ | |||
/* Please update https://llvm.org/docs/InstrProfileFormat.html as appropriate |
llvm/docs/InstrProfileFormat.rst
done.
llvm/docs/InstrProfileFormat.rst
Outdated
|
||
The raw profile is generated by running the instrumented binary. The raw profile | ||
data from an executable or a shared library [3]_ consists of a header and | ||
multiple sections, with each section as a memory dump. The profile raw data needs |
raw profile data
done.
llvm/docs/InstrProfileFormat.rst
Outdated
.. _`require`: https://github.com/llvm/llvm-project/blob/bffdde8b8e5d9a76a47949cd0f574f3ce656e181/llvm/lib/ProfileData/InstrProfReader.cpp#L551-L558 | ||
|
||
To feed profiles back into compilers for an optimized build (e.g., via | ||
`-fprofile-use` for IR instrumentation), a raw profile must to be converted into |
double backticks
done.
llvm/docs/InstrProfileFormat.rst
Outdated
|
||
``CountersDelta`` | ||
This field records the in-memory address difference between the `profile metadata`_ | ||
and counter section in the instrumented binary, i.e., `start(__llvm_prf_cnts) - start(__llvm_prf_data)`. |
double backticks
Consider a regex that searches for a single backtick that is not followed by a `_`. There is a high probability that double backticks should be used instead.
done.
llvm/docs/InstrProfileFormat.rst
Outdated
and counter section in the instrumented binary, i.e., `start(__llvm_prf_cnts) - start(__llvm_prf_data)`. | ||
|
||
It's used jointly with the `CounterPtr`_ field to compute the counter offset | ||
relative to `start(__llvm_prf_cnts)`. Check out calculation-of-counter-offset_ |
double backticks
done.
llvm/docs/InstrProfileFormat.rst
Outdated
for a visualized explanation. | ||
|
||
.. note:: | ||
Instrumentations might not load the `__llvm_prf_data` object file section |
double backticks
done, and slightly reworded the sentence.
llvm/docs/InstrProfileFormat.rst
Outdated
|
||
``BitmapDelta`` | ||
This field records the in-memory address difference between the `profile metadata`_ | ||
and bitmap section in the instrumented binary, i.e., `start(__llvm_prf_bits) - start(__llvm_prf_data)`. |
double backticks
done.
llvm/docs/InstrProfileFormat.rst
Outdated
|
||
.. note:: | ||
Frontend-generated profiles are used together with coverage mapping for | ||
`source based code coverage`_. The `coverage mapping format`_ is different from |
source-based
done.
llvm/docs/InstrProfileFormat.rst
Outdated
``Magic`` | ||
Magic number encodes profile format (raw, indexed or text). For the raw format, | ||
the magic number also encodes the endianness (big or little) and C pointer | ||
byte size (32 or 64) of the platform on which the profile is generated. |
"C pointer byte size (32 or 64)" can be misleading: 32/64 bytes?
Consider using "C pointer size" or just "pointer size":
"... encodes the endianness (big or little) and the pointer size (4 or 8 bytes)."
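To illustrate how a reader could recover both properties from the magic number, here is a hedged sketch. The two constants are my understanding of `INSTR_PROF_RAW_MAGIC_64` and `INSTR_PROF_RAW_MAGIC_32` in `InstrProfData.inc` and should be treated as assumptions to verify against the source; `RawFormat`, `byteSwap64`, and `classifyRawMagic` are hypothetical helper names.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Assumed layout: 0xff 'l' 'p' 'r' 'o' 'f' <variant> 0x81, where the variant
// byte is 'r' for 8-byte pointers and 'R' for 4-byte pointers.
constexpr uint64_t makeMagic(char Variant) {
  return (uint64_t)255 << 56 | (uint64_t)'l' << 48 | (uint64_t)'p' << 40 |
         (uint64_t)'r' << 32 | (uint64_t)'o' << 24 | (uint64_t)'f' << 16 |
         (uint64_t)(unsigned char)Variant << 8 | (uint64_t)129;
}
constexpr uint64_t RawMagic64 = makeMagic('r');
constexpr uint64_t RawMagic32 = makeMagic('R');

// Reverse the byte order of a 64-bit value.
inline uint64_t byteSwap64(uint64_t V) {
  uint64_t R = 0;
  for (int I = 0; I < 8; ++I) {
    R = (R << 8) | (V & 0xff);
    V >>= 8;
  }
  return R;
}

struct RawFormat {
  unsigned PointerSizeBytes; // 4 or 8
  bool NeedsByteSwap;        // producer endianness differs from the reader's
};

// MagicAsRead is the first 8 bytes of the file loaded in the reader's native
// byte order. A match only after swapping means the profile was written on a
// platform with the opposite endianness.
std::optional<RawFormat> classifyRawMagic(uint64_t MagicAsRead) {
  if (MagicAsRead == RawMagic64)
    return RawFormat{8, false};
  if (MagicAsRead == RawMagic32)
    return RawFormat{4, false};
  uint64_t Swapped = byteSwap64(MagicAsRead);
  if (Swapped == RawMagic64)
    return RawFormat{8, true};
  if (Swapped == RawMagic32)
    return RawFormat{4, true};
  return std::nullopt; // not a raw profile under these assumptions
}
```

Stating the pointer size in bytes (4 or 8), as suggested in this comment, matches what such a check actually distinguishes.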
done.
llvm/docs/InstrProfileFormat.rst
Outdated
a platform with the opposite endianness and/or the other C pointer byte size. | ||
|
||
``Version`` | ||
The lower 32 bits specifies the actual version and the most significant 32 |
specify
done (here and below for indexed profile version field)
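As a small illustration of the split described in the ``Version`` text, the sketch below separates a 64-bit field into its lower and upper 32 bits. The quoted sentence is cut off before it says what the upper bits carry, so the sketch does not interpret them; `VersionField` and `splitVersion` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

struct VersionField {
  uint32_t Low32;  // the actual version number, per the quoted text
  uint32_t High32; // most significant 32 bits, left uninterpreted here
};

// Split the raw 64-bit Version field into its two 32-bit halves.
VersionField splitVersion(uint64_t Raw) {
  return VersionField{static_cast<uint32_t>(Raw & 0xffffffffu),
                      static_cast<uint32_t>(Raw >> 32)};
}
```

For a made-up value such as `0x0100000000000009`, this yields a lower half of 9 and an upper half of `0x01000000`.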
llvm/docs/InstrProfileFormat.rst
Outdated
|
||
.. _`advances`: https://github.com/llvm/llvm-project/blob/7e15fa9161eda7497a5d6abf0d951a1d12d86550/llvm/include/llvm/ProfileData/InstrProfReader.h#L456-L457 | ||
|
||
Indexed PGO Profile Format |
Indexed Profile Format
Since this is shared with coverage, omitting "PGO" seems clearer.
done (here and in a couple of places below)
llvm/docs/InstrProfileFormat.rst
Outdated
| | | | +-----------------------+---+ | ||
| | | | | Profile Summary | | | ||
| | | | +-----------------------+ P | ||
| | +------>| Function PGO data | A |
Function Data
(omit "PGO")
done.
llvm/docs/InstrProfileFormat.rst
Outdated
summary. For context-sensitive IR-based instrumentation PGO, this section stores | ||
an additional profile summary corresponding to the context-sensitive profiles. | ||
|
||
Function PGO data |
Function Data
This is shared with coverage, so we probably want to de-emphasize "PGO".
done.
llvm/docs/UserGuides.rst
Outdated
@@ -58,6 +58,7 @@ intermediate LLVM representation. | |||
NVPTXUsage | |||
Phabricator | |||
Passes | |||
InstrProfileFormat |
This is alphabetically ordered.
fixed it.
…ofiles, with a focus on IRPGO. (llvm#76105)
Github review tool renders the rich diff (example)