-
Notifications
You must be signed in to change notification settings - Fork 10.5k
[AutoDiff] Refine debug info emitted for adjoint buffers #62779
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Tagging @BradLarson |
@swift-ci please test |
llvm::raw_svector_ostream(adjName) << "adjoint of '" << dv.Name | ||
<< "' in bb" << origBB->getDebugID(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the variable name shows up in the debugger, it might be better to use the user-facing language to describe these variables: "derivative of" instead of "adjoint of". I'm not sure the " in bb..." part is useful as the user won't make sense of basic blocks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The in bbN
part is actually what is making this unique. We need to have separate variable names and there is a dedicated variable per basic block.
I'm not sure if 'derivative' is a correct term here. After all, it's adjoint :) Things are produced only in compiler-generated code (pullback), so we're already exposing the special terms.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Re. "in bbN"
I'm interested in finding a user-digestible name here. Otherwise it seems to be nothing but confusion, as we can't expect users know what a "bb0" or "basic block 0" is.
For adjoint values coming from the current basic block, can we drop "in bbN" completely? This information is not useful to debug derivatives in the same scope.
For other derivatives, can we instead use source location information, e.g. "in the scope at 123:23", where "123:23" is the source location of the first instruction in its parent basic block.
Re. "adjoint of"
Things are produced only in compiler-generated code (pullback), so we're already exposing the special terms.
I disagree. "pullback" is an intentionally user-facing term and is used extensively in documentation and API, e.g. pullback(at:of:)
, valueWithPullback(at:of:)
. Thus we are not already exposing compiler-internal terms.
I'm not sure if 'derivative' is a correct term here. After all, it's adjoint :)
I don't intend to debate about "correctness" here as the design already has already made a bunch of tradeoffs (e.g. "tangent" instead of the more correct "cotangent"). However, "adjoint" refers to the adjoint (transpose) of the differential linear map, which is really the pullback function itself, not a value. Thus "adjoint" wouldn't be the correct term to use on a vector in the pullback.
Additionally, "adjoint", exactly like "transpose", is relative. The adjoint of the adjoint of a linear map is the linear map itself. Using "adjoint" without mentioning what it's applied to adds confusion. An adjoint in the context of AD is the adjoint of the differential, which we already have a user-facing term for, which is "pullback".
As such, calling those values "v in the pullback" would be much much more correct and significantly less confusing than "adjoint of v". However it can be argued that a nominal phrase would be better than the prepositional phrase "in the pullback", to name a variable in the debugger.
More on "adjoint": We chose to use "adjoint" in the implementation because most AD implementations did so, but in hindsight we really should've called it "tangent" or "cotangent" to align with the user model, i.e. the tangent vector at a certain point in the program. JVPCloner
, the forward-mode implementation, already does so by referring to all values in the differential as "tangent".
Unlike "adjoint" which really refers to a function instead of a value, the word "derivative" on the other hand already has the duality of referring to both derivative functions and the results thereof. Also because it matches what's taught in introductory calculus, the word is the most approachable. The proposal explicitly refers to values in the pullback as derivative values.
[...] a derivative value is a vector in the tangent bundle of this manifold and has type
TangentVector
.
That's why I'm suggesting "derivative of". "Partial derivative of" or "tangent vector" would be fine too. But shorter is better here for a debug variable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1, please don't surface the notion of basic blocks to users. Statements do not map one-to-one between basic blocks and vice versa.
@@ -0,0 +1,79 @@ | |||
// RUN: %target-swift-frontend -emit-sil -O -g %s | %FileCheck %s | |||
|
|||
// REQUIRES: swift_in_compiler |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does this require swift_in_compiler
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code is optimized on Windows differently due to lack of optimization passes written in Swift....
@swift-ci please test |
@rxwei Thanks for the comments!
So, the problem here is that we need to have unique locations. Are we sure that scopes for different basic blocks will be different even after optimizations? |
We could use something vague but non-confusing, e.g. |
@adrian-prantl for input on this approach |
Single input variable might yield multiple adjoint buffers if control flow is involved. Therefore we cannot simply transfer debug info from the input variable: it will be invalid as we will end with multiple locations for a single "source" variable, and, even worse, might end with conflicting debug info as different buffers might be optimized differently. We do: - Drop input argument number. This must be unique and we're not - Correct variable name
063baae
to
f6657a1
Compare
@swift-ci please test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for iterating on it!
I do wonder what this will look like when the body is expanded from a macro...
…lts after optimizations
@swift-ci please test |
1 similar comment
@swift-ci please test |
Single input variable might yield multiple adjoint buffers if control flow is involved. Therefore we cannot simply transfer debug info from the input variable: it will be invalid as we will end with multiple locations for a single "source" variable, and, even worse, might end with conflicting debug info as different buffers might be optimized differently.
We do:
Fixes #62608