Skip to content

[AutoDiff] Refine debug info emitted for adjoint buffers #62779

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Jan 6, 2023

Conversation

asl
Copy link
Contributor

@asl asl commented Dec 26, 2022

Single input variable might yield multiple adjoint buffers if control flow is involved. Therefore we cannot simply transfer debug info from the input variable: it will be invalid as we will end with multiple locations for a single "source" variable, and, even worse, might end with conflicting debug info as different buffers might be optimized differently.

We do:

  • Drop input argument number. This must be unique and we're not
  • Correct variable name

Fixes #62608

@asl asl requested review from rxwei and dan-zheng December 26, 2022 10:40
@asl
Copy link
Contributor Author

asl commented Dec 26, 2022

Tagging @BradLarson

@asl
Copy link
Contributor Author

asl commented Dec 26, 2022

@swift-ci please test

Comment on lines 569 to 570
llvm::raw_svector_ostream(adjName) << "adjoint of '" << dv.Name
<< "' in bb" << origBB->getDebugID();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the variable name shows up in the debugger, it might be better to use the user-facing language to describe these variables: "derivative of" instead of "adjoint of". I'm not sure the " in bb..." part is useful as the user won't make sense of basic blocks.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The in bbN part is actually what is making this unique. We need to have separate variable names and there is a dedicated variable per basic block.

I'm not sure if 'derivative' is a correct term here. After all, it's adjoint :) Things are produced only in compiler-generated code (pullback), so we're already exposing the special terms.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re. "in bbN"

I'm interested in finding a user-digestible name here. Otherwise it seems to be nothing but confusion, as we can't expect users know what a "bb0" or "basic block 0" is.

For adjoint values coming from the current basic block, can we drop "in bbN" completely? This information is not useful to debug derivatives in the same scope.

For other derivatives, can we instead use source location information, e.g. "in the scope at 123:23", where "123:23" is the source location of the first instruction in its parent basic block.

Re. "adjoint of"

Things are produced only in compiler-generated code (pullback), so we're already exposing the special terms.

I disagree. "pullback" is an intentionally user-facing term and is used extensively in documentation and API, e.g. pullback(at:of:), valueWithPullback(at:of:). Thus we are not already exposing compiler-internal terms.

I'm not sure if 'derivative' is a correct term here. After all, it's adjoint :)

I don't intend to debate about "correctness" here as the design already has already made a bunch of tradeoffs (e.g. "tangent" instead of the more correct "cotangent"). However, "adjoint" refers to the adjoint (transpose) of the differential linear map, which is really the pullback function itself, not a value. Thus "adjoint" wouldn't be the correct term to use on a vector in the pullback.

Additionally, "adjoint", exactly like "transpose", is relative. The adjoint of the adjoint of a linear map is the linear map itself. Using "adjoint" without mentioning what it's applied to adds confusion. An adjoint in the context of AD is the adjoint of the differential, which we already have a user-facing term for, which is "pullback".

As such, calling those values "v in the pullback" would be much much more correct and significantly less confusing than "adjoint of v". However it can be argued that a nominal phrase would be better than the prepositional phrase "in the pullback", to name a variable in the debugger.

More on "adjoint": We chose to use "adjoint" in the implementation because most AD implementations did so, but in hindsight we really should've called it "tangent" or "cotangent" to align with the user model, i.e. the tangent vector at a certain point in the program. JVPCloner, the forward-mode implementation, already does so by referring to all values in the differential as "tangent".

Unlike "adjoint" which really refers to a function instead of a value, the word "derivative" on the other hand already has the duality of referring to both derivative functions and the results thereof. Also because it matches what's taught in introductory calculus, the word is the most approachable. The proposal explicitly refers to values in the pullback as derivative values.

[...] a derivative value is a vector in the tangent bundle of this manifold and has type TangentVector.

That's why I'm suggesting "derivative of". "Partial derivative of" or "tangent vector" would be fine too. But shorter is better here for a debug variable.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, please don't surface the notion of basic blocks to users. Statements do not map one-to-one between basic blocks and vice versa.

@@ -0,0 +1,79 @@
// RUN: %target-swift-frontend -emit-sil -O -g %s | %FileCheck %s

// REQUIRES: swift_in_compiler
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does this require swift_in_compiler?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code is optimized on Windows differently due to lack of optimization passes written in Swift....

@asl
Copy link
Contributor Author

asl commented Dec 27, 2022

@swift-ci please test

@asl
Copy link
Contributor Author

asl commented Jan 3, 2023

@rxwei Thanks for the comments!

For other derivatives, can we instead use source location information, e.g. "in the scope at 123:23", where "123:23" is the source location of the first instruction in its parent basic block.

So, the problem here is that we need to have unique locations. Are we sure that scopes for different basic blocks will be different even after optimizations?

@rxwei
Copy link
Contributor

rxwei commented Jan 3, 2023

@rxwei Thanks for the comments!

For other derivatives, can we instead use source location information, e.g. "in the scope at 123:23", where "123:23" is the source location of the first instruction in its parent basic block.

So, the problem here is that we need to have unique locations. Are we sure that scopes for different basic blocks will be different even after optimizations?

We could use something vague but non-confusing, e.g. derivative of v in scope at 123:23 (scope #1) where #1 is the BB number.

@rxwei
Copy link
Contributor

rxwei commented Jan 3, 2023

@adrian-prantl for input on this approach

asl added 3 commits January 5, 2023 12:48
Single input variable might yield multiple adjoint buffers if
control flow is involved. Therefore we cannot simply transfer debug info
from the input variable: it will be invalid as we will end with multiple
locations for a single "source" variable, and, even worse, might end
with conflicting debug info as different buffers might be optimized
differently.

We do:
  - Drop input argument number. This must be unique and we're not
  - Correct variable name
@asl asl force-pushed the 58660-adj-debug-info branch from 063baae to f6657a1 Compare January 5, 2023 20:57
@asl
Copy link
Contributor Author

asl commented Jan 5, 2023

@swift-ci please test

@asl asl requested a review from rxwei January 5, 2023 20:58
Copy link
Contributor

@rxwei rxwei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for iterating on it!

I do wonder what this will look like when the body is expanded from a macro...

@asl
Copy link
Contributor Author

asl commented Jan 6, 2023

@swift-ci please test

1 similar comment
@asl
Copy link
Contributor Author

asl commented Jan 6, 2023

@swift-ci please test

@asl asl merged commit 6d657c9 into main Jan 6, 2023
@asl asl deleted the 58660-adj-debug-info branch January 6, 2023 22:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[AutoDiff] Conflicting debug info for argument with custom valueWithPullback
3 participants