[AutoDiff] Refine debug info emitted for adjoint buffers #62779

asl · 2022-12-26T10:40:33Z

Single input variable might yield multiple adjoint buffers if control flow is involved. Therefore we cannot simply transfer debug info from the input variable: it will be invalid as we will end with multiple locations for a single "source" variable, and, even worse, might end with conflicting debug info as different buffers might be optimized differently.

We do:

Drop input argument number. This must be unique and we're not
Correct variable name

Fixes #62608

asl · 2022-12-26T10:40:47Z

Tagging @BradLarson

asl · 2022-12-26T10:41:05Z

@swift-ci please test

rxwei · 2022-12-27T05:57:45Z

lib/SILOptimizer/Differentiation/PullbackCloner.cpp

+            llvm::raw_svector_ostream(adjName) << "adjoint of '" << dv.Name
+                                               << "' in bb" << origBB->getDebugID();


Since the variable name shows up in the debugger, it might be better to use the user-facing language to describe these variables: "derivative of" instead of "adjoint of". I'm not sure the " in bb..." part is useful as the user won't make sense of basic blocks.

The in bbN part is actually what is making this unique. We need to have separate variable names and there is a dedicated variable per basic block.

I'm not sure if 'derivative' is a correct term here. After all, it's adjoint :) Things are produced only in compiler-generated code (pullback), so we're already exposing the special terms.

Re. "in bbN"

I'm interested in finding a user-digestible name here. Otherwise it seems to be nothing but confusion, as we can't expect users know what a "bb0" or "basic block 0" is.

For adjoint values coming from the current basic block, can we drop "in bbN" completely? This information is not useful to debug derivatives in the same scope.

For other derivatives, can we instead use source location information, e.g. "in the scope at 123:23", where "123:23" is the source location of the first instruction in its parent basic block.

Re. "adjoint of"

Things are produced only in compiler-generated code (pullback), so we're already exposing the special terms.

I disagree. "pullback" is an intentionally user-facing term and is used extensively in documentation and API, e.g. pullback(at:of:), valueWithPullback(at:of:). Thus we are not already exposing compiler-internal terms.

I'm not sure if 'derivative' is a correct term here. After all, it's adjoint :)

I don't intend to debate about "correctness" here as the design already has already made a bunch of tradeoffs (e.g. "tangent" instead of the more correct "cotangent"). However, "adjoint" refers to the adjoint (transpose) of the differential linear map, which is really the pullback function itself, not a value. Thus "adjoint" wouldn't be the correct term to use on a vector in the pullback.

Additionally, "adjoint", exactly like "transpose", is relative. The adjoint of the adjoint of a linear map is the linear map itself. Using "adjoint" without mentioning what it's applied to adds confusion. An adjoint in the context of AD is the adjoint of the differential, which we already have a user-facing term for, which is "pullback".

As such, calling those values "v in the pullback" would be much much more correct and significantly less confusing than "adjoint of v". However it can be argued that a nominal phrase would be better than the prepositional phrase "in the pullback", to name a variable in the debugger.

More on "adjoint": We chose to use "adjoint" in the implementation because most AD implementations did so, but in hindsight we really should've called it "tangent" or "cotangent" to align with the user model, i.e. the tangent vector at a certain point in the program. JVPCloner, the forward-mode implementation, already does so by referring to all values in the differential as "tangent".

Unlike "adjoint" which really refers to a function instead of a value, the word "derivative" on the other hand already has the duality of referring to both derivative functions and the results thereof. Also because it matches what's taught in introductory calculus, the word is the most approachable. The proposal explicitly refers to values in the pullback as derivative values.

[...] a derivative value is a vector in the tangent bundle of this manifold and has type TangentVector.

That's why I'm suggesting "derivative of". "Partial derivative of" or "tangent vector" would be fine too. But shorter is better here for a debug variable.

+1, please don't surface the notion of basic blocks to users. Statements do not map one-to-one between basic blocks and vice versa.

rxwei · 2022-12-27T05:58:34Z

test/AutoDiff/compiler_crashers_fixed/issue-62608-conflicting-debug-info.swift

@@ -0,0 +1,79 @@
+// RUN: %target-swift-frontend -emit-sil -O -g %s | %FileCheck %s
+
+// REQUIRES: swift_in_compiler


Why does this require swift_in_compiler?

The code is optimized on Windows differently due to lack of optimization passes written in Swift....

asl · 2022-12-27T07:00:29Z

@swift-ci please test

asl · 2023-01-03T22:12:55Z

@rxwei Thanks for the comments!

For other derivatives, can we instead use source location information, e.g. "in the scope at 123:23", where "123:23" is the source location of the first instruction in its parent basic block.

So, the problem here is that we need to have unique locations. Are we sure that scopes for different basic blocks will be different even after optimizations?

rxwei · 2023-01-03T22:37:05Z

@rxwei Thanks for the comments!

For other derivatives, can we instead use source location information, e.g. "in the scope at 123:23", where "123:23" is the source location of the first instruction in its parent basic block.

So, the problem here is that we need to have unique locations. Are we sure that scopes for different basic blocks will be different even after optimizations?

We could use something vague but non-confusing, e.g. derivative of v in scope at 123:23 (scope #1) where #1 is the BB number.

rxwei · 2023-01-03T23:50:02Z

@adrian-prantl for input on this approach

Single input variable might yield multiple adjoint buffers if control flow is involved. Therefore we cannot simply transfer debug info from the input variable: it will be invalid as we will end with multiple locations for a single "source" variable, and, even worse, might end with conflicting debug info as different buffers might be optimized differently. We do: - Drop input argument number. This must be unique and we're not - Correct variable name

asl · 2023-01-05T20:58:14Z

@swift-ci please test

rxwei

Thanks for iterating on it!

I do wonder what this will look like when the body is expanded from a macro...

…lts after optimizations

asl · 2023-01-06T11:50:56Z

@swift-ci please test

asl · 2023-01-06T15:24:27Z

@swift-ci please test

asl requested review from rxwei and dan-zheng December 26, 2022 10:40

rxwei reviewed Dec 27, 2022

View reviewed changes

asl added 3 commits January 5, 2023 12:48

Fix windows deficiencies in optimizations

ce4acbd

Use human-readable scopes for adjoint buffer names

f6657a1

asl force-pushed the 58660-adj-debug-info branch from 063baae to f6657a1 Compare January 5, 2023 20:57

asl requested a review from rxwei January 5, 2023 20:58

rxwei approved these changes Jan 5, 2023

View reviewed changes

Another attempt to pacify windows bots as they produce different resu…

37bf335

…lts after optimizations

asl merged commit 6d657c9 into main Jan 6, 2023

asl deleted the 58660-adj-debug-info branch January 6, 2023 22:34

		llvm::raw_svector_ostream(adjName) << "adjoint of '" << dv.Name
		<< "' in bb" << origBB->getDebugID();

		@@ -0,0 +1,79 @@
		// RUN: %target-swift-frontend -emit-sil -O -g %s \| %FileCheck %s

		// REQUIRES: swift_in_compiler

[AutoDiff] Refine debug info emitted for adjoint buffers #62779

[AutoDiff] Refine debug info emitted for adjoint buffers #62779

Uh oh!

Conversation

asl commented Dec 26, 2022

Uh oh!

asl commented Dec 26, 2022

Uh oh!

asl commented Dec 26, 2022

Uh oh!

rxwei Dec 27, 2022

Choose a reason for hiding this comment

Uh oh!

asl Dec 27, 2022

Choose a reason for hiding this comment

Uh oh!

rxwei Dec 30, 2022

Choose a reason for hiding this comment

Re. "in bbN"

Re. "adjoint of"

Uh oh!

CodaFi Jan 4, 2023

Choose a reason for hiding this comment

Uh oh!

rxwei Dec 27, 2022

Choose a reason for hiding this comment

Uh oh!

asl Dec 27, 2022

Choose a reason for hiding this comment

Uh oh!

asl commented Dec 27, 2022

Uh oh!

asl commented Jan 3, 2023

Uh oh!

rxwei commented Jan 3, 2023

Uh oh!

rxwei commented Jan 3, 2023

Uh oh!

asl commented Jan 5, 2023

Uh oh!

rxwei left a comment

Choose a reason for hiding this comment

Uh oh!

asl commented Jan 6, 2023

Uh oh!

asl commented Jan 6, 2023

Uh oh!

Uh oh!