[AutoDiff][DebugInfo] Properly transfer location info from debug_value into alloc_stack in pullback cloner #42245

asl · 2022-04-07T16:19:25Z

In order to transfer debug info from debug_value to alloc_stack we need to drop op_ref debug expression, as the latter instruction represents the location and not the value itself. Also, fix the debug info verifier as variable type was deduced from alloc_stack / alloc_box improperly: pointer type of instruction itself was used instead of underlying object type.

The issue is only exposed when both optimizations (SIL mem2reg at least) and debug info are enabled

Resolves SR-15849

asl · 2022-04-07T16:20:19Z

Tagging @rxwei

rxwei · 2022-04-07T17:49:01Z

Did you mean SR-15849 instead?

asl · 2022-04-07T17:50:44Z

Oh, yes, sorry!

lib/SILOptimizer/Differentiation/Common.cpp

test/AutoDiff/compiler_crashers_fixed/sr15849-invalid-debug-info.swift

adrian-prantl

LGTM, one last question inline.

asl · 2022-04-08T20:27:50Z

@rxwei @adrian-prantl Will you please trigger ci for me?

BradLarson · 2022-04-08T20:32:19Z

@swift-ci Please test.

rxwei · 2022-04-08T22:01:50Z

@swift-ci please test linux

rxwei · 2022-04-09T20:41:42Z

@swift-ci please test

rxwei · 2022-04-10T04:21:45Z

@swift-ci please test linux

asl · 2022-04-10T10:43:03Z

Weird, now we have difference between Linux and Mac. The SIL is different though:
Mac:

bb0(%0 : $*Model.TangentVector, %1 : $_AD__$s4main6adjust5model10multiplieryAA5ModelVz_SftF_bb3__PB__src_0_wrt_0_1):
  %2 = alloc_stack $Model.TangentVector, var, name "model", argno 1 // users: %55, %54, %52, %48, %47, %45, %6, %107, %93, %66
  %3 = metatype $@thin Model.TangentVector.Type   // user: %5
  // function_ref static Model.TangentVector.zero.getter
  %4 = function_ref @$s4main5ModelV13TangentVectorV4zeroAEvgZ : $@convention(method) (@thin Model.TangentVector.Type) -> Model.TangentVector // user: %5
  %5 = apply %4(%3) : $@convention(method) (@thin Model.TangentVector.Type) -> Model.TangentVector // user: %6
  store %5 to %2 : $*Model.TangentVector          // id: %6
  %7 = tuple ()
  %8 = alloc_stack $Model.TangentVector, var, name "model", argno 1 // users: %41, %12, %106, %105, %39, %37
...

Linux:

bb0(%0 : $*Model.TangentVector, %1 : @owned $_AD__$s4main6adjust5model10multiplieryAA5ModelVz_SftF_bb3__PB__src_0_wrt_0_1):
  %2 = alloc_stack $Builtin.FPIEEE32, var, (name "model", loc "/home/build-user/swift/test/AutoDiff/validation-test/inout_control_flow.swift":43:13, scope 0), argno 1, type $*Model.TangentVector, expr op_fragment:#Float._value // users: %145, %58, %65, %80, %87, %71, %93, %10, %117, %106
  %3 = alloc_stack $Builtin.FPIEEE32, var, (name "model", loc "/home/build-user/swift/test/AutoDiff/validation-test/inout_control_flow.swift":43:13, scope 0), argno 1, type $*Model.TangentVector, expr op_fragment:#Float._value // users: %144, %133, %60, %67, %82, %89, %73, %95, %12, %137
  %4 = metatype $@thin Model.TangentVector.Type   // user: %6
  // function_ref static Model.TangentVector.zero.getter
  %5 = function_ref @$s4main5ModelV13TangentVectorV4zeroAEvgZ : $@convention(method) (@thin Model.TangentVector.Type) -> Model.TangentVector // user: %6
  %6 = apply %5(%4) : $@convention(method) (@thin Model.TangentVector.Type) -> Model.TangentVector // user: %7
  (%7, %8) = destructure_struct %6 : $Model.TangentVector // users: %9, %11
  %9 = destructure_struct %7 : $Float             // user: %10
  store %9 to [trivial] %2 : $*Builtin.FPIEEE32   // id: %10
  %11 = destructure_struct %8 : $Float            // user: %12
  store %11 to [trivial] %3 : $*Builtin.FPIEEE32  // id: %12
  %13 = tuple ()
  %14 = alloc_stack $Model.TangentVector, var, name "model", argno 1 // users: %49, %46, %51, %18, %143

Note that on Linux we allocate two separate stack slots for members of $Model.TangentVector and, more important, provide the pointer type as type for variable: type $*Model.TangentVector, expr op_fragment:#Float._value which seems to be incorrect to me. @adrian-prantl please verify :)

asl · 2022-04-11T15:07:16Z

After some plumbing with cross compiling I was able to reproduce linux issue with cross-compiler. Looking

asl · 2022-04-11T16:06:09Z

Looks like the problem is exposed by SIL SROA as it uses SILDebugVariable::createFromAllocation(const AllocationInst *AI). And we're having here:

  // Coalesce the debug loc attached on AI into VarInfo
  SILType Type = AI->getType();

I guess the real question is: per SIL specification we do not need op_deref on debug_value for alloca instruction. What if the debug info is attached to alloca itself? Do we need to use pointer type or value type? I'd lean for value type as things like type $*Model.TangentVector, expr op_fragment:#Float._value do not make much sense to me.

Another question is as follows:

bb0(%0 : $*TestType.ManualTestTangent, %1 : $_AD__$s4main8TestTypeV24doDifferentiableTimeStep04timeG0ySf_tF_bb6__PB__src_0_wrt_0_1):
  %2 = alloc_stack $TestType.ManualTestTangent, var, name "self", argno 2, implicit // users: %41, %33, %8, %50, %39
  %3 = integer_literal $Builtin.Int64, 0          // user: %4
  %4 = builtin "sitofp_Int64_FPIEEE32"(%3 : $Builtin.Int64) : $Builtin.FPIEEE32 // user: %5
  %5 = struct $Float (%4 : $Builtin.FPIEEE32)     // users: %51, %6, %6, %6
  %6 = struct $TestType.TestState.TangentVector (%5 : $Float, %5 : $Float, %5 : $Float) // user: %7
  %7 = struct $TestType.ManualTestTangent (%6 : $TestType.TestState.TangentVector) // user: %8
  store %7 to %2 : $*TestType.ManualTestTangent   // id: %8
  %9 = alloc_stack $TestType.ManualTestTangent, var, name "self", argno 2, implicit // users: %20, %12, %49, %18
  %10 = struct_extract %1 : $_AD__$s4main8TestTypeV24doDifferentiableTimeStep04timeG0ySf_tF_bb6__PB__src_0_wrt_0_1, #_AD__$s4main8TestTypeV24doDifferentiableTimeStep04timeG0ySf_tF_bb6__PB__src_0_wrt_0_1.predecessor // user: %14
  %11 = load %0 : $*TestType.ManualTestTangent    // users: %12, %13, %27, %30
  store %11 to %9 : $*TestType.ManualTestTangent  // id: %12
  debug_value %11 : $TestType.ManualTestTangent, var, name "self", argno 2, implicit // id: %13

Here we're having have two locations for "self" – one in alloca w/o explicit type and another one on debug_value. Do we consider this debug info as consistent?

BradLarson · 2022-04-12T19:12:50Z

@swift-ci Please test.

rxwei · 2022-04-13T07:14:38Z

I'll defer to @adrian-prantl on the question above.

adrian-prantl · 2022-04-13T15:28:32Z

You might want to enable -sil-print-debuginfo so you can be sure that both instructions refer to the same self and one isn't an inlined self from another method. I also think the type used by the alloc_stack is hidden from sight without this option. There should be an explicit type $*TestType.ManualTestTangent there.

asl · 2022-04-13T15:34:06Z

@adrian-prantl Yes, they refer to the same self – this is a compiler-generated code (from autodiff routines). And I double verified: there is no explicit type there:

  %2 = alloc_stack $TestType.ManualTestTangent, var, name "self", argno 2, implicit, loc "sr15849-invalid-debug-info.swift":82:16, scope 309 // users: %41, %33, %8, %50, %39

adrian-prantl · 2022-04-13T15:45:07Z

Right, sorry. If the explicit type is missing, the type is the one from of the alloc_stack.

I guess the real question is: per SIL specification we do not need op_deref on debug_value for alloca instruction. What if the debug info is attached to alloca itself? Do we need to use pointer type or value type? I'd lean for value type as things like type $*Model.TangentVector, expr op_fragment:#Float._value do not make much sense to me.

When it's attached to the alloca and there's no expression, the value type from the alloca instruction is used.

Here we're having have two locations for "self" – one in alloca w/o explicit type and another one on debug_value. Do we consider this debug info as consistent?

Yes. There can be many debug_value instructions for one variable. This is particularly important for optimized code as the debug_values follow the value around as it's moved from one SSA value to another. In your example the debug_value is redundant, but benign.

asl · 2022-04-13T15:47:00Z

When it's attached to the alloca and there's no expression, the value type from the alloca instruction is used.

How we'd treat expr op_fragment:#Float._value then? type $*Model.TangentVector, expr op_fragment:#Float._value does look strange, no?

asl · 2022-04-13T15:49:20Z

Yes. There can be many debug_value instructions for one variable. This is particularly important for optimized code as the debug_values follow the value around as it's moved from one SSA value to another. In your example the debug_value is redundant, but benign.

However, here (in example) we're having clash in SIL debug info verifier. As first it sees alloca's type (pointer one) and after this – debug_value one (value type). Hence the clash and incompatibility :)

adrian-prantl · 2022-04-13T15:53:12Z

@mshockwave do you happen to remember the design decisions that went into pointer versus value types?

adrian-prantl · 2022-04-13T15:55:28Z

I think that the debug_value should really be

debug_value %11 : $*TestType.ManualTestTangent, var, name "self", argno 2, implicit, expr op_deref

asl · 2022-04-13T15:57:36Z

@adrian-prantl it cannot be – %11 is of $TestType.ManualTestTangent type and it's created by InstructionDeleter via salvageDebugInfo. Also sil-mem2reg promotes pointers to value types

adrian-prantl · 2022-04-13T20:37:00Z

Sorry, I should really read your example more carefully. %11 is the load of the alloca, not the alloca, so the value type is correct.

adrian-prantl · 2022-04-13T20:38:28Z

What I meant to say is that the debug info in
%2 = alloc_stack $TestType.ManualTestTangent, var, name "self", argno 2, implicit
and
debug_value %2 : $*TestType.ManualTestTangent, var, name "self", argno 2, implicit, expr op_deref
are equivalent.

asl · 2022-04-13T20:42:28Z

@adrian-prantl Great, thanks! What do we do with op_fragment's? Is this a valid debug info or not?

  %2 = alloc_stack $Builtin.FPIEEE32, var, (name "model", loc "/home/build-user/swift/test/AutoDiff/validation-test/inout_control_flow.swift":43:13, scope 0), argno 1, type $*Model.TangentVector, expr op_fragment:#Float._value // users: %145, %58, %65, %80, %87, %71, %93, %10, %117, %106

adrian-prantl · 2022-04-13T20:44:48Z

The fragment looks fine to me and is consistent with the tests: https://github.com/apple/swift/blob/b5ed5f81409887abfe449a55274d88394a58f317/test/DebugInfo/sroa_mem2reg.sil#L60

asl · 2022-04-13T20:53:55Z

So, op_fragment should imply op_deref then? The last (Linux) test fail was because we had pointer type + op_fragment and corresponding value type debug info for the same incoming arg. This caused debug info verifier to fail.

In my last commit I changed sroa to use value type debug info (and fixed tests)

adrian-prantl · 2022-04-14T00:14:08Z

In this example:

debug_value %[[FIELD_X]] : $Int64, var, (name "my_struct", loc "sroa.swift":8:9, scope 2), type $*MyStruct, expr op_fragment:#MyStruct.x

there is no implied deref. The debug_value describes the new location of the x field of MyStruct whose value is in FIELD_X.

asl · 2022-04-14T08:17:41Z

@adrian-prantl Fair enough and thanks for the clarification (my mental model was "set of operations to obtain a value of variable"). Then I guess the debug info verifier should be fixed as it is not clear what it should check conceptually

Consider the following cases:

Changed location of a variable / parameter

  %2 = alloc_stack $Foo, var, name "self", argno 2
...
  %11 = load %2 : $*Foo
  debug_value %11 : $Foo, var, name "self", argno 2

Same, but with explicit debug_value on alloca:

  %2 = alloc_stack $Foo
  debug_value %2 : $*Foo, var, name "self", argno 2
...
  %11 = load %2 : $*Foo
  debug_value %11 : $Foo, var, name "self", argno 2

Same, but with op_fragment and explicit type of the field:

  %2 = alloc_stack $Builtin.FPIEEE32, var, name "self", argno 2, type $*Foo, expr op_fragment:#Foo.x
...
  %11 = load %2 : $*Builtin.FPIEEE32
  debug_value %11 : $Builtin.FPIEEE32, var, name "self", argno 2

It seems that we'd need to check that the debug info for the value is consistent, therefore:

In this case we'd need to obtain the object type out of alloca and compare with it
Here we'd need to walk debug_value to understand that this is a variable declaration and decide whether obtain object type or not
Here we'd need to "unfold" op_fragment, obtain type of x field of Foo and use for checking

Does this look as the intended behaviour and consistent with the implied debug info verified design? Or maybe I'm missing something?

mshockwave · 2022-04-14T17:20:19Z

There seems to be some traffic after I was tagged, so I'll try to clarify from the earlier one:

What if the debug info is attached to alloca itself? Do we need to use pointer type or value type? I'd lean for value type as things like type $*Model.TangentVector, expr op_fragment:#Float._value do not make much sense to me.

Given the original alloc_stack instruction

%3 = alloc_stack $Builtin.FPIEEE32, var, (name "model", loc "...", scope 0), argno 1, type $*Model.TangentVector, expr op_fragment:#Float._value

The suggested mental model here: %3 has type $Builtin.FPIEEE32, this SSA value is the fragment of another aggregate-type source varialbe, "model". Source variable "model" has type $*Model.TangentVector. And %3 is a fragment located at field Float._value. I think the key is how we interpret op_fragment: "%3 is the fragment of <source variable>", rather than "taking fragment from %3".

Another highlight is that "model", as well as the following loc, is the name and location for the aggregate type source variable, rather than that for %3. That's why we put brackets around it to distinguish it from "normal" use cases like:

%8 = alloc_stack $Int32, var, name "hello", loc foo.swift

In which case the source variable name of %8 is really "hello".

I agree this is not so straightforward, but the DWARF standard coined this (model) and both LLVM and SIL are following it.

Same, but with op_fragment and explicit type of the field:
%2 = alloc_stack $Builtin.FPIEEE32, var, name "self", argno 2, type $*Foo, expr op_fragment:#Foo.x
...
%11 = load %2 : $*Builtin.FPIEEE32
debug_value %11 : $*Builtin.FPIEEE32, var, name "self", argno 2

Sorry but I think the type of %11 is wrong...it should be $Builtin.FPIEEE32 right?

Same, but with explicit debug_value on alloca:
%2 = alloc_stack $Foo
debug_value %2 : $*Foo, var, name "self", argno 2
...
%11 = load %2 : $*Foo
debug_value %11 : $Foo, var, name "self", argno 2

Slightly tangent: But is the second debug_value instruction necessary? Because I feel like inside a compiler, to verify whether the pointee of %2 is "self", in the worst case we need to do a alias analysis (of course in the snippet above it's trivial to track, but what if we're loading from some SSA value other than %2?). But the DWARF generated at the end will simply tag the corresponding stack slot with "self". And inside a debugger -- since we're in run-time -- we don't need any alias analysis to know if an arbitrary value is "self".

asl · 2022-04-14T17:28:43Z

@mshockwave Thanks for the clarifications!

Sorry but I think the type of %11 is wrong...it should be $Builtin.FPIEEE32 right?

Yes, typo, should be value type, not pointer.

Slightly tangent: But is the second debug_value instruction necessary?

Well, I'm essentially copying real cases I've seen in wild :) The debug_value in question is left after sil-mem2reg, when it removes alloca (originally we had like 2-3 allocas with same debug value attached to it), but saves debug info.

… need to drop op_deref expression

asl · 2022-04-19T15:28:38Z

@mshockwave @adrian-prantl Thanks for the clarifications! I refined the patch and relaxed the checks in SIL verifier to accommodate for the location changes like in case 1 above. Please review.

PS: It seems we have plenty of cases when op_dered is applied to non-address types (so I cannot simply resolve FIXME). I will file a separate issue for this.

adrian-prantl · 2022-04-19T16:42:38Z

LGTM

asl · 2022-04-19T16:46:29Z

@adrian-prantl Will you please ping CI for me? :)

adrian-prantl · 2022-04-19T16:47:03Z

@swift-ci test

asl · 2022-04-21T14:38:36Z

@adrian-prantl @BradLarson @rxwei Will you please merge the PR for me? Thanks!

…tack: we need to drop op_deref expression (#42245)" This reverts commit c4ebb32.

asl changed the title ~~[AutoDiff][DebugInfo] Properly transfer location info from debug_value into alloc_stack~~ [AutoDiff][DebugInfo] Properly transfer location info from debug_value into alloc_stack in pullback cloner Apr 7, 2022

rxwei requested review from rxwei and adrian-prantl April 7, 2022 17:34

adrian-prantl reviewed Apr 7, 2022

View reviewed changes

lib/SILOptimizer/Differentiation/Common.cpp Show resolved Hide resolved

test/AutoDiff/compiler_crashers_fixed/sr15849-invalid-debug-info.swift Outdated Show resolved Hide resolved

asl requested a review from adrian-prantl April 8, 2022 17:20

adrian-prantl reviewed Apr 8, 2022

View reviewed changes

test/AutoDiff/compiler_crashers_fixed/sr15849-invalid-debug-info.swift Show resolved Hide resolved

adrian-prantl approved these changes Apr 8, 2022

View reviewed changes

rxwei approved these changes Apr 8, 2022

View reviewed changes

asl force-pushed the sr15489 branch from 0dd0d74 to 1318e44 Compare April 11, 2022 21:39

Properly transfer location info from debug_value into alloc_stack: we…

76e19d3

… need to drop op_deref expression

asl force-pushed the sr15489 branch from 1318e44 to 76e19d3 Compare April 19, 2022 15:26

rxwei merged commit c4ebb32 into swiftlang:main Apr 21, 2022

compnerd added a commit that referenced this pull request Apr 22, 2022

Revert "Properly transfer location info from debug_value into alloc_s…

ffd1f83

…tack: we need to drop op_deref expression (#42245)" This reverts commit c4ebb32.

compnerd mentioned this pull request Apr 22, 2022

Revert "[AutoDiff][DebugInfo] Properly transfer location info from debug_value into alloc_stack in pullback cloner" #42561

Closed

BradLarson mentioned this pull request Apr 26, 2022

[SR-15849] [AutoDiff] Mutating functions with control flow can cause assertion failure for conflicting debug variable type #58123

Closed

BradLarson mentioned this pull request May 4, 2022

[AutoDiff] Specifically-shaped differentiable functions yield "conflicting debug info for argument" assertion failure #58660

Closed

[AutoDiff][DebugInfo] Properly transfer location info from debug_value into alloc_stack in pullback cloner #42245

[AutoDiff][DebugInfo] Properly transfer location info from debug_value into alloc_stack in pullback cloner #42245

Uh oh!

Conversation

asl commented Apr 7, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

asl commented Apr 7, 2022

Uh oh!

rxwei commented Apr 7, 2022

Uh oh!

asl commented Apr 7, 2022

Uh oh!

Uh oh!

Uh oh!

Uh oh!

adrian-prantl left a comment

Choose a reason for hiding this comment

Uh oh!

asl commented Apr 8, 2022

Uh oh!

BradLarson commented Apr 8, 2022

Uh oh!

rxwei commented Apr 8, 2022

Uh oh!

rxwei commented Apr 9, 2022

Uh oh!

rxwei commented Apr 10, 2022

Uh oh!

asl commented Apr 10, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

asl commented Apr 11, 2022

Uh oh!

asl commented Apr 11, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

BradLarson commented Apr 12, 2022

Uh oh!

rxwei commented Apr 13, 2022

Uh oh!

adrian-prantl commented Apr 13, 2022

Uh oh!

asl commented Apr 13, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

adrian-prantl commented Apr 13, 2022

Uh oh!

asl commented Apr 13, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

asl commented Apr 13, 2022

Uh oh!

adrian-prantl commented Apr 13, 2022

Uh oh!

adrian-prantl commented Apr 13, 2022

Uh oh!

asl commented Apr 13, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

adrian-prantl commented Apr 13, 2022

Uh oh!

adrian-prantl commented Apr 13, 2022

Uh oh!

asl commented Apr 13, 2022

Uh oh!

adrian-prantl commented Apr 13, 2022

Uh oh!

asl commented Apr 13, 2022

Uh oh!

adrian-prantl commented Apr 14, 2022

Uh oh!

asl commented Apr 14, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

mshockwave commented Apr 14, 2022

Uh oh!

asl commented Apr 14, 2022

Uh oh!

asl commented Apr 19, 2022

asl commented Apr 7, 2022 •

edited

Loading

asl commented Apr 10, 2022 •

edited

Loading

asl commented Apr 11, 2022 •

edited

Loading

asl commented Apr 13, 2022 •

edited

Loading

asl commented Apr 13, 2022 •

edited

Loading

asl commented Apr 13, 2022 •

edited

Loading

asl commented Apr 14, 2022 •

edited

Loading