Skip to content

[AutoDiff] Implement cross-file lookup of derivatives #58644

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
May 11, 2022
Merged

[AutoDiff] Implement cross-file lookup of derivatives #58644

merged 3 commits into from
May 11, 2022

Conversation

asl
Copy link
Contributor

@asl asl commented May 3, 2022

Look-up for functions with @derivative attributes defined in non-primary source files

Fixes #55170

Look-up for functions with @Derivative attributes defined in non-primary source files

Fixes #55170
@asl asl requested a review from dan-zheng May 3, 2022 18:48
@asl
Copy link
Contributor Author

asl commented May 3, 2022

@swift-ci Please test

@asl asl requested review from rxwei and removed request for dan-zheng May 3, 2022 18:49
@asl
Copy link
Contributor Author

asl commented May 3, 2022

Tagging @BradLarson

@asl asl changed the title Implement cross-file lookup of derivatives [AutoDiff] Implement cross-file lookup of derivatives May 3, 2022
@asl
Copy link
Contributor Author

asl commented May 4, 2022

Ok, look like the patch exposed the lookup cycle we're having here:

  1. When we typecheck @derivative(of:) attribute we need to find the original function decl
  2. This might trigger various requirement checks if the original function decl is generic
  3. One of these checks is differentiability check that also checks what are the derivative function configurations out there
  4. However, typechecking of @derivative(of:) also registers derivative function configuration

This was the cycle before, however it only caused correctness issue (e.g. we might miss the derivative configuration and produce an error). Now the cycle is explicitly exposed as we try to look for derivative configurations in no-primary files as well.

@rxwei Any preferences how to break the cycle? I'm thinking of removing the lookup logic from getDerivativeFunctionConfigurations and instead add "retry" logic into findMinimalDerivativeConfiguration. Alternatively we can to hack around and add a dedicated flag to getDerivativeFunctionConfigurations not to perform cross-module lookup during sema stage.

@@ -8333,6 +8334,33 @@ AbstractFunctionDecl::getDerivativeFunctionConfigurations() {
ctx.loadDerivativeFunctionConfigurations(this, previousGeneration,
*DerivativeFunctionConfigs);
}

class DerivativeFinder : public ASTWalker {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Drive-by question: does @derivative attribute finding have an impact on incremental compilation?


When a file is updated, it and all files that depend on it need to be recompiled. Incremental compilation is fast when there are few dependencies between files.

I wonder if @derivative finding can cause incremental compilation to become asymptotically worse. Thinking through an example:

  1. a.swift defines func foo. b.swift exists in the same module, defining a func vjpFoo function with no attributes.
  2. Many files in the same module ask for the derivative of foo, e.g. by calling gradient(of: foo).

Initially, foo has no derivative functions registered via @derivative, so gradient(of: foo) triggers the compiler to automatically generate derivative functions.

  1. func vjpFoo inb.swift is now marked with @derivative(of: foo).

Since all files (a.swift, b.swift, gradient(of: foo) files) are in the same module, the addition of @derivative(of: foo) in b.swift means that foo now has a registered derivative, and all functions that call gradient(of: foo) need to be recompiled†.

Is the † recompilation behavior asymptotically worse than existing incremental compilation? It might not be, if the same amount of work (e.g. recompiling the entire module) was necessary previous to the implementation of correct @derivative attribute finding.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(It may be that the impact of @derivative attribute finding on incremental compilation was introduced prior to this PR.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dan-zheng The complete recompilation is certainly necessary when optimizations are enabled. As derivatives might be inlined and subsequently optimized. So, effectively we'd need to throw out all the code and start afresh.

I do not like the existing getDerivativeFunctionConfigurations implementation as it has lots of side effects (even before this PR). Though lots of things are computed lazily here and there... Maybe cross-module lookup should be as a last-resort thing as it might be quite expensive on large modules – maybe we'd just iterate once and explicitly register all explicit derivatives.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The complete recompilation is certainly necessary when optimizations are enabled.

When optimizations are enabled you're building in whole-module mode anyways. Swift will refuse to perform the incremental build if you ask for -wmo as well.

An even simpler failure mode occurs here since this analysis is not going through name lookup. From the incremental build's perspective, there is no link between the derivative and the original.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When optimizations are enabled you're building in whole-module mode anyways.

I'm confused. Will you please elaborate? The original testcase from #55170 fails regardless whether -O is provided to swiftc.

@asl
Copy link
Contributor Author

asl commented May 4, 2022

@swift-ci please test

@rxwei
Copy link
Contributor

rxwei commented May 4, 2022

Sorry I don't have the bandwidth in the near term to review this. Maybe @dan-zheng can help?

};

// Load derivative configurations from @derivative attributes defined in
// non-primary sources. Note that it might trigger lookup cycles if called
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This analysis needs to go through name lookup. If that's triggering cycles then we need to figure out why getDerivativeFunctionConfigurations is on the lookup path.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It cannot by design. See the full description in #58644 (comment) There is no way you could discover custom derivatives via name lookup as name is not known and the original function does not know anything about custom derivatives.

Copy link
Contributor

@CodaFi CodaFi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please refactor this code to go through name lookup. You'll get access control, shadowing, and incremental build metadata done for free as a result. If there are cycles that is evidence that autodiff has a layering problem. We should perform minimal validation on the lookup path.

@dan-zheng
Copy link
Contributor

Please refactor this code to go through name lookup. You'll get access control, shadowing, and incremental build metadata done for free as a result. If there are cycles that is evidence that autodiff has a layering problem. We should perform minimal validation on the lookup path.

This current approach for stateful derivative function configuration serialization (#30672) was adapted from prior work for serializing and loading extensions and Obj-C methods.

From current main:

If there's a better approach for handling derivative function configuration lookup that avoids this statefulness, that would be great.


One design choice is that derivative registration is done ① "retroactively" from @derivative functions to the original function, not ② from the original function to @derivative functions – so I'm not sure how to straightforwardly implement "get all derivative function configurations of foo" using name lookup, since foo does not know which @derivative functions have been registered for it (potentially across modules) or what those derivative functions' names are.

// Previous approach ②: derivative registration is done from original function to derivative functions.
// This does not enable retroactive derivative registration, e.g. registering derivatives for tgmath functions.
@differentiable(vjp: vjpFoo)
func foo(_ x: Float) -> Float { x }
func vjpFoo(_ x: Float) -> (value: Float, pullback: (Float) -> Float) { (x, { $0 }) }
// Current approach ①: derivative registration is done from derivative functions to original function.
//
// This supports retroactive registration: derivatives can be registered for functions in other modules.
// For example, `vjpFoo` could be defined in module B, importing module A where `foo` is defined.
//
// However, determining "registered derivatives" for a function is less straightforward, hence the
// `ASTContext::loadDerivativeFunctionConfigurations` complications.
func foo(_ x: Float) -> Float { x }

@derivative(of: foo)
func vjpFoo(_ x: Float) -> (value: Float, pullback: (Float) -> Float) { (x, { $0 }) }

Copy link
Contributor

@dan-zheng dan-zheng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR as an incremental fix/improvement looks good!

If tests pass, merging this sounds good to me unless others have concerns.

@asl
Copy link
Contributor Author

asl commented May 8, 2022

@swift-ci please test

@asl asl merged commit a1e138b into main May 11, 2022
@asl asl deleted the 55170-fix branch May 11, 2022 15:18
drexin pushed a commit to drexin/swift that referenced this pull request Jun 3, 2022
Look-up for functions with @Derivative attributes defined in non-primary source files

Fixes swiftlang#55170
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
5 participants