-
Notifications
You must be signed in to change notification settings - Fork 10.5k
[AutoDiff] Implement cross-file lookup of derivatives #58644
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Look-up for functions with @Derivative attributes defined in non-primary source files Fixes #55170
@swift-ci Please test |
Tagging @BradLarson |
Ok, look like the patch exposed the lookup cycle we're having here:
This was the cycle before, however it only caused correctness issue (e.g. we might miss the derivative configuration and produce an error). Now the cycle is explicitly exposed as we try to look for derivative configurations in no-primary files as well. @rxwei Any preferences how to break the cycle? I'm thinking of removing the lookup logic from |
@@ -8333,6 +8334,33 @@ AbstractFunctionDecl::getDerivativeFunctionConfigurations() { | |||
ctx.loadDerivativeFunctionConfigurations(this, previousGeneration, | |||
*DerivativeFunctionConfigs); | |||
} | |||
|
|||
class DerivativeFinder : public ASTWalker { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Drive-by question: does @derivative
attribute finding have an impact on incremental compilation?
When a file is updated, it and all files that depend on it need to be recompiled. Incremental compilation is fast when there are few dependencies between files.
I wonder if @derivative
finding can cause incremental compilation to become asymptotically worse. Thinking through an example:
a.swift
definesfunc foo
.b.swift
exists in the same module, defining afunc vjpFoo
function with no attributes.- Many files in the same module ask for the derivative of
foo
, e.g. by callinggradient(of: foo)
.
Initially, foo
has no derivative functions registered via @derivative
, so gradient(of: foo)
triggers the compiler to automatically generate derivative functions.
func vjpFoo
inb.swift
is now marked with@derivative(of: foo)
.
Since all files (a.swift
, b.swift
, gradient(of: foo)
files) are in the same module, the addition of @derivative(of: foo)
in b.swift
means that foo
now has a registered derivative, and all functions that call gradient(of: foo)
need to be recompiled†.
Is the † recompilation behavior asymptotically worse than existing incremental compilation? It might not be, if the same amount of work (e.g. recompiling the entire module) was necessary previous to the implementation of correct @derivative
attribute finding.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(It may be that the impact of @derivative
attribute finding on incremental compilation was introduced prior to this PR.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dan-zheng The complete recompilation is certainly necessary when optimizations are enabled. As derivatives might be inlined and subsequently optimized. So, effectively we'd need to throw out all the code and start afresh.
I do not like the existing getDerivativeFunctionConfigurations
implementation as it has lots of side effects (even before this PR). Though lots of things are computed lazily here and there... Maybe cross-module lookup should be as a last-resort thing as it might be quite expensive on large modules – maybe we'd just iterate once and explicitly register all explicit derivatives.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The complete recompilation is certainly necessary when optimizations are enabled.
When optimizations are enabled you're building in whole-module mode anyways. Swift will refuse to perform the incremental build if you ask for -wmo as well.
An even simpler failure mode occurs here since this analysis is not going through name lookup. From the incremental build's perspective, there is no link between the derivative and the original.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When optimizations are enabled you're building in whole-module mode anyways.
I'm confused. Will you please elaborate? The original testcase from #55170 fails regardless whether -O
is provided to swiftc
.
@swift-ci please test |
Sorry I don't have the bandwidth in the near term to review this. Maybe @dan-zheng can help? |
}; | ||
|
||
// Load derivative configurations from @derivative attributes defined in | ||
// non-primary sources. Note that it might trigger lookup cycles if called |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This analysis needs to go through name lookup. If that's triggering cycles then we need to figure out why getDerivativeFunctionConfigurations
is on the lookup path.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It cannot by design. See the full description in #58644 (comment) There is no way you could discover custom derivatives via name lookup as name is not known and the original function does not know anything about custom derivatives.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please refactor this code to go through name lookup. You'll get access control, shadowing, and incremental build metadata done for free as a result. If there are cycles that is evidence that autodiff has a layering problem. We should perform minimal validation on the lookup path.
This current approach for stateful derivative function configuration serialization (#30672) was adapted from prior work for serializing and loading extensions and Obj-C methods. From current
If there's a better approach for handling derivative function configuration lookup that avoids this statefulness, that would be great. One design choice is that derivative registration is done ① "retroactively" from // Previous approach ②: derivative registration is done from original function to derivative functions.
// This does not enable retroactive derivative registration, e.g. registering derivatives for tgmath functions.
@differentiable(vjp: vjpFoo)
func foo(_ x: Float) -> Float { x }
func vjpFoo(_ x: Float) -> (value: Float, pullback: (Float) -> Float) { (x, { $0 }) } // Current approach ①: derivative registration is done from derivative functions to original function.
//
// This supports retroactive registration: derivatives can be registered for functions in other modules.
// For example, `vjpFoo` could be defined in module B, importing module A where `foo` is defined.
//
// However, determining "registered derivatives" for a function is less straightforward, hence the
// `ASTContext::loadDerivativeFunctionConfigurations` complications.
func foo(_ x: Float) -> Float { x }
@derivative(of: foo)
func vjpFoo(_ x: Float) -> (value: Float, pullback: (Float) -> Float) { (x, { $0 }) } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR as an incremental fix/improvement looks good!
If tests pass, merging this sounds good to me unless others have concerns.
@swift-ci please test |
Look-up for functions with @Derivative attributes defined in non-primary source files Fixes swiftlang#55170
Look-up for functions with
@derivative
attributes defined in non-primary source filesFixes #55170