Don't Unnecessarily Invalidate Unreferenced Generic Parameters #41128

Merged
merged 5 commits into from
Feb 3, 2022

Conversation

philipturner
Contributor

@philipturner philipturner commented Feb 1, 2022

Currently, a Decl is set to invalid at the end of TypeChecker::checkReferencedGenericParams. That should not happen, and it finally manifested as a crash discovered while testing AutoDiff:

import _Differentiation

struct Q { }

// Should issue a diagnostic, but crashes instead
@derivative(of: remainder(_:_:))
func _vjpRemainder<T: FloatingPoint>(_ x: Q, _ y: Q) -> (
  value: Q, pullback: (Q) -> (Q, Q)
) {
  fatalError()
}

As a result of this fix, error diagnostics for incorrect generic functions slightly change:

class SomeClassWithInvalidMethod {
  // NO LONGER EXISTS: expected-error {{generic parameter 'T' is not used in function signature}}
  // NOW EXISTS: expected-note {{in call to function 'method()'}}
  func method<T>() { 
    // NOW EXISTS: expected-error@-1 {{generic parameter 'T' is not used in function signature}}
    self.method()
    // NOW EXISTS: expected-error@-1 {{generic parameter 'T' could not be inferred}}
  }
}

And, this affects a battle between CSDiag and CSSolver:

// Fix for CSDiag vs CSSolver disagreement on what constitutes a
// valid overload.

// NO LONGER EXISTS: expected-note {{'overloadedMethod(n:)' declared here}}
func overloadedMethod(n: Int) {}

// NO LONGER EXISTS: expected-note {{in call to function 'overloadedMethod()'}}
// NOW EXISTS: {{generic parameter 'T' is not used in function signature}}
func overloadedMethod<T>() {}

// NO LONGER EXISTS: expected-error@-1 {{missing argument for parameter 'n' in call}}
// NOW EXISTS: expected-error@-1 {{generic parameter 'T' could not be inferred}}
overloadedMethod()

Contributor

@rxwei rxwei left a comment

The diagnostic wasn't missing, it's a compiler bug. We should try to fix the bug instead.

@philipturner
Contributor Author

philipturner commented Feb 1, 2022

Should a function be passing in an error in the first place?

Collaborator

@AnthonyLatsis AnthonyLatsis left a comment

Could you add a test as well?

Comment on lines 4757 to 4769
      derivative->getInterfaceType()->getAs<AnyFunctionType>();
  if (!derivativeInterfaceType) {
    // Crashes if the instance is not an ErrorType.
    if (derivative->getInterfaceType()->is<ErrorType>()) {
      diags.diagnose(attr->getLocation(),
                     diag::derivative_attr_unknown_error);
      return true;
    } else {
      // This should never happen, but leaving a runtime diagnostic in case it
      // does because little is known about this failure.
      llvm::report_fatal_error("Unknown failure in typeCheckDerivativeAttr.");
    }
  }
Collaborator

@AnthonyLatsis AnthonyLatsis Feb 1, 2022

If we encounter an ErrorType here, it means we already produced an appropriate diagnostic elsewhere (see first line in the logs). I think we should simply bail out if there's an error in the interface type. Also, we should not be expecting getInterfaceType() to return a null type.

Contributor Author

Calling Decl->isInvalid() returns false, so guarding against that does nothing for bailing out early. The only way to do so is to check that the pointer is null, then exit early. Since this behavior deserves to be caught, I'm issuing a second diagnostic.

Contributor Author

getInterfaceType() is not what's returning a null. It's getAs<...> that's returning the null. In fact, I call a function on the result of getInterfaceType() in that conditional block.
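
A minimal sketch of that distinction, using hypothetical stand-in types rather than the compiler's real classes: a checked downcast in the style of getAs<...>() yields a null pointer when the dynamic kind does not match, even though the object it was called on is perfectly non-null. The names ToyType, ToyFunctionType, ToyErrorType, canUseAsFunction, and requireFunction are assumptions made for illustration, and dynamic_cast stands in for whatever cast machinery the compiler actually uses.

```cpp
#include <cassert>

// Toy stand-ins for a type hierarchy. All names are hypothetical and only
// mirror the shape of the situation discussed in this thread.
struct ToyType {
  virtual ~ToyType() = default;

  // Checked downcast: returns nullptr when the dynamic kind does not match.
  template <typename T> const T *getAs() const {
    return dynamic_cast<const T *>(this);
  }

  // Convenience query built on top of the checked downcast.
  template <typename T> bool is() const { return getAs<T>() != nullptr; }
};

struct ToyFunctionType : ToyType {};
struct ToyErrorType : ToyType {};

// Returns true when it is safe to treat `type` as a function type.
inline bool canUseAsFunction(const ToyType &type) {
  return type.getAs<ToyFunctionType>() != nullptr;
}

// Models the unguarded path: callers must have ruled out non-function types
// before calling this, otherwise the assertion fires.
inline const ToyFunctionType &requireFunction(const ToyType &type) {
  const auto *fn = type.getAs<ToyFunctionType>();
  assert(fn && "caller must bail out before casting a non-function type");
  return *fn;
}
```

In this toy model, calling canUseAsFunction on a ToyErrorType returns false while the ToyErrorType object itself remains valid to query, which is the situation described above.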

Collaborator

@AnthonyLatsis AnthonyLatsis Feb 1, 2022

Calling Decl->isInvalid() returns false

Are you sure? Decl->isInvalid() on a FuncDecl is equivalent to getInterfaceType()->hasError(). It should have an error with the code in your example.
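
As a rough illustration of that equivalence, here is a sketch under stated assumptions: the type tree, ToyFuncDecl, and makeType below are hypothetical and much simpler than the real AST. It only shows the idea that a function declaration counts as invalid when an error type appears anywhere inside its interface type, not just at the top level.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical miniature type tree; not the compiler's real data structures.
struct ToyType {
  enum class Kind { Nominal, Function, Error };
  Kind kind;
  // For a function type: parameter types followed by the result type.
  std::vector<std::shared_ptr<ToyType>> children;

  // True if this type, or any type nested inside it, is an error type.
  bool hasError() const {
    if (kind == Kind::Error)
      return true;
    for (const auto &child : children)
      if (child->hasError())
        return true;
    return false;
  }
};

struct ToyFuncDecl {
  std::shared_ptr<ToyType> interfaceType;

  // Assumption taken from the comment above: for a function declaration,
  // "invalid" is modeled as "the interface type contains an error".
  bool isInvalid() const {
    assert(interfaceType && "a declaration always has an interface type here");
    return interfaceType->hasError();
  }
};

// Small helper for building toy types.
inline std::shared_ptr<ToyType>
makeType(ToyType::Kind kind,
         std::vector<std::shared_ptr<ToyType>> children = {}) {
  return std::make_shared<ToyType>(ToyType{kind, std::move(children)});
}
```

With this model, a function type whose result is an error type makes the declaration invalid, so an early isInvalid() check would be enough to bail out before any cast to a function type is attempted.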

Contributor Author

@philipturner philipturner Feb 1, 2022

So just check hasError(), silently bail if true, otherwise cast to AnyFunctionType as was previously done?

Contributor Author

@philipturner philipturner Feb 1, 2022

I just said that Decl->isInvalid doesn't do anything useful. It always returns *false* here.

Collaborator

It always returns true here

That is exactly what you need. The declaration is invalid when its interface type contains an ErrorType.

Contributor Author

I edited my response.

Collaborator

Can you show me the exact code you're running?

Contributor Author

The behavior just changed. It now does return true. Maybe I misremembered my experience debugging it.

@philipturner
Contributor Author

philipturner commented Feb 1, 2022

I don't have much experience with detecting diagnostics in tests, but I could copy and paste some source code for you to refine.

@AnthonyLatsis
Collaborator

I don't have much experience with detecting diagnostics in tests, but I could copy and paste some source code for you to refine.

You can add a test to test/AutoDiff/Sema/derivative_attr_type_checking.swift and run lit on it (see the guide on how to run lit selectively). It will show you the missing diagnostics, and you can add them using the expected-error directive by following other examples in that file.

@philipturner
Contributor Author

philipturner commented Feb 1, 2022

How do I incrementally build the compiler without rebuilding the standard library? I tried what it says in the README (building only swift-frontend), but it doesn't actually change anything. For example, I change the Swift version text, incrementally build, and it doesn't register a change. @AnthonyLatsis this is why it's taking me several minutes to answer your question.

@AnthonyLatsis
Collaborator

AnthonyLatsis commented Feb 1, 2022

How do I incrementally build the compiler without rebuilding the standard library?

You use ninja swift-frontend to incrementally build the frontend (should be run in the ../build/swift-macosx-$(uname -m) directory). Are you using a ninja build?

@philipturner
Contributor Author

I've been running this command and it still keeps rebuilding Standard Library modules like SwiftOnoneSupport and _MatchingEngine:

ninja -C ../build/Ninja-RelWithDebInfoAssert/swift-macosx-$(uname -m)

@philipturner
Contributor Author

philipturner commented Feb 1, 2022

I did this. It compiled in seconds, then acted like nothing happened. For example, I removed a fatal error, yet the now-removed fatal error still fired.

cd ../build/Ninja-RelWithDebInfoAssert/swift-macosx-$(uname -m) && ninja swift-frontend

I can sort of get around it by pressing ^C when it says it's in the middle of building something I know is stdlib.

@AnthonyLatsis
Collaborator

What exactly are you doing to run the compiler? swift-frontend is the right target if you want to build just the compiler.

@philipturner
Contributor Author

../build/Ninja-RelWithDebInfoAssert/swift-macosx-arm64/bin/swift -Xfrontend -requirement-machine-protocol-signatures=on -Xfrontend -requirement-machine-inferred-signatures=on -Xfrontend -requirement-machine-abstract-signatures=on -Xfrontend -enable-experimental-forward-mode-differentiation ../hello_world_test/file.swift

@AnthonyLatsis
Collaborator

AnthonyLatsis commented Feb 1, 2022

If it's build/swift-macosx-$(uname -m)/bin/swift-frontend or build/swift-macosx-$(uname -m)/bin/swiftc you're running after building, and the output is still not reflecting your edits, then I'm confused.

@philipturner
Contributor Author

And passing in "swift" to ninja gives an error.

@AnthonyLatsis
Collaborator

swift-frontend builds both the frontend and swiftc. There is no swift target. Does it not print anything if you add something like llvm::errs()<<"Hello World\n"; to the code path you're testing?

@philipturner
Contributor Author

It says it's compiling, and I can tell because it shows errors if I take a semicolon off of something. The problem might be that my command starts with this:

../build/Ninja-RelWithDebInfoAssert/swift-macosx-arm64/bin/swift

I'm not using swift-frontend or swiftc.

@philipturner philipturner requested a review from rxwei February 1, 2022 20:41
@philipturner philipturner changed the title [AutoDiff] Add missing runtime diagnostic [AutoDiff] Fix unnecessary crash Feb 1, 2022
@AnthonyLatsis
Collaborator

AnthonyLatsis commented Feb 1, 2022

It should be fine, bin/swift is an alias of bin/swift-frontend.

@philipturner
Contributor Author

It actually isn't a direct alias.

(base) philipturner@philipsm1maxmbp swift % ../build/Ninja-RelWithDebInfoAssert/swift-macosx-arm64/bin/swift-frontend -Xfrontend -requirement-machine-protocol-signatures=on -Xfrontend -requirement-machine-inferred-signatures=on -Xfrontend -requirement-machine-abstract-signatures=on -Xfrontend -enable-experimental-forward-mode-differentiation ../hello_world_test/file.swift
<unknown>:0: error: unknown argument: '-Xfrontend'
<unknown>:0: error: unknown argument: '-Xfrontend'
<unknown>:0: error: unknown argument: '-Xfrontend'
<unknown>:0: error: unknown argument: '-Xfrontend'
(base) philipturner@philipsm1maxmbp swift % ../build/Ninja-RelWithDebInfoAssert/swift-macosx-arm64/bin/swift-frontend -requirement-machine-protocol-signatures=on -requirement-machine-inferred-signatures=on -requirement-machine-abstract-signatures=on -enable-experimental-forward-mode-differentiation ../hello_world_test/file.swift 
<unknown>:0: error: no frontend action was selected

@philipturner
Contributor Author

I grabbed the hard-coded program arguments for swift-frontend (pasted below). I messed with the C++ code, but nothing changed:

/Users/philipturner/Documents/Swift-Compiler/swift-project/build/Ninja-RelWithDebInfoAssert/swift-macosx-arm64/bin/swift-frontend -frontend -interpret ../hello_world_test/file.swift -Xllvm -aarch64-use-tbi -enable-objc-interop -sdk /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX12.1.sdk -color-diagnostics -new-driver-path /Users/philipturner/Documents/Swift-Compiler/swift-project/build/Ninja-RelWithDebInfoAssert/swift-macosx-arm64/bin/swift-driver -requirement-machine-protocol-signatures=on -requirement-machine-inferred-signatures=on -requirement-machine-abstract-signatures=on -enable-experimental-forward-mode-differentiation -resource-dir /Users/philipturner/Documents/Swift-Compiler/swift-project/build/Ninja-RelWithDebInfoAssert/swift-macosx-arm64/lib/swift -module-name file -target-sdk-version 12.1

@philipturner
Contributor Author

philipturner commented Feb 1, 2022

@AnthonyLatsis could you try incrementally rebuilding the compiler yourself (after a modification that breaks it) and attaching the build log?

@AnthonyLatsis
Collaborator

I can reproduce the crash if I run ninja swift-frontend && bin/swift-frontend -typecheck path_to_file with my build, and fix it by applying your changes and running the same commands. It also works if I use swiftc or swift.

@philipturner
Contributor Author

This happens:

(base) philipturner@philipsm1maxmbp swift-macosx-arm64 % ninja swift-frontend && bin/swift-frontend -typecheck ../../../hello_world_test/file.swift
ninja: no work to do.
<unknown>:0: error: cannot load underlying module for 'Darwin'
<unknown>:0: note: did you forget to set an SDK using -sdk or SDKROOT?
<unknown>:0: note: use "xcrun swiftc" to select the default macOS SDK installed with Xcode

@philipturner
Contributor Author

I can reproduce the crash

I'm not talking about reproducing the crash. I'm talking about this:

  1. You implement my changes that fix the crash
  2. You fire up the Swift compiler and it's fine
  3. You catastrophically destroy the C++ code, while letting it technically compile
  4. You fire up the Swift compiler and it's fine!

@AnthonyLatsis
Collaborator

This happens...

This is expected, my commands are enough to reproduce the crash and fix it, but you need to specify an SDK to properly compile AutoDiff code. bin/swift seems to be using the default SDK that ships with Xcode.

I'm not talking about reproducing the crash. I'm talking about this...

What are the edits to the C++ code, and what is the unexpected output?

@AnthonyLatsis
Collaborator

Can you try running the following command and see if the fatal error appears?

/Users/philipturner/Documents/Swift-Compiler/swift-project/build/Ninja-RelWithDebInfoAssert/swift-macosx-arm64/bin/swift-frontend -frontend -interpret ../../../hello_world_test/file.swift -Xllvm -aarch64-use-tbi -enable-objc-interop -sdk /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX12.1.sdk -color-diagnostics -resource-dir /Users/philipturner/Documents/Swift-Compiler/swift-project/build/Ninja-RelWithDebInfoAssert/swift-macosx-arm64/lib/swift -module-name file -target-sdk-version 12.1

@philipturner
Contributor Author

I ran ninja swift-frontend, then your command, and no difference.

@philipturner
Contributor Author

philipturner commented Feb 1, 2022

About the new bug I'm facing, I've narrowed it down further. The problem is in the AST stage. When I set the closure's parameter type to Int, the AST says <<error type>> there. But for Void, it doesn't. Void is just as incompatible with differentiation as Int, so shouldn't it be <<error type>> too?

It seems that it's skipping over something in /lib/Sema/TypeChecker.cpp, line 3027. It should make a call to diagnoseInvalidFunctionType that returns true, but it doesn't return true for Void.

It should be throwing a diagnostic error of type differentiable_function_type_invalid_parameter inside the function, which doesn't happen.

@AnthonyLatsis
Collaborator

About the new bug I'm facing, I've narrowed it down further. The problem is in the AST stage. When I set the closure's parameter type to Int, the AST says <<error type>> there. But for Void, it doesn't. Void is just as incompatible with differentiation as Int, so shouldn't it be <<error type>> too?

I'll have a look.


Out of curiosity, could you try placing the report_fatal_error at an earlier compilation stage, at the start of, say, Parser::parseDeclFunc, and running it against just a trivial function declaration like func foo() {}?

@philipturner
Contributor Author

I'm currently figuring out what's happening at the place it's supposed to throw an error diagnostic. I found a very handy trick that makes debugging so much easier (and doesn't corrupt stdlib builds):

llvm::outs() << "HELLO WORLD!\n";

For example, I found out that it reaches line 580 in both cases (with Int and with Void). But, they should both also reach line 639 and they don't (TypeChecker.cpp).

@AnthonyLatsis
Collaborator

Maybe we should move this conversation into another thread - the one we originated from?

You could file a bug so that we can move the discussion to JIRA.

@@ -4749,6 +4749,8 @@ static bool typeCheckDerivativeAttr(ASTContext &Ctx, Decl *D,
  // to be enabled.
  if (checkIfDifferentiableProgrammingEnabled(Ctx, attr, D->getDeclContext()))
    return true;
  if (D->isInvalid())
Contributor

Checking the invalid bit and skipping work in Sema is certainly a way to make crashes go away. But we probably want to identify a root cause and fix that first. There are very few instances where this pattern actually helps. In practice it can hinder many more processes downstream as it either hides bugs, shifts data dependencies, or hides dependencies from the incremental build. We're trying to eliminate usages of this pattern where we can; unfortunately, many of them are load-bearing.

Collaborator

@AnthonyLatsis AnthonyLatsis Feb 1, 2022

The root cause is that we're calling setInvalid() (or setting the interface type to ErrorType) when a generic parameter is not used in the type signature of a function/subscript etc., although there seems to be nothing wrong with leaving the interface type as-is.
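
To make the before/after concrete, here is a simplified sketch and not the actual TypeChecker code: ToyFuncDecl, its fields, and the invalidateOnError flag are hypothetical, and the function merely borrows the name checkReferencedGenericParams for readability. In both modes the "not used in function signature" diagnostic is produced; the only difference is whether the declaration is additionally marked invalid afterwards.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model of a generic function declaration.
struct ToyFuncDecl {
  std::vector<std::string> genericParams;  // e.g. {"T"}
  std::vector<std::string> signatureTypes; // types named in params and result
  bool invalid = false;
};

// Produces one diagnostic per generic parameter that the signature never
// names. Passing invalidateOnError = true models the old behavior discussed
// above; passing false models leaving the declaration as-is.
inline std::vector<std::string>
checkReferencedGenericParams(ToyFuncDecl &decl, bool invalidateOnError) {
  assert(!decl.invalid && "this sketch only checks declarations once");
  std::vector<std::string> diagnostics;
  for (const auto &param : decl.genericParams) {
    bool used = false;
    for (const auto &type : decl.signatureTypes)
      used = used || (type == param);
    if (!used)
      diagnostics.push_back("generic parameter '" + param +
                            "' is not used in function signature");
  }
  if (invalidateOnError && !diagnostics.empty())
    decl.invalid = true;
  return diagnostics;
}
```

Under this model, later passes that inspect the declaration see the same diagnostic either way, but only in the old mode do they also encounter an invalidated declaration.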

Contributor Author

@philipturner philipturner Feb 1, 2022

So the real question is: why was typeCheckDerivativeAttr called when an error diagnostic was issued?

Is this correct?

Update: I made this comment while GitHub hadn't registered that the comment by @AnthonyLatsis had been made.

Contributor

The root cause is that we're calling setInvalid() when a generic parameter is not used in the type signature of a function/subscript etc., although there seems to be nothing wrong with leaving the interface type as-is.

Agreed. Start by eliminating that call and seeing what changes.

Contributor Author

@AnthonyLatsis I have my mind focused on the new bug I'm to file on JIRA. Would you be able to investigate that call and send over the necessary changes via a code review?

Collaborator

Start by eliminating that call and seeing what changes.

Pretty reassuring, running tests locally shows only 2 new generic parameter 'T' could not be inferred errors that are now expected.


@philipturner if you are willing to see this through, you have to delete this call and update the following failing tests (and add the AutoDiff regression test):

Swift(macosx-x86_64) :: Constraints/overload.swift
Swift(macosx-x86_64) :: Generics/unbound.swift

Contributor Author

Got it! Thanks for narrowing down the crux of the bug!

Contributor Author

(and add the AutoDiff regression test):

That just means add a new test proving this bug was fixed, correct? If so, I'll need a bit of help setting that up.

Collaborator

We already have a test file for @derivative checking: ../test/AutoDiff/Sema/derivative_attr_type_checking.swift. Just add your test case to that and run lit on it to see the diagnostics that are expected. If the expectations are correct, incorporate the missing diagnostics into your test using expected-* directives, as shown elsewhere in that file. The lit command is

cd ../build/Ninja-RelWithDebInfoAssert/test-macosx-$(uname -m) && ../llvm-project/llvm/utils/lit/lit.py -sv --filter=derivative_attr_type_checking AutoDiff/Sema

@philipturner
Contributor Author

SR-15808

@philipturner
Contributor Author

I got the correct diagnostic using @AnthonyLatsis's suggested patch! It was hidden away by the bug!

../../../hello_world_test/file.swift:6:17: error: cannot find 'remainder' in scope
@derivative(of: remainder(_:_:))

@philipturner
Contributor Author

philipturner commented Feb 2, 2022

@CodaFi btw racy_async_let_fibonacci just failed while I was running tests using this command:

# Run all tests under test/.
../llvm-project/llvm/utils/lit/lit.py -s -vv \
  ../build/Ninja-RelWithDebInfoAssert/swift-macosx-$(uname -m)/test-macosx-$(uname -m)

Update: it failed a second time too
And a third time

Now that I think about it, I think it did fail a few times during normal testing.

@AnthonyLatsis
Collaborator

I got the correct diagnostic using @AnthonyLatsis's suggested patch!

Well, if it wasn't for @CodaFi, I probably would not mind the root cause, so thanks to Robert!

racy_async_let_fibonacci just failed

Should be unrelated.

@AnthonyLatsis
Collaborator

@swift-ci please smoke test macOS

@philipturner
Contributor Author

philipturner commented Feb 2, 2022

This is my very first time fixing an AutoDiff bug! Just a hundred more to go 😦

After SR-15808, I'd like to tackle SR-15793.

@philipturner
Contributor Author

macOS smoke test failed.

@AnthonyLatsis
Collaborator

@swift-ci please clean smoke test macOS

@AnthonyLatsis AnthonyLatsis requested a review from CodaFi February 2, 2022 17:57
Contributor

@rxwei rxwei left a comment

With the latest changes, it is no longer a fix directly related to autodiff. I'd suggest removing "[AutoDiff] " and modifying the PR description to describe how this affects non-autodiff Swift diagnostics.

@philipturner philipturner changed the title [AutoDiff] Fix unnecessary crash Fix unnecessary crash Feb 2, 2022
@philipturner
Contributor Author

Is that sufficient?

@rxwei
Contributor

rxwei commented Feb 2, 2022

modifying the PR description to describe how this affects non-autodiff Swift diagnostics.

@philipturner
Contributor Author

philipturner commented Feb 2, 2022

Fixed.

@CodaFi CodaFi changed the title Fix unnecessary crash Don't Unnecessarily Invalidate Unreferenced Generic Parameters Feb 2, 2022
Contributor

@CodaFi CodaFi left a comment

Looks much better.

@CodaFi
Contributor

CodaFi commented Feb 2, 2022

@swift-ci clean smoke test

@rxwei
Contributor

rxwei commented Feb 3, 2022

Could you please squash your changes into a single commit?

@AnthonyLatsis
Collaborator

We can squash and merge to avoid rerunning tests.

4 participants