Skip to content

X86: force enable the mandatory tailcall on Win64 #6281

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Mar 30, 2023

Conversation

compnerd
Copy link
Member

Enable the guaranteed tailcall on Windows x86_64 when we encounter a Swift function call. This function will not conform to the Windows prologue requirements which will implicitly break the unwinder and the debugger, but is required to enable the proper Concurrency behaviour for non-trivial programs. This requires associated changes in the Swift repository to update the test expectations.

Enable the guaranteed tailcall on Windows x86_64 when we encounter a
Swift function call.  This function will not conform to the Windows
prologue requirements which will implicitly break the unwinder and the
debugger, but is required to enable the proper Concurrency behaviour for
non-trivial programs.  This requires associated changes in the Swift
repository to update the test expectations.
@compnerd
Copy link
Member Author

CC: @TNorthover

@TNorthover
Copy link

Particularly if Swift is going to need changes anyway (but as a matter of principle too), shouldn't we gate this on nounwind? That's what really breaks and other languages could benefit.

I suppose since this is only in Swift's fork for a specific release it's not critical though.

@compnerd
Copy link
Member Author

I'm not sure that is correct though. You can have nounwind on C++ functions, and those should still conform to the proper unwinding semantics. The nounwind attribute already is attached to all the Swift async functions, but it alone is not sufficient for correctness. Basically, this is saying, this is a swift function, go ahead and just ignore all ability to debug code.

@compnerd
Copy link
Member Author

BTW, the swift changes that are needed in the tests: swiftlang/swift#63660

All the points where we attach swiftasynccc on the function, we already attach a nounwind by setDoesNotThrows().

@TNorthover
Copy link

TNorthover commented Feb 14, 2023

Why should a nounwind C++ function conform to unwinding semantics? (Genuinely, but also at a surface level that's a weird statement). Clang applies it in -fno-exceptions situations, so there's not going to be any actual unwinding going on (or do Windows asynchronous exceptions happen whether you like it or not? I don't really know Windows).

If it's ultimately down to debugability rather than actual code breaking, that's what debug info is for. Symbolizers might want to use unwind tables too, but I don't view them as a consumer with a stake in the issue.

Edit: also bear in mind this whole thing is gated by the guaranteed tail call optimisation that messes with the ABI. You don't end up here by accident. You have to select a guaranteed tail callingconv, or use fastcc and set a TailCallOpt or whatever target option.

@compnerd
Copy link
Member Author

Why should a nounwind C++ function conform to unwinding semantics? (Genuinely, but also at a surface level that's a weird statement). Clang applies it in -fno-exceptions situations, so there's not going to be any actual unwinding going on (or do Windows asynchronous exceptions happen whether you like it or not? I don't really know Windows).

Yes, Windows does proper asynchronous exceptions, and those occur whether you like it or not. That is part of the reason that the prologues are so restrictive. They support asynchronous exceptions, and recovery from exceptions in prologue setup.

If it's ultimately down to debugability rather than actual code breaking, that's what debug info is for. Symbolizers might want to use unwind tables too, but I don't view them as a consumer with a stake in the issue.

Oh, this won't be code breaking per se, but if you have code that is reliant on stack walking or symbolicating, that will break as the unwinder and symbolication paths will not be able to handle the frame. I should reword things to make that more clear.

Edit: also bear in mind this whole thing is gated by the guaranteed tail call optimisation that messes with the ABI. You don't end up here by accident. You have to select a guaranteed tail callingconv, or use fastcc and set a TailCallOpt or whatever target option.

I think that fastcc is supported on Windows (and LLVM will sometimes apply that if a function gets internalised), so you can end up here "accidentally" I think.

@TNorthover
Copy link

Yes, Windows does proper asynchronous exceptions, and those occur whether you like it or not. That is part of the reason that the prologues are so restrictive. They support asynchronous exceptions, and recovery from exceptions in prologue setup.

So how does Swift escape these consequences? I had assumed that as long as you didn't do weird things with signal or whatever you wouldn't get asynchronous exceptions. And in tandem that if you passed the Windows equivalent of -fno-exceptions you were promising not to do anything that spawned asynchronous ones.

Oh, this won't be code breaking per se, but if you have code that is reliant on stack walking or symbolicating, that will break as the unwinder and symbolication paths will not be able to handle the frame

This is still sounding like the whole purpose of nounwind to me.

I think that fastcc is supported on Windows (and LLVM will sometimes apply that if a function gets internalised), so you can end up here "accidentally" I think.

That's why I tried to emphasize the target option. Just grepped more and it's GuaranteedTailCallOpt. Without that, fastcc won't end up here, and I don't believe Clang ever sets GuaranteedTailCallOpt. Only a language that really cared for correctness would.

@TNorthover
Copy link

Oh hang on, I've finally taken the time to look at LangRef and see that nounwind basically doesn't apply to Windows: "functions marked nounwind may still trap or generate asynchronous exceptions".

My revised question is why doesn't this diff break that guarantee? The function can no longer take part in the platform's asynchronous exception routing. It's not just about debugability.

@compnerd
Copy link
Member Author

My revised question is why doesn't this diff break that guarantee? The function can no longer take part in the platform's asynchronous exception routing. It's not just about debugability.

It doesn't? My understanding is it does break that. That is why I conditionalised it on swiftasynccc rather than nounwind.

@compnerd
Copy link
Member Author

So how does Swift escape these consequences? I had assumed that as long as you didn't do weird things with signal or whatever you wouldn't get asynchronous exceptions. And in tandem that if you passed the Windows equivalent of -fno-exceptions you were promising not to do anything that spawned asynchronous ones.

signal is not a concept on Windows, so that part we can escape by means of it not being applicable. Outside of that, Swift basically escapes the consequences by means of the ostrich maneuver. As long as the prologue follows the standard format, the unwinder can deal with the asynchronous exception. Outside of that, such as when we start dealing with the guaranteed tail call, we run afoul of those restrictions, and here we are today - trying to sort out how to get that to work :).

It's GuaranteedTailCallOpt. Without that, fastcc won't end up here, and I don't believe Clang ever sets GuaranteedTailCallOpt. Only a language that really cared for correctness would.

Ah, okay. So, that is definitely good to know. I was worried that we might end up here in a different language mode as well. However, given that we would want to upstream this, I still think that preventing another language from accidentally breaking the ABI contract is preferable and restricting the "incorrect" codegen to just Swift's async calls is "safer".

@TNorthover
Copy link

Fair enough. I'm certainly persuaded that this solution is a reasonable option now. LGTM.

@compnerd
Copy link
Member Author

I decided to look in a bit more detail the stack layout requirements for Windows X64. The requirements are:

RDX, RCX, R8, R9 are spilt first to home the argument registers (this gives us the contiguous arguments, windows uses 4 regparm not 6 like SysV), and the stack pointer cannot be adjusted in the prologue.

The current TCO flow is adjust rsp, spill, adjust rsp. We need it to be spill, adjust rsp. This means that if we want to keep synchronous unwinding working, we need to alter the epilogue to shuffle the data into the spill slots by means of reloading them and not popping the CSRs. WDYT @TNorthover?

@compnerd
Copy link
Member Author

@swift-ci please test

@compnerd
Copy link
Member Author

Please test with following PRs:
swiftlang/swift#63660

@swift-ci please test

@compnerd compnerd marked this pull request as ready for review March 30, 2023 21:17
@compnerd
Copy link
Member Author

Going to go ahead and merge this for now; we can iteratively try to improve this for debugging.

@compnerd compnerd merged commit 67e69d5 into swiftlang:stable/20221013 Mar 30, 2023
@compnerd compnerd deleted the async branch March 30, 2023 21:19
@DougGregor
Copy link
Member

Can we pull this into 5.9?

@compnerd
Copy link
Member Author

I might need a bit of help to get that to happen but seems feasible. Fundamentally, there isn't much that should block that. I can try to take a pass at it tomorrow. I think that I did run into some tests on the Swift side that needed a bit of investigation.

@DougGregor
Copy link
Member

5.9 branched a few weeks ago, so hopefully it's not too tricky. Of course, I could ask about 5.8 as well :)

@compnerd
Copy link
Member Author

Hahah, I'd be impressed if we find a way to bring this into 5.8!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants