-
Notifications
You must be signed in to change notification settings - Fork 1.7k
Detect cold CallTarget invalidation and reset its profile; Limit number of recompilations within a time period #11610
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
boolean isValid = installedCode.isValid(); | ||
if (!isValid && installedCode != INVALID_CODE) { | ||
if (getInvalidationReason() == ((HotSpotTruffleRuntime) runtime()).getColdMethodInvalidationReason()) { | ||
invalidateExistingCode(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't say I am a fan of this solution. How do we know isValid() is called often enough?
Also it is surprising that isValid() resets the compilation profile and even can call listener methods (listeners could call into isValid() too!)
What did you consider as alternatives here?
Can you explain when / how often this is invoked? Can we call this maybe between compilations on compiler threads?
I don't think we should use this approach.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How do we know isValid() is called often enough?
Do you mean often enough so that we detect "soon" that the call target was invalidated? My understanding is that there is no requirement for how "soon" we detect the invalidation. We just need to eventually detect it and since the isValid
method is always called before a new compilation I decided to implement things like this.
... listeners could call into isValid() too!
The existing code is invalidated before onCompilationDeoptimized
is called, which prevents a recursion.
Can we call this maybe between compilations on compiler threads?
I'm not sure what's your suggestion here. Do you mean iterate on all current compiled call targets to see if any was invalidated?
Can't say I am a fan of this solution.
Please, let me know your suggestions!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The goal here is to reset the CT (Call Target) profile when its INSTALLED_CODE was invalidated by HotSpot because it was cold. This needs to happen before we decide to compile the CT again. Realistically speaking this kind of invalidation can only happen between executions, not during an execution.
JVMCI is not able to call a method in Truffle or GraalJIT. Therefore, it cannot downcall to notify that the CT was invalidated. Thus, the identification that the CT was invalidated is a passive event that happens in Truffle.
Approaches that we can use to detect that the CT was invalidated:
-
Check before every execution of the CT in interpreter mode; before we check if the CT should be compiled.
- Pros:
- Relatively simple to implement.
- This is a direct way to achieve what we want: prevent recompilation of cold method.
- Cons:
- Executed every time that we want to execute the CT.
- Pros:
-
Use a separate thread to detect all invalidations.
- Blockers:
- Will require maintaining a list of all compiled non-invalidated CT.
- Blockers:
-
When we are installing a new code.
- Blockers:
- This is too late. There is already a new compilation of the CT.
- Blockers:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As you say, ideally we would have a callback from HotSpot on invalidate/flush which allows to take action. We do have such a callback on SubstrateVM for example. Its very unfortunate we don't have that, because we could also use it for better logging.
Thinking loud, it seems the second best solution could be to install code for a call target that calls resetCompilationProfile() which then invalidates itself. This means we wouldn't need to check anything and we would just reset once on the next invocation and not on follow-up invocations. Does that seem feasible to you?
Could we schedule a call for this? It seems like we should do a bit of brain storming on a solution here. (feel free to ping me on slack)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we schedule a call for this? It seems like we should do a bit of brain storming on a solution here. (feel free to ping me on slack)
Absolutely. I'll ping you on slack.
runtime().getListener().onCompilationDeoptimized(this, frame, getInvalidationReasonDescription()); | ||
} | ||
|
||
@TruffleBoundary |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this code behind a boundary? If this code is called during PE (not sure it is) then we need to better optimize this code, aka better profiling depending on where its used. If its never called in PE we should remove the boundary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had to make this a TruffleBoundary because an unit test was failing because a compilation was getting too big due to recursive inlining during PE. Since the test was triggering that I suspected that it could eventually affect an end user as well.
*/ | ||
default void onCompilationDeoptimized(OptimizedCallTarget target, Frame frame) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This interface needs to evolve in a compatible way. Should be easy enough keeping the old method deprecated, each delegating to the other. See for example onCompilationStarted.
@@ -537,6 +540,7 @@ public final RootNode getRootNode() { | |||
public final void resetCompilationProfile() { | |||
this.callCount = 0; | |||
this.callAndLoopCount = 0; | |||
this.timeOfFirstCompilationInWindow = Instant.now(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Limit the number of compilations that a CallTarget can have within a time period.
tbh I can't quite follow what the rationale for this feature is. When would you use this?
The maximum number of compilation is intended for compiler bugs, and should never be hit (breakpoint toggling in the debugger for example). If we extend this to time based limit, that defeats the purpose to detect those bugs.
So I think I need a motivating example here to understand.
Also I think setting:
private Instant timeOfFirstCompilationInWindow;
in resetCompilationProfile() seems not very intuitive. Is this really the right place to do it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tbh I can't quite follow what the rationale for this feature is. When would you use this?
The maximum number of compilation is intended for compiler bugs, and should never be hit (breakpoint toggling in the debugger for example). If we extend this to time based limit, that defeats the purpose to detect those bugs.
My usage scenario may be a little edgey because 1) the application that caused me to bring up this change is very large and the code cache is conservatively small, therefore it causes many method flushes; 2) the recompilation of cold methods (the other issue that this PR addresses) together with (1) was causing many recompilations; 3) new guest language code is added/modified/removed from the system frequently.
I understand that the maximum recompilation limit is just a failsafe and isn't intended to be the main way to limit recompilations and I'm not trying to make it so. What I'm looking for is to make it more flexible, so that the system can remain 'stable' until I have a chance to act on the problem. By stable I mean that there aren't too many recompilations of the call target happening (i.e., a recompilation rate) and also that the method isn't run interpreted until I restart the application.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So because the compilation queue is flushed often, you want to limit the maximum compilations per certain time frame? Wouldn't it be better to just reset the successfulCompilationsCount on a flush? Or do I miss something here?
If we never reset the successfulCompilationsCount we will also trigger deopt loop detection which is not limited by a time window.
if (engine.maximumCompilationsWindowInMinutes > 0) { | ||
long ageInMinutes = ChronoUnit.MINUTES.between(timeOfFirstCompilationInWindow, Instant.now()); | ||
if (ageInMinutes >= engine.maximumCompilationsWindowInMinutes) { | ||
// This compilation would have been blocked if the window hadn't overflowed, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor: use multi line comments for long text.
option("engine.BackgroundCompilation", "false").// | ||
option("engine.CompileImmediately", "true").// | ||
option("engine.MaximumCompilations", "2").// | ||
option("engine.MaximumCompilationsWindow", "1").build()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typically we use seconds not minutes. This came to a surprise to me. Any reason this is not seconds?
We could then also spend less time on this test. 90 seconds for such a test is excessive.
Closes: #11045
Please review this patch that modifies Truffle to:
Reset a CallTarget's profile when the
nmethod
associated with it was invalidated by HotSpot because the Code Cache's method flushing heuristics deemed the nmethod to be cold.Limit the number of compilations that a CallTarget can have within a time period. This doesn't change the current behavior, instead it just makes the constraint more flexible. I added a new parameter,
MaximumCompilationsWindow
, that the user can set to specify a time window within which the number of compilations will be limited (currently the window is whole duration of application execution).Tests: