KT-55178 Performance improvement of KFunction.callBy #4840
    arguments
}

private fun getAbsentArguments(): Array<Any?> = _absentArguments().clone()
Because the array has to be cloned on every access, this is defined as a function rather than a property.
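The reasoning behind the function can be sketched as follows. This is a minimal illustration, not the code in KCallableImpl: the class AbsentArgumentsCache and its members are hypothetical, and copyOf() stands in for the clone() call used in the PR.

```kotlin
// Hypothetical holder (not the real KCallableImpl): caches a template array
// once and hands out a fresh copy for every call.
class AbsentArgumentsCache(private val size: Int) {
    // Built lazily on first use and never exposed directly.
    private val template: Array<Any?> by lazy { arrayOfNulls<Any>(size) }

    // A function rather than a property: each invocation returns a new copy,
    // so a caller can write its arguments into the result without
    // corrupting the cached template.
    fun getAbsentArguments(): Array<Any?> = template.copyOf()
}
```

Writing into one returned array leaves the template, and every later copy, untouched.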
core/reflection.jvm/src/kotlin/reflect/jvm/internal/KCallableImpl.kt
// set absent values
parameters.forEach { parameter ->
    if (parameter.isOptional && !parameter.type.isInlineClassType) {
        // For inline class types, the javaType refers to the underlying type of the inline class,
        // but we have to pass null in order to mark the argument as absent for InlineClassAwareCaller.
        arguments[parameter.index] = defaultPrimitiveValue(parameter.type.javaType)
    } else if (parameter.isVararg) {
        arguments[parameter.index] = defaultEmptyArray(parameter.type)
    }
}
By setting the absent values in advance, this processing can be skipped at call time.
If an argument is actually supplied, it simply overwrites the pre-set value, so the pre-filling has no effect on it.
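The pre-filling idea can be illustrated with a small, self-contained sketch. It does not use kotlin-reflect: ParamSpec, placeholderFor, buildTemplate, and argumentsFor are hypothetical names introduced only for this example, and the placeholder table covers just a few primitive types.

```kotlin
// Hypothetical, simplified parameter description (not kotlin-reflect's KParameter).
data class ParamSpec(val index: Int, val javaType: Class<*>, val isOptional: Boolean)

// Zero-like placeholder for a few primitive types, null for everything else.
fun placeholderFor(type: Class<*>): Any? = when (type) {
    Int::class.javaPrimitiveType -> 0
    Long::class.javaPrimitiveType -> 0L
    Boolean::class.javaPrimitiveType -> false
    Double::class.javaPrimitiveType -> 0.0
    else -> null
}

// Done once per function: fill the slot of every optional parameter in advance.
fun buildTemplate(params: List<ParamSpec>): Array<Any?> {
    val template = arrayOfNulls<Any>(params.size)
    for (p in params) {
        if (p.isOptional) template[p.index] = placeholderFor(p.javaType)
    }
    return template
}

// Done on every call: copy the template, then overwrite the supplied slots.
fun argumentsFor(template: Array<Any?>, supplied: Map<Int, Any?>): Array<Any?> {
    val arguments = template.copyOf()
    for ((index, value) in supplied) arguments[index] = value
    return arguments
}
```

Slots for supplied arguments are overwritten, while slots for omitted optional parameters keep their placeholder, so no per-call work is needed for the absent ones.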
@udalov If you are concerned about the amount of changes or increased memory consumption due to the introduction of the cache, I can change the content as follows
Sorry for the delay! I agree, this optimization can be very useful. I think we can go further and make the ...
Thanks for the review. I agree with your suggestions. I will revise and push this when I have time.
@udalov Benchmark results for 1.7.20, before the fix, and after the fix are as follows:
1.7.20
before fix
after fix
There was not much difference between before and after the fix. In addition, I also benchmarked the results of limited changes (made to
In this case, the improvement was relatively small.
Thanks! I've also noticed that now we perform integer boxing when computing the mask for every default argument and I'd like to avoid it. What do you think about the following (hopefully last) optimization 388fdc4?
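The boxing concern can be illustrated with a small sketch. It assumes the usual layout in which each parameter owns one bit and every 32 parameters share one Int mask; a primitive IntArray can then hold the masks without allocating Integer objects, which an ArrayList<Int> cannot avoid. The function computeDefaultMasks and its parameters are hypothetical and are not taken from 388fdc4.

```kotlin
// Hypothetical sketch: computes the default-argument bit masks into a
// primitive IntArray, so no Int is ever boxed.
// Assumption: bit (index % 32) of word (index / 32) marks parameter `index`
// as absent, i.e. "use the default value".
fun computeDefaultMasks(parameterCount: Int, absentIndices: IntArray): IntArray {
    // One mask word per 32 parameters, rounded up.
    val masks = IntArray((parameterCount + Int.SIZE_BITS - 1) / Int.SIZE_BITS)
    for (index in absentIndices) {
        val word = index / Int.SIZE_BITS
        masks[word] = masks[word] or (1 shl (index % Int.SIZE_BITS))
    }
    return masks
}
```

For example, with five parameters where the first and third are absent, the single mask word has bits 0 and 2 set.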
At least as far as the benchmark results are concerned, it does not seem to be a good fix. Even with this optimization, a speedup appears only when many default arguments are used.
With 388fdc4 applied
core/reflection.jvm/src/kotlin/reflect/jvm/internal/KCallableImpl.kt
val masks = ArrayList<Int>(1)
var index = 0
var anyOptional = false
val parameterSize = parameters.size + (if (isSuspend) 1 else 0)
The following optimization for no-argument functions was found to be nearly twice as fast as 638d89c.
Do you think this optimization should be incorporated?
Benchmark                          Mode  Cnt         Score         Error  Units
Measurement.zero                  thrpt    4  22299885.316 ±  295995.533  ops/s
Measurement.fiveWithDefault       thrpt    4   1927106.356 ±  225032.224  ops/s
Measurement.fiveWithoutDefault    thrpt    4   2460623.502 ±   96649.714  ops/s
Measurement.oneWithDefault        thrpt    4   5384826.666 ±  340254.100  ops/s
Measurement.oneWithoutDefault     thrpt    4   5654229.984 ± 5914104.417  ops/s
Measurement.twentyWithDefault     thrpt    4    510341.805 ±    2940.113  ops/s
Measurement.twentyWithoutDefault  thrpt    4    739231.653 ±   21298.002  ops/s
val parameterSize = parameters.size + (if (isSuspend) 1 else 0)

// Optimization for general no-argument functions.
if (parameters.isEmpty()) {
    @Suppress("UNCHECKED_CAST")
    return reflectionCall {
        caller.call(if (isSuspend) arrayOf(continuationArgument) else emptyArray()) as R
    }
} else if (parameters.size == 1) {
    parameters.first().takeIf { it.kind != KParameter.Kind.VALUE }?.let { parameter ->
        @Suppress("UNCHECKED_CAST")
        return reflectionCall {
            caller.call(if (isSuspend) arrayOf(args[parameter], continuationArgument) else arrayOf(args[parameter])) as R
        }
    }
}
For the 0-parameter case, it looks good! But for the 1-parameter case, I'm not really sure; the code is getting quite complicated already. Besides, Measurement.oneWithoutDefault regressed in your results after this change, if I understand them correctly? I propose to keep the 0-parameter case only.
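The shape of the zero-parameter fast path can be shown in isolation. The sketch below is a simplified, hypothetical model and not the kotlin-reflect implementation: callByName, NO_ARGS, and the function-typed target are stand-ins for KFunction.callBy and the internal caller, and suspend functions are ignored.

```kotlin
// Shared empty argument array; safe to reuse because it has no elements to overwrite.
val NO_ARGS: Array<Any?> = arrayOfNulls<Any>(0)

// Hypothetical, simplified model of a callBy-style entry point.
fun <R> callByName(
    parameterNames: List<String>,
    supplied: Map<String, Any?>,
    target: (Array<Any?>) -> R
): R {
    // Fast path: nothing to look up, allocate, or validate.
    if (parameterNames.isEmpty()) return target(NO_ARGS)

    // General path: build the positional argument array from the map.
    val arguments = arrayOfNulls<Any>(parameterNames.size)
    for ((i, name) in parameterNames.withIndex()) {
        require(supplied.containsKey(name)) { "No argument provided for parameter $name" }
        arguments[i] = supplied[name]
    }
    return target(arguments)
}
```

The early return keeps the common no-argument case away from the general bookkeeping, which is the effect the zero-argument benchmark measures.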
> Measurement.oneWithoutDefault regressed in your results after this change, if I understand them correctly?
I reran the benchmark with almost the same setup, and there did not seem to be any performance regression.
Because the benchmark was run on a local machine, something was probably running in the background during the earlier measurement.
1e1cd3a
Benchmark                          Mode  Cnt         Score         Error  Units
Measurement.fiveWithDefault       thrpt    4   1713974.649 ±  807978.785  ops/s
Measurement.fiveWithoutDefault    thrpt    4   2402911.282 ±  116905.232  ops/s
Measurement.oneWithDefault        thrpt    4   5018352.177 ± 3050333.309  ops/s
Measurement.oneWithoutDefault     thrpt    4   6338081.662 ±  384633.083  ops/s
Measurement.twentyWithDefault     thrpt    4    506022.842 ±   28381.148  ops/s
Measurement.twentyWithoutDefault  thrpt    4    721987.591 ±   78852.517  ops/s
Measurement.zero                  thrpt    4  12727105.089 ±  188029.864  ops/s
after-zero-arg-opt2
Benchmark                          Mode  Cnt         Score         Error  Units
Measurement.fiveWithDefault       thrpt    4   1880153.629 ±  232291.891  ops/s
Measurement.fiveWithoutDefault    thrpt    4   2388820.000 ±   97785.718  ops/s
Measurement.oneWithDefault        thrpt    4   5276216.078 ±  327025.056  ops/s
Measurement.oneWithoutDefault     thrpt    4   6251850.821 ±  497046.067  ops/s
Measurement.twentyWithDefault     thrpt    4    516927.278 ±   13448.841  ops/s
Measurement.twentyWithoutDefault  thrpt    4    728619.156 ±   36904.744  ops/s
Measurement.zero                  thrpt    4  21774799.243 ± 1534253.147  ops/s
I agree that it is complicated.
I have pushed a fix.
68a9741
I did not benchmark the version with the one-parameter branch removed, but there is no reason for the scores to drop, since the benchmark targets are top-level functions.
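One rough way to judge whether two of the reported scores really differ is to check whether their Score ± Error ranges overlap. The sketch below assumes the Error column is the half-width of an interval around the score; Score and intervalsOverlap are hypothetical helpers for this illustration, and an overlap check is only a coarse heuristic, not a statistical test.

```kotlin
// Hypothetical helper types for reading the benchmark tables.
data class Score(val mean: Double, val error: Double) {
    val low: Double get() = mean - error
    val high: Double get() = mean + error
}

// True when the two [low, high] ranges share at least one point.
fun intervalsOverlap(a: Score, b: Score): Boolean =
    a.low <= b.high && b.low <= a.high
```

With the numbers quoted in this thread, the first oneWithoutDefault measurement (5654229.984 ± 5914104.417) overlaps the 1e1cd3a result (6338081.662 ± 384633.083), which is consistent with noise, while the zero results before and after the fast path (12727105.089 ± 188029.864 versus 21774799.243 ± 1534253.147) do not overlap.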
As a side note, I proposed the original form because a getter is a typical example of a function that takes only an instance as its argument.
I've merged all commits and slightly changed comments in 3393bb4. I'll push it to master after the build succeeds.
Thanks a lot!
#KT-55178 Fixed Co-authored-by: Alexander Udalov <[email protected]>
Merged to master. This change will be released as a part of Kotlin 1.8.20. Thank you for the contribution! And in particular, thanks for the extensive measurement of the effects of all the discussed changes. 👍
Performance improvements have been made to KCallableImpl.callDefaultMethod (KFunction.callBy).
Significance of this improvement
KFunction.callBy is almost always used when function calls are made using kotlin-reflect.
Improving its performance will improve the performance of any library that depends on kotlin-reflect.
Improvement Details
The contents of the array used for reflection calls are fixed for a given target function.
The current implementation, however, builds the arguments dynamically on every call.
In this PR, I have improved performance by caching as much of the processing as possible on the first run, and also by dropping the use of ArrayList.
Benchmark Results
I created a simple benchmark project to compare 1.7.0-RC with this change.
It measures calls to functions with 0, 1, 5, and 20 arguments, either relying on the default value for every argument or using no default arguments at all.
https://github.com/k163377/kfunction-call-by-benchmark
The results are as follows (only the order of the rows has been rearranged for readability).
Higher scores are better.
before
after
When all default arguments are used, there is a significant performance improvement.
When no default arguments are used, performance improved marginally, or at least did not degrade.
For no-argument function calls, performance tended to be worse.
I considered optimizing for no-argument functions, but decided against it for the following reasons