-
Notifications
You must be signed in to change notification settings - Fork 303
Align retry logic for all test framework instrumentations #8803
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Align retry logic for all test framework instrumentations #8803
Conversation
BenchmarksStartupParameters
See matching parameters
SummaryFound 0 performance improvements and 0 performance regressions! Performance is the same for 55 metrics, 16 unstable metrics. Startup time reports for insecure-bankgantt
title insecure-bank - global startup overhead: candidate=1.50.0-SNAPSHOT~3268b48ae9, baseline=1.50.0-SNAPSHOT~80eed8ea1f
dateFormat X
axisFormat %s
section tracing
Agent [baseline] (1.025 s) : 0, 1025008
Total [baseline] (8.693 s) : 0, 8692828
Agent [candidate] (1.027 s) : 0, 1026777
Total [candidate] (8.663 s) : 0, 8663356
section iast
Agent [baseline] (1.148 s) : 0, 1148325
Total [baseline] (9.22 s) : 0, 9219589
Agent [candidate] (1.158 s) : 0, 1158164
Total [candidate] (9.269 s) : 0, 9269372
section iast_HARDCODED_SECRET_DISABLED
Agent [baseline] (1.151 s) : 0, 1150816
Total [baseline] (9.226 s) : 0, 9225934
Agent [candidate] (1.157 s) : 0, 1157091
Total [candidate] (9.244 s) : 0, 9243831
section iast_TELEMETRY_OFF
Agent [baseline] (1.144 s) : 0, 1143534
Total [baseline] (9.268 s) : 0, 9268188
Agent [candidate] (1.144 s) : 0, 1144152
Total [candidate] (9.242 s) : 0, 9241685
gantt
title insecure-bank - break down per module: candidate=1.50.0-SNAPSHOT~3268b48ae9, baseline=1.50.0-SNAPSHOT~80eed8ea1f
dateFormat X
axisFormat %s
section tracing
BytebuddyAgent [baseline] (686.413 ms) : 0, 686413
BytebuddyAgent [candidate] (686.452 ms) : 0, 686452
GlobalTracer [baseline] (241.792 ms) : 0, 241792
GlobalTracer [candidate] (241.399 ms) : 0, 241399
AppSec [baseline] (55.67 ms) : 0, 55670
AppSec [candidate] (55.504 ms) : 0, 55504
Debugger [baseline] (7.617 ms) : 0, 7617
Debugger [candidate] (9.105 ms) : 0, 9105
Remote Config [baseline] (706.594 µs) : 0, 707
Remote Config [candidate] (707.266 µs) : 0, 707
Telemetry [baseline] (9.215 ms) : 0, 9215
Telemetry [candidate] (9.956 ms) : 0, 9956
section iast
BytebuddyAgent [baseline] (801.341 ms) : 0, 801341
BytebuddyAgent [candidate] (808.821 ms) : 0, 808821
GlobalTracer [baseline] (230.195 ms) : 0, 230195
GlobalTracer [candidate] (232.128 ms) : 0, 232128
IAST [baseline] (30.157 ms) : 0, 30157
IAST [candidate] (28.662 ms) : 0, 28662
AppSec [baseline] (48.727 ms) : 0, 48727
AppSec [candidate] (50.481 ms) : 0, 50481
Debugger [baseline] (5.915 ms) : 0, 5915
Debugger [candidate] (5.955 ms) : 0, 5955
Remote Config [baseline] (611.379 µs) : 0, 611
Remote Config [candidate] (606.116 µs) : 0, 606
Telemetry [baseline] (7.889 ms) : 0, 7889
Telemetry [candidate] (7.881 ms) : 0, 7881
section iast_HARDCODED_SECRET_DISABLED
BytebuddyAgent [baseline] (802.944 ms) : 0, 802944
BytebuddyAgent [candidate] (807.6 ms) : 0, 807600
GlobalTracer [baseline] (231.083 ms) : 0, 231083
GlobalTracer [candidate] (232.044 ms) : 0, 232044
IAST [baseline] (26.768 ms) : 0, 26768
IAST [candidate] (28.469 ms) : 0, 28469
AppSec [baseline] (52.095 ms) : 0, 52095
AppSec [candidate] (50.712 ms) : 0, 50712
Debugger [baseline] (5.89 ms) : 0, 5890
Debugger [candidate] (5.922 ms) : 0, 5922
Remote Config [baseline] (587.921 µs) : 0, 588
Remote Config [candidate] (600.377 µs) : 0, 600
Telemetry [baseline] (7.911 ms) : 0, 7911
Telemetry [candidate] (8.071 ms) : 0, 8071
section iast_TELEMETRY_OFF
BytebuddyAgent [baseline] (797.711 ms) : 0, 797711
BytebuddyAgent [candidate] (797.777 ms) : 0, 797777
GlobalTracer [baseline] (230.016 ms) : 0, 230016
GlobalTracer [candidate] (230.494 ms) : 0, 230494
IAST [baseline] (23.038 ms) : 0, 23038
IAST [candidate] (23.639 ms) : 0, 23639
AppSec [baseline] (54.964 ms) : 0, 54964
AppSec [candidate] (54.395 ms) : 0, 54395
Debugger [baseline] (5.96 ms) : 0, 5960
Debugger [candidate] (5.935 ms) : 0, 5935
Remote Config [baseline] (596.058 µs) : 0, 596
Remote Config [candidate] (603.442 µs) : 0, 603
Telemetry [baseline] (7.763 ms) : 0, 7763
Telemetry [candidate] (7.816 ms) : 0, 7816
Startup time reports for petclinicgantt
title petclinic - global startup overhead: candidate=1.50.0-SNAPSHOT~3268b48ae9, baseline=1.50.0-SNAPSHOT~80eed8ea1f
dateFormat X
axisFormat %s
section tracing
Agent [baseline] (1.021 s) : 0, 1020985
Total [baseline] (10.549 s) : 0, 10549096
Agent [candidate] (1.022 s) : 0, 1022325
Total [candidate] (10.501 s) : 0, 10500838
section appsec
Agent [baseline] (1.17 s) : 0, 1169540
Total [baseline] (10.753 s) : 0, 10753223
Agent [candidate] (1.165 s) : 0, 1164695
Total [candidate] (10.666 s) : 0, 10666285
section iast
Agent [baseline] (1.157 s) : 0, 1157260
Total [baseline] (10.942 s) : 0, 10942229
Agent [candidate] (1.151 s) : 0, 1151374
Total [candidate] (10.992 s) : 0, 10992161
section profiling
Agent [baseline] (1.28 s) : 0, 1280373
Total [baseline] (10.824 s) : 0, 10824065
Agent [candidate] (1.296 s) : 0, 1296438
Total [candidate] (10.875 s) : 0, 10874763
gantt
title petclinic - break down per module: candidate=1.50.0-SNAPSHOT~3268b48ae9, baseline=1.50.0-SNAPSHOT~80eed8ea1f
dateFormat X
axisFormat %s
section tracing
BytebuddyAgent [baseline] (683.779 ms) : 0, 683779
BytebuddyAgent [candidate] (685.949 ms) : 0, 685949
GlobalTracer [baseline] (240.251 ms) : 0, 240251
GlobalTracer [candidate] (239.997 ms) : 0, 239997
AppSec [baseline] (55.176 ms) : 0, 55176
AppSec [candidate] (55.235 ms) : 0, 55235
Debugger [baseline] (8.325 ms) : 0, 8325
Debugger [candidate] (7.635 ms) : 0, 7635
Remote Config [baseline] (696.902 µs) : 0, 697
Remote Config [candidate] (699.049 µs) : 0, 699
Telemetry [baseline] (9.172 ms) : 0, 9172
Telemetry [candidate] (9.135 ms) : 0, 9135
section appsec
BytebuddyAgent [baseline] (706.454 ms) : 0, 706454
BytebuddyAgent [candidate] (702.815 ms) : 0, 702815
GlobalTracer [baseline] (237.594 ms) : 0, 237594
GlobalTracer [candidate] (237.376 ms) : 0, 237376
AppSec [baseline] (176.632 ms) : 0, 176632
AppSec [candidate] (175.706 ms) : 0, 175706
Debugger [baseline] (5.965 ms) : 0, 5965
Debugger [candidate] (6.312 ms) : 0, 6312
Remote Config [baseline] (631.327 µs) : 0, 631
Remote Config [candidate] (625.295 µs) : 0, 625
Telemetry [baseline] (7.768 ms) : 0, 7768
Telemetry [candidate] (7.392 ms) : 0, 7392
IAST [baseline] (21.625 ms) : 0, 21625
IAST [candidate] (21.562 ms) : 0, 21562
section iast
BytebuddyAgent [baseline] (809.154 ms) : 0, 809154
BytebuddyAgent [candidate] (802.636 ms) : 0, 802636
GlobalTracer [baseline] (231.15 ms) : 0, 231150
GlobalTracer [candidate] (231.84 ms) : 0, 231840
AppSec [baseline] (49.647 ms) : 0, 49647
AppSec [candidate] (48.781 ms) : 0, 48781
Debugger [baseline] (5.951 ms) : 0, 5951
Debugger [candidate] (5.904 ms) : 0, 5904
Remote Config [baseline] (599.335 µs) : 0, 599
Remote Config [candidate] (598.826 µs) : 0, 599
Telemetry [baseline] (7.973 ms) : 0, 7973
Telemetry [candidate] (7.957 ms) : 0, 7957
IAST [baseline] (28.403 ms) : 0, 28403
IAST [candidate] (29.452 ms) : 0, 29452
section profiling
ProfilingAgent [baseline] (103.17 ms) : 0, 103170
ProfilingAgent [candidate] (104.786 ms) : 0, 104786
BytebuddyAgent [baseline] (673.493 ms) : 0, 673493
BytebuddyAgent [candidate] (683.203 ms) : 0, 683203
GlobalTracer [baseline] (375.579 ms) : 0, 375579
GlobalTracer [candidate] (378.876 ms) : 0, 378876
AppSec [baseline] (62.027 ms) : 0, 62027
AppSec [candidate] (62.415 ms) : 0, 62415
Debugger [baseline] (6.265 ms) : 0, 6265
Debugger [candidate] (6.411 ms) : 0, 6411
Remote Config [baseline] (663.154 µs) : 0, 663
Remote Config [candidate] (673.151 µs) : 0, 673
Telemetry [baseline] (8.21 ms) : 0, 8210
Telemetry [candidate] (8.324 ms) : 0, 8324
Profiling [baseline] (103.193 ms) : 0, 103193
Profiling [candidate] (104.81 ms) : 0, 104810
LoadParameters
See matching parameters
SummaryFound 0 performance improvements and 0 performance regressions! Performance is the same for 11 metrics, 19 unstable metrics. Request duration reports for petclinicgantt
title petclinic - request duration [CI 0.99] : candidate=1.50.0-SNAPSHOT~3268b48ae9, baseline=1.50.0-SNAPSHOT~80eed8ea1f
dateFormat X
axisFormat %s
section baseline
no_agent (1.36 ms) : 1340, 1379
. : milestone, 1360,
appsec (1.74 ms) : 1716, 1764
. : milestone, 1740,
appsec_no_iast (1.741 ms) : 1717, 1765
. : milestone, 1741,
code_origins (1.665 ms) : 1639, 1692
. : milestone, 1665,
iast (1.504 ms) : 1481, 1528
. : milestone, 1504,
profiling (1.493 ms) : 1470, 1516
. : milestone, 1493,
tracing (1.495 ms) : 1472, 1518
. : milestone, 1495,
section candidate
no_agent (1.362 ms) : 1343, 1381
. : milestone, 1362,
appsec (1.725 ms) : 1702, 1748
. : milestone, 1725,
appsec_no_iast (1.728 ms) : 1705, 1751
. : milestone, 1728,
code_origins (1.664 ms) : 1637, 1691
. : milestone, 1664,
iast (1.517 ms) : 1492, 1541
. : milestone, 1517,
profiling (1.518 ms) : 1495, 1542
. : milestone, 1518,
tracing (1.487 ms) : 1463, 1511
. : milestone, 1487,
Request duration reports for insecure-bankgantt
title insecure-bank - request duration [CI 0.99] : candidate=1.50.0-SNAPSHOT~3268b48ae9, baseline=1.50.0-SNAPSHOT~80eed8ea1f
dateFormat X
axisFormat %s
section baseline
no_agent (375.523 µs) : 355, 396
. : milestone, 376,
iast (513.972 µs) : 492, 536
. : milestone, 514,
iast_FULL (733.002 µs) : 711, 755
. : milestone, 733,
iast_GLOBAL (551.753 µs) : 530, 574
. : milestone, 552,
iast_HARDCODED_SECRET_DISABLED (510.909 µs) : 488, 534
. : milestone, 511,
iast_INACTIVE (466.673 µs) : 445, 489
. : milestone, 467,
iast_TELEMETRY_OFF (502.364 µs) : 478, 526
. : milestone, 502,
tracing (462.758 µs) : 441, 485
. : milestone, 463,
section candidate
no_agent (382.547 µs) : 363, 403
. : milestone, 383,
iast (515.463 µs) : 493, 538
. : milestone, 515,
iast_FULL (726.994 µs) : 705, 749
. : milestone, 727,
iast_GLOBAL (556.131 µs) : 535, 578
. : milestone, 556,
iast_HARDCODED_SECRET_DISABLED (523.493 µs) : 501, 546
. : milestone, 523,
iast_INACTIVE (467.57 µs) : 445, 491
. : milestone, 468,
iast_TELEMETRY_OFF (508.088 µs) : 485, 531
. : milestone, 508,
tracing (453.336 µs) : 432, 475
. : milestone, 453,
DacapoParameters
See matching parameters
SummaryFound 0 performance improvements and 0 performance regressions! Performance is the same for 11 metrics, 1 unstable metrics. Execution time for biojavagantt
title biojava - execution time [CI 0.99] : candidate=1.50.0-SNAPSHOT~3268b48ae9, baseline=1.50.0-SNAPSHOT~80eed8ea1f
dateFormat X
axisFormat %s
section baseline
no_agent (15.009 s) : 15009000, 15009000
. : milestone, 15009000,
appsec (15.013 s) : 15013000, 15013000
. : milestone, 15013000,
iast (18.298 s) : 18298000, 18298000
. : milestone, 18298000,
iast_GLOBAL (18.02 s) : 18020000, 18020000
. : milestone, 18020000,
profiling (15.062 s) : 15062000, 15062000
. : milestone, 15062000,
tracing (15.103 s) : 15103000, 15103000
. : milestone, 15103000,
section candidate
no_agent (15.462 s) : 15462000, 15462000
. : milestone, 15462000,
appsec (14.854 s) : 14854000, 14854000
. : milestone, 14854000,
iast (19.008 s) : 19008000, 19008000
. : milestone, 19008000,
iast_GLOBAL (18.266 s) : 18266000, 18266000
. : milestone, 18266000,
profiling (15.24 s) : 15240000, 15240000
. : milestone, 15240000,
tracing (14.752 s) : 14752000, 14752000
. : milestone, 14752000,
Execution time for tomcatgantt
title tomcat - execution time [CI 0.99] : candidate=1.50.0-SNAPSHOT~3268b48ae9, baseline=1.50.0-SNAPSHOT~80eed8ea1f
dateFormat X
axisFormat %s
section baseline
no_agent (1.467 ms) : 1455, 1478
. : milestone, 1467,
appsec (2.401 ms) : 2352, 2451
. : milestone, 2401,
iast (2.18 ms) : 2118, 2242
. : milestone, 2180,
iast_GLOBAL (2.229 ms) : 2167, 2292
. : milestone, 2229,
profiling (2.027 ms) : 1978, 2077
. : milestone, 2027,
tracing (2.004 ms) : 1956, 2052
. : milestone, 2004,
section candidate
no_agent (1.471 ms) : 1460, 1483
. : milestone, 1471,
appsec (2.398 ms) : 2349, 2447
. : milestone, 2398,
iast (2.183 ms) : 2121, 2244
. : milestone, 2183,
iast_GLOBAL (2.222 ms) : 2159, 2284
. : milestone, 2222,
profiling (2.489 ms) : 2316, 2661
. : milestone, 2489,
tracing (2.007 ms) : 1958, 2055
. : milestone, 2007,
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice, I like that it actually simplified the instrumentations (apart from Test NG), I was not expecting that!
Do I understand correctly, that Test NG was the only framework where we found problems, which is why test fixtures for the other instrumentations are not being updated?
@@ -3,6 +3,7 @@ | |||
import static datadog.trace.civisibility.Constants.CI_VISIBILITY_INSTRUMENTATION_NAME; | |||
|
|||
import datadog.trace.api.Config; | |||
import datadog.trace.api.civisibility.execution.TestStatus; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unused import?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The line change is due to a refactor. I moved the TestStatus
class to the API package to use it in TestExecutionHistory
, previously it was only internal
@@ -65,6 +68,9 @@ public class TestImpl implements DDTest { | |||
private final TestContext context; | |||
private final TestIdentifier identifier; | |||
|
|||
private Long startMicros; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can be a primitive, I think, to avoid unboxing
public void registerExecution(TestStatus status, long durationMillis) { | ||
++executions; | ||
successfulExecutionSeen |= (status != TestStatus.fail); | ||
if (executions > 1) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we not count the first execution here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From my understanding of datadog.trace.civisibility.domain.buildsystem.ProxyTestModuleTest
(and the equivalent test for headless modules), the global counter for ATR retries only takes into account actual retries, so we want to avoid increasing it after the first execution of a test. Should it take into account the first execution too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense. I'd say let's rename totalExecutions
to totalRetries
then and maybe leave a short comment
} | ||
|
||
private boolean currentExecutionIsLast() { | ||
// this could give false negatives if maxExecutions is updated after the current execution is | ||
// registered |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will it solve the false negative if we change the condition to executions >= maxExecutions - 1
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The issue is the order in which the check is done. Because this is performed before the execution is registered, we don't have a way of knowing if EFD will update the number of maximum retries, decreasing it under the value of executions
(and therefore making it a false negative after the fact)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On second thought, the method is only called inside of datadog.trace.civisibility.execution.RunNTimes#applicable
. Taking into account that now we do want the history to be available for every test execution, in order to correctly register them, it may not make sense anymore to consider the policy as "not applicable" on the last test execution. It might be better to always consider a policy as applicable if it involves retries then, at least while there are retries left. So the method could be removed all together
@@ -162,18 +162,6 @@ public TestExecutionHistory getExecutionHistory(TestIdentifier testIdentifier) { | |||
return executionPolicies.get(testIdentifier); | |||
} | |||
|
|||
@Nullable | |||
public TestExecutionHistory popExecutionHistory(TestIdentifier testIdentifier) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The intention with this method was to make the object GC-able when the corresponding test finished. Without this we'll keep filling the map with each new test which can create some GC pressure. What is it that prevents us from using this method with the new approach?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I didn't fully understand how the method was useful, as it was the only framework in which the applicable
method was being called repeatedly for every test execution. But we wouldn't have any issue with adding it back, taking into account the changes to considering a policy as applicable I mentioned in the previous comment. I'll do that then
if (retryAnalyzer instanceof RetryAnalyzer) { | ||
// If DD's retry analyzer is used, create the execution history and report the test result | ||
// beforehand to the tracing listener to align execution ordering with other frameworks. | ||
// Also avoids TestNG marking retried tests as skipped |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you also add a small note about the idempotency of invoking the listener multiple times, i.e. explaining that the framework will call our listener again after the retrier is invoked, and why this second call is going to be a no-op
Exactly, TestNG was the only framework which wasn't working as intended, and incorrectly tagging tests. The changes to the other frameworks were mainly done to simplify their logic and align them with the new ordering of retry actions. |
Datadog Summary✅ Code Quality ✅ Code Security Was this helpful? Give us feedback! |
What Does This Do
This PR aligns all the retry logic to be used in the same manner between all framework instrumentations. The logic flow for retries after the refactoring is:
datadog.trace.civisibility.events.TestEventsHandlerImpl#onTestFinish
. This will update the policy logic with the information about the test run to inform later retries.datadog.trace.api.civisibility.execution.TestExecutionHistory#wasLastExecution
. In cases where this is performed by the testing framework before reporting the test result, the test result is manually reported in order to register its execution beforehand.Incidentally, this refactoring also fixes another issue where
TestNG
was marking all retried tests as failed, independently of their result.Motivation
There were several issues with the previous implementation of the retry logic:
TestNG
framework had an ordering disparity with the rest of the frameworks, marking as last execution actually the second to last execution.Additional Notes
Contributor Checklist
type:
and (comp:
orinst:
) labels in addition to any usefull labelsclose
,fix
or any linking keywords when referencing an issue.Use
solves
instead, and assign the PR milestone to the issueJira ticket: [PROJ-IDENT]