Skip to content

Align retry logic for all test framework instrumentations #8803

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 11 commits into from
May 14, 2025

Conversation

daniel-mohedano
Copy link
Contributor

@daniel-mohedano daniel-mohedano commented May 12, 2025

What Does This Do

This PR aligns all the retry logic to be used in the same manner between all framework instrumentations. The logic flow for retries after the refactoring is:

  1. Check if the execution policy is applicable. This is normally only done once before the first test execution to avoid performance overhead if the policy is not going to be applicable for the test.
  2. Check if the failure for the following test execution should be suppressed or not. In the frameworks where this is used after test execution, the value is stored for later use.
  3. Register test execution in the history, currently performed in datadog.trace.civisibility.events.TestEventsHandlerImpl#onTestFinish. This will update the policy logic with the information about the test run to inform later retries.
  4. Check if the test should be retried with datadog.trace.api.civisibility.execution.TestExecutionHistory#wasLastExecution. In cases where this is performed by the testing framework before reporting the test result, the test result is manually reported in order to register its execution beforehand.

Incidentally, this refactoring also fixes another issue where TestNG was marking all retried tests as failed, independently of their result.

Motivation

There were several issues with the previous implementation of the retry logic:

  • The registration of test results in the history was performed after test reporting. EFD can change the maximum number of retries, creating certain edge cases where a test execution was not reported as the last one but was considered as such after the fact.
  • The TestNG framework had an ordering disparity with the rest of the frameworks, marking as last execution actually the second to last execution.

Additional Notes

Contributor Checklist

Jira ticket: [PROJ-IDENT]

@daniel-mohedano daniel-mohedano added type: refactoring comp: ci visibility Continuous Integration Visibility labels May 12, 2025
@pr-commenter
Copy link

pr-commenter bot commented May 12, 2025

Benchmarks

Startup

Parameters

Baseline Candidate
baseline_or_candidate baseline candidate
git_branch master daniel.mohedano/unify-retry-detection
git_commit_date 1747222567 1747224856
git_commit_sha 80eed8e 3268b48
release_version 1.50.0-SNAPSHOT~80eed8ea1f 1.50.0-SNAPSHOT~3268b48ae9
See matching parameters
Baseline Candidate
application insecure-bank insecure-bank
ci_job_date 1747227401 1747227401
ci_job_id 937673473 937673473
ci_pipeline_id 65057017 65057017
cpu_model Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
kernel_version Linux runner-n8gyx9gk-project-304-concurrent-0-upyoqhpw 6.8.0-1028-aws #30~22.04.1-Ubuntu SMP Sun Apr 20 06:03:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux Linux runner-n8gyx9gk-project-304-concurrent-0-upyoqhpw 6.8.0-1028-aws #30~22.04.1-Ubuntu SMP Sun Apr 20 06:03:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
module Agent Agent
parent None None
variant iast iast

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 55 metrics, 16 unstable metrics.

Startup time reports for insecure-bank
gantt
    title insecure-bank - global startup overhead: candidate=1.50.0-SNAPSHOT~3268b48ae9, baseline=1.50.0-SNAPSHOT~80eed8ea1f

    dateFormat X
    axisFormat %s
section tracing
Agent [baseline] (1.025 s) : 0, 1025008
Total [baseline] (8.693 s) : 0, 8692828
Agent [candidate] (1.027 s) : 0, 1026777
Total [candidate] (8.663 s) : 0, 8663356
section iast
Agent [baseline] (1.148 s) : 0, 1148325
Total [baseline] (9.22 s) : 0, 9219589
Agent [candidate] (1.158 s) : 0, 1158164
Total [candidate] (9.269 s) : 0, 9269372
section iast_HARDCODED_SECRET_DISABLED
Agent [baseline] (1.151 s) : 0, 1150816
Total [baseline] (9.226 s) : 0, 9225934
Agent [candidate] (1.157 s) : 0, 1157091
Total [candidate] (9.244 s) : 0, 9243831
section iast_TELEMETRY_OFF
Agent [baseline] (1.144 s) : 0, 1143534
Total [baseline] (9.268 s) : 0, 9268188
Agent [candidate] (1.144 s) : 0, 1144152
Total [candidate] (9.242 s) : 0, 9241685
Loading
  • baseline results
Module Variant Duration Δ tracing
Agent tracing 1.025 s -
Agent iast 1.148 s 123.317 ms (12.0%)
Agent iast_HARDCODED_SECRET_DISABLED 1.151 s 125.809 ms (12.3%)
Agent iast_TELEMETRY_OFF 1.144 s 118.526 ms (11.6%)
Total tracing 8.693 s -
Total iast 9.22 s 526.761 ms (6.1%)
Total iast_HARDCODED_SECRET_DISABLED 9.226 s 533.106 ms (6.1%)
Total iast_TELEMETRY_OFF 9.268 s 575.36 ms (6.6%)
  • candidate results
Module Variant Duration Δ tracing
Agent tracing 1.027 s -
Agent iast 1.158 s 131.387 ms (12.8%)
Agent iast_HARDCODED_SECRET_DISABLED 1.157 s 130.314 ms (12.7%)
Agent iast_TELEMETRY_OFF 1.144 s 117.375 ms (11.4%)
Total tracing 8.663 s -
Total iast 9.269 s 606.016 ms (7.0%)
Total iast_HARDCODED_SECRET_DISABLED 9.244 s 580.475 ms (6.7%)
Total iast_TELEMETRY_OFF 9.242 s 578.329 ms (6.7%)
gantt
    title insecure-bank - break down per module: candidate=1.50.0-SNAPSHOT~3268b48ae9, baseline=1.50.0-SNAPSHOT~80eed8ea1f

    dateFormat X
    axisFormat %s
section tracing
BytebuddyAgent [baseline] (686.413 ms) : 0, 686413
BytebuddyAgent [candidate] (686.452 ms) : 0, 686452
GlobalTracer [baseline] (241.792 ms) : 0, 241792
GlobalTracer [candidate] (241.399 ms) : 0, 241399
AppSec [baseline] (55.67 ms) : 0, 55670
AppSec [candidate] (55.504 ms) : 0, 55504
Debugger [baseline] (7.617 ms) : 0, 7617
Debugger [candidate] (9.105 ms) : 0, 9105
Remote Config [baseline] (706.594 µs) : 0, 707
Remote Config [candidate] (707.266 µs) : 0, 707
Telemetry [baseline] (9.215 ms) : 0, 9215
Telemetry [candidate] (9.956 ms) : 0, 9956
section iast
BytebuddyAgent [baseline] (801.341 ms) : 0, 801341
BytebuddyAgent [candidate] (808.821 ms) : 0, 808821
GlobalTracer [baseline] (230.195 ms) : 0, 230195
GlobalTracer [candidate] (232.128 ms) : 0, 232128
IAST [baseline] (30.157 ms) : 0, 30157
IAST [candidate] (28.662 ms) : 0, 28662
AppSec [baseline] (48.727 ms) : 0, 48727
AppSec [candidate] (50.481 ms) : 0, 50481
Debugger [baseline] (5.915 ms) : 0, 5915
Debugger [candidate] (5.955 ms) : 0, 5955
Remote Config [baseline] (611.379 µs) : 0, 611
Remote Config [candidate] (606.116 µs) : 0, 606
Telemetry [baseline] (7.889 ms) : 0, 7889
Telemetry [candidate] (7.881 ms) : 0, 7881
section iast_HARDCODED_SECRET_DISABLED
BytebuddyAgent [baseline] (802.944 ms) : 0, 802944
BytebuddyAgent [candidate] (807.6 ms) : 0, 807600
GlobalTracer [baseline] (231.083 ms) : 0, 231083
GlobalTracer [candidate] (232.044 ms) : 0, 232044
IAST [baseline] (26.768 ms) : 0, 26768
IAST [candidate] (28.469 ms) : 0, 28469
AppSec [baseline] (52.095 ms) : 0, 52095
AppSec [candidate] (50.712 ms) : 0, 50712
Debugger [baseline] (5.89 ms) : 0, 5890
Debugger [candidate] (5.922 ms) : 0, 5922
Remote Config [baseline] (587.921 µs) : 0, 588
Remote Config [candidate] (600.377 µs) : 0, 600
Telemetry [baseline] (7.911 ms) : 0, 7911
Telemetry [candidate] (8.071 ms) : 0, 8071
section iast_TELEMETRY_OFF
BytebuddyAgent [baseline] (797.711 ms) : 0, 797711
BytebuddyAgent [candidate] (797.777 ms) : 0, 797777
GlobalTracer [baseline] (230.016 ms) : 0, 230016
GlobalTracer [candidate] (230.494 ms) : 0, 230494
IAST [baseline] (23.038 ms) : 0, 23038
IAST [candidate] (23.639 ms) : 0, 23639
AppSec [baseline] (54.964 ms) : 0, 54964
AppSec [candidate] (54.395 ms) : 0, 54395
Debugger [baseline] (5.96 ms) : 0, 5960
Debugger [candidate] (5.935 ms) : 0, 5935
Remote Config [baseline] (596.058 µs) : 0, 596
Remote Config [candidate] (603.442 µs) : 0, 603
Telemetry [baseline] (7.763 ms) : 0, 7763
Telemetry [candidate] (7.816 ms) : 0, 7816
Loading
Startup time reports for petclinic
gantt
    title petclinic - global startup overhead: candidate=1.50.0-SNAPSHOT~3268b48ae9, baseline=1.50.0-SNAPSHOT~80eed8ea1f

    dateFormat X
    axisFormat %s
section tracing
Agent [baseline] (1.021 s) : 0, 1020985
Total [baseline] (10.549 s) : 0, 10549096
Agent [candidate] (1.022 s) : 0, 1022325
Total [candidate] (10.501 s) : 0, 10500838
section appsec
Agent [baseline] (1.17 s) : 0, 1169540
Total [baseline] (10.753 s) : 0, 10753223
Agent [candidate] (1.165 s) : 0, 1164695
Total [candidate] (10.666 s) : 0, 10666285
section iast
Agent [baseline] (1.157 s) : 0, 1157260
Total [baseline] (10.942 s) : 0, 10942229
Agent [candidate] (1.151 s) : 0, 1151374
Total [candidate] (10.992 s) : 0, 10992161
section profiling
Agent [baseline] (1.28 s) : 0, 1280373
Total [baseline] (10.824 s) : 0, 10824065
Agent [candidate] (1.296 s) : 0, 1296438
Total [candidate] (10.875 s) : 0, 10874763
Loading
  • baseline results
Module Variant Duration Δ tracing
Agent tracing 1.021 s -
Agent appsec 1.17 s 148.556 ms (14.6%)
Agent iast 1.157 s 136.276 ms (13.3%)
Agent profiling 1.28 s 259.388 ms (25.4%)
Total tracing 10.549 s -
Total appsec 10.753 s 204.127 ms (1.9%)
Total iast 10.942 s 393.133 ms (3.7%)
Total profiling 10.824 s 274.969 ms (2.6%)
  • candidate results
Module Variant Duration Δ tracing
Agent tracing 1.022 s -
Agent appsec 1.165 s 142.37 ms (13.9%)
Agent iast 1.151 s 129.048 ms (12.6%)
Agent profiling 1.296 s 274.112 ms (26.8%)
Total tracing 10.501 s -
Total appsec 10.666 s 165.447 ms (1.6%)
Total iast 10.992 s 491.323 ms (4.7%)
Total profiling 10.875 s 373.924 ms (3.6%)
gantt
    title petclinic - break down per module: candidate=1.50.0-SNAPSHOT~3268b48ae9, baseline=1.50.0-SNAPSHOT~80eed8ea1f

    dateFormat X
    axisFormat %s
section tracing
BytebuddyAgent [baseline] (683.779 ms) : 0, 683779
BytebuddyAgent [candidate] (685.949 ms) : 0, 685949
GlobalTracer [baseline] (240.251 ms) : 0, 240251
GlobalTracer [candidate] (239.997 ms) : 0, 239997
AppSec [baseline] (55.176 ms) : 0, 55176
AppSec [candidate] (55.235 ms) : 0, 55235
Debugger [baseline] (8.325 ms) : 0, 8325
Debugger [candidate] (7.635 ms) : 0, 7635
Remote Config [baseline] (696.902 µs) : 0, 697
Remote Config [candidate] (699.049 µs) : 0, 699
Telemetry [baseline] (9.172 ms) : 0, 9172
Telemetry [candidate] (9.135 ms) : 0, 9135
section appsec
BytebuddyAgent [baseline] (706.454 ms) : 0, 706454
BytebuddyAgent [candidate] (702.815 ms) : 0, 702815
GlobalTracer [baseline] (237.594 ms) : 0, 237594
GlobalTracer [candidate] (237.376 ms) : 0, 237376
AppSec [baseline] (176.632 ms) : 0, 176632
AppSec [candidate] (175.706 ms) : 0, 175706
Debugger [baseline] (5.965 ms) : 0, 5965
Debugger [candidate] (6.312 ms) : 0, 6312
Remote Config [baseline] (631.327 µs) : 0, 631
Remote Config [candidate] (625.295 µs) : 0, 625
Telemetry [baseline] (7.768 ms) : 0, 7768
Telemetry [candidate] (7.392 ms) : 0, 7392
IAST [baseline] (21.625 ms) : 0, 21625
IAST [candidate] (21.562 ms) : 0, 21562
section iast
BytebuddyAgent [baseline] (809.154 ms) : 0, 809154
BytebuddyAgent [candidate] (802.636 ms) : 0, 802636
GlobalTracer [baseline] (231.15 ms) : 0, 231150
GlobalTracer [candidate] (231.84 ms) : 0, 231840
AppSec [baseline] (49.647 ms) : 0, 49647
AppSec [candidate] (48.781 ms) : 0, 48781
Debugger [baseline] (5.951 ms) : 0, 5951
Debugger [candidate] (5.904 ms) : 0, 5904
Remote Config [baseline] (599.335 µs) : 0, 599
Remote Config [candidate] (598.826 µs) : 0, 599
Telemetry [baseline] (7.973 ms) : 0, 7973
Telemetry [candidate] (7.957 ms) : 0, 7957
IAST [baseline] (28.403 ms) : 0, 28403
IAST [candidate] (29.452 ms) : 0, 29452
section profiling
ProfilingAgent [baseline] (103.17 ms) : 0, 103170
ProfilingAgent [candidate] (104.786 ms) : 0, 104786
BytebuddyAgent [baseline] (673.493 ms) : 0, 673493
BytebuddyAgent [candidate] (683.203 ms) : 0, 683203
GlobalTracer [baseline] (375.579 ms) : 0, 375579
GlobalTracer [candidate] (378.876 ms) : 0, 378876
AppSec [baseline] (62.027 ms) : 0, 62027
AppSec [candidate] (62.415 ms) : 0, 62415
Debugger [baseline] (6.265 ms) : 0, 6265
Debugger [candidate] (6.411 ms) : 0, 6411
Remote Config [baseline] (663.154 µs) : 0, 663
Remote Config [candidate] (673.151 µs) : 0, 673
Telemetry [baseline] (8.21 ms) : 0, 8210
Telemetry [candidate] (8.324 ms) : 0, 8324
Profiling [baseline] (103.193 ms) : 0, 103193
Profiling [candidate] (104.81 ms) : 0, 104810
Loading

Load

Parameters

Baseline Candidate
baseline_or_candidate baseline candidate
end_time 2025-05-14T12:24:45 2025-05-14T12:32:29
git_branch master daniel.mohedano/unify-retry-detection
git_commit_date 1747222567 1747224856
git_commit_sha 80eed8e 3268b48
release_version 1.50.0-SNAPSHOT~80eed8ea1f 1.50.0-SNAPSHOT~3268b48ae9
start_time 2025-05-14T12:24:31 2025-05-14T12:32:15
See matching parameters
Baseline Candidate
application insecure-bank insecure-bank
ci_job_date 1747226345 1747226345
ci_job_id 937673474 937673474
ci_pipeline_id 65057017 65057017
cpu_model Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
kernel_version Linux runner-xnwlpqwj-project-304-concurrent-0-d9ms5rtk 6.8.0-1028-aws #30~22.04.1-Ubuntu SMP Sun Apr 20 06:03:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux Linux runner-xnwlpqwj-project-304-concurrent-0-d9ms5rtk 6.8.0-1028-aws #30~22.04.1-Ubuntu SMP Sun Apr 20 06:03:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
variant iast iast

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 11 metrics, 19 unstable metrics.

Request duration reports for petclinic
gantt
    title petclinic - request duration [CI 0.99] : candidate=1.50.0-SNAPSHOT~3268b48ae9, baseline=1.50.0-SNAPSHOT~80eed8ea1f
    dateFormat X
    axisFormat %s
section baseline
no_agent (1.36 ms) : 1340, 1379
.   : milestone, 1360,
appsec (1.74 ms) : 1716, 1764
.   : milestone, 1740,
appsec_no_iast (1.741 ms) : 1717, 1765
.   : milestone, 1741,
code_origins (1.665 ms) : 1639, 1692
.   : milestone, 1665,
iast (1.504 ms) : 1481, 1528
.   : milestone, 1504,
profiling (1.493 ms) : 1470, 1516
.   : milestone, 1493,
tracing (1.495 ms) : 1472, 1518
.   : milestone, 1495,
section candidate
no_agent (1.362 ms) : 1343, 1381
.   : milestone, 1362,
appsec (1.725 ms) : 1702, 1748
.   : milestone, 1725,
appsec_no_iast (1.728 ms) : 1705, 1751
.   : milestone, 1728,
code_origins (1.664 ms) : 1637, 1691
.   : milestone, 1664,
iast (1.517 ms) : 1492, 1541
.   : milestone, 1517,
profiling (1.518 ms) : 1495, 1542
.   : milestone, 1518,
tracing (1.487 ms) : 1463, 1511
.   : milestone, 1487,
Loading
  • baseline results
Variant Request duration [CI 0.99] Δ no_agent
no_agent 1.36 ms [1.34 ms, 1.379 ms] -
appsec 1.74 ms [1.716 ms, 1.764 ms] 380.599 µs (28.0%)
appsec_no_iast 1.741 ms [1.717 ms, 1.765 ms] 381.799 µs (28.1%)
code_origins 1.665 ms [1.639 ms, 1.692 ms] 305.626 µs (22.5%)
iast 1.504 ms [1.481 ms, 1.528 ms] 144.836 µs (10.7%)
profiling 1.493 ms [1.47 ms, 1.516 ms] 133.693 µs (9.8%)
tracing 1.495 ms [1.472 ms, 1.518 ms] 135.22 µs (9.9%)
  • candidate results
Variant Request duration [CI 0.99] Δ no_agent
no_agent 1.362 ms [1.343 ms, 1.381 ms] -
appsec 1.725 ms [1.702 ms, 1.748 ms] 362.561 µs (26.6%)
appsec_no_iast 1.728 ms [1.705 ms, 1.751 ms] 365.669 µs (26.8%)
code_origins 1.664 ms [1.637 ms, 1.691 ms] 301.87 µs (22.2%)
iast 1.517 ms [1.492 ms, 1.541 ms] 154.544 µs (11.3%)
profiling 1.518 ms [1.495 ms, 1.542 ms] 156.183 µs (11.5%)
tracing 1.487 ms [1.463 ms, 1.511 ms] 124.902 µs (9.2%)
Request duration reports for insecure-bank
gantt
    title insecure-bank - request duration [CI 0.99] : candidate=1.50.0-SNAPSHOT~3268b48ae9, baseline=1.50.0-SNAPSHOT~80eed8ea1f
    dateFormat X
    axisFormat %s
section baseline
no_agent (375.523 µs) : 355, 396
.   : milestone, 376,
iast (513.972 µs) : 492, 536
.   : milestone, 514,
iast_FULL (733.002 µs) : 711, 755
.   : milestone, 733,
iast_GLOBAL (551.753 µs) : 530, 574
.   : milestone, 552,
iast_HARDCODED_SECRET_DISABLED (510.909 µs) : 488, 534
.   : milestone, 511,
iast_INACTIVE (466.673 µs) : 445, 489
.   : milestone, 467,
iast_TELEMETRY_OFF (502.364 µs) : 478, 526
.   : milestone, 502,
tracing (462.758 µs) : 441, 485
.   : milestone, 463,
section candidate
no_agent (382.547 µs) : 363, 403
.   : milestone, 383,
iast (515.463 µs) : 493, 538
.   : milestone, 515,
iast_FULL (726.994 µs) : 705, 749
.   : milestone, 727,
iast_GLOBAL (556.131 µs) : 535, 578
.   : milestone, 556,
iast_HARDCODED_SECRET_DISABLED (523.493 µs) : 501, 546
.   : milestone, 523,
iast_INACTIVE (467.57 µs) : 445, 491
.   : milestone, 468,
iast_TELEMETRY_OFF (508.088 µs) : 485, 531
.   : milestone, 508,
tracing (453.336 µs) : 432, 475
.   : milestone, 453,
Loading
  • baseline results
Variant Request duration [CI 0.99] Δ no_agent
no_agent 375.523 µs [355.158 µs, 395.888 µs] -
iast 513.972 µs [492.052 µs, 535.892 µs] 138.45 µs (36.9%)
iast_FULL 733.002 µs [711.063 µs, 754.941 µs] 357.479 µs (95.2%)
iast_GLOBAL 551.753 µs [529.996 µs, 573.511 µs] 176.231 µs (46.9%)
iast_HARDCODED_SECRET_DISABLED 510.909 µs [488.035 µs, 533.782 µs] 135.386 µs (36.1%)
iast_INACTIVE 466.673 µs [444.506 µs, 488.839 µs] 91.15 µs (24.3%)
iast_TELEMETRY_OFF 502.364 µs [478.254 µs, 526.474 µs] 126.841 µs (33.8%)
tracing 462.758 µs [440.557 µs, 484.959 µs] 87.235 µs (23.2%)
  • candidate results
Variant Request duration [CI 0.99] Δ no_agent
no_agent 382.547 µs [362.557 µs, 402.537 µs] -
iast 515.463 µs [493.345 µs, 537.581 µs] 132.916 µs (34.7%)
iast_FULL 726.994 µs [705.21 µs, 748.778 µs] 344.447 µs (90.0%)
iast_GLOBAL 556.131 µs [534.665 µs, 577.597 µs] 173.584 µs (45.4%)
iast_HARDCODED_SECRET_DISABLED 523.493 µs [500.744 µs, 546.242 µs] 140.946 µs (36.8%)
iast_INACTIVE 467.57 µs [444.574 µs, 490.566 µs] 85.023 µs (22.2%)
iast_TELEMETRY_OFF 508.088 µs [484.848 µs, 531.328 µs] 125.541 µs (32.8%)
tracing 453.336 µs [431.71 µs, 474.961 µs] 70.789 µs (18.5%)

Dacapo

Parameters

Baseline Candidate
baseline_or_candidate baseline candidate
git_branch master daniel.mohedano/unify-retry-detection
git_commit_date 1747222567 1747224856
git_commit_sha 80eed8e 3268b48
release_version 1.50.0-SNAPSHOT~80eed8ea1f 1.50.0-SNAPSHOT~3268b48ae9
See matching parameters
Baseline Candidate
application biojava biojava
ci_job_date 1747227012 1747227012
ci_job_id 937673475 937673475
ci_pipeline_id 65057017 65057017
cpu_model Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
kernel_version Linux runner-5esuzn9-project-304-concurrent-0-2jjql5bz 6.8.0-1028-aws #30~22.04.1-Ubuntu SMP Sun Apr 20 06:03:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux Linux runner-5esuzn9-project-304-concurrent-0-2jjql5bz 6.8.0-1028-aws #30~22.04.1-Ubuntu SMP Sun Apr 20 06:03:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
variant appsec appsec

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 11 metrics, 1 unstable metrics.

Execution time for biojava
gantt
    title biojava - execution time [CI 0.99] : candidate=1.50.0-SNAPSHOT~3268b48ae9, baseline=1.50.0-SNAPSHOT~80eed8ea1f
    dateFormat X
    axisFormat %s
section baseline
no_agent (15.009 s) : 15009000, 15009000
.   : milestone, 15009000,
appsec (15.013 s) : 15013000, 15013000
.   : milestone, 15013000,
iast (18.298 s) : 18298000, 18298000
.   : milestone, 18298000,
iast_GLOBAL (18.02 s) : 18020000, 18020000
.   : milestone, 18020000,
profiling (15.062 s) : 15062000, 15062000
.   : milestone, 15062000,
tracing (15.103 s) : 15103000, 15103000
.   : milestone, 15103000,
section candidate
no_agent (15.462 s) : 15462000, 15462000
.   : milestone, 15462000,
appsec (14.854 s) : 14854000, 14854000
.   : milestone, 14854000,
iast (19.008 s) : 19008000, 19008000
.   : milestone, 19008000,
iast_GLOBAL (18.266 s) : 18266000, 18266000
.   : milestone, 18266000,
profiling (15.24 s) : 15240000, 15240000
.   : milestone, 15240000,
tracing (14.752 s) : 14752000, 14752000
.   : milestone, 14752000,
Loading
  • baseline results
Variant Execution Time [CI 0.99] Δ no_agent
no_agent 15.009 s [15.009 s, 15.009 s] -
appsec 15.013 s [15.013 s, 15.013 s] 4.0 ms (0.0%)
iast 18.298 s [18.298 s, 18.298 s] 3.289 s (21.9%)
iast_GLOBAL 18.02 s [18.02 s, 18.02 s] 3.011 s (20.1%)
profiling 15.062 s [15.062 s, 15.062 s] 53.0 ms (0.4%)
tracing 15.103 s [15.103 s, 15.103 s] 94.0 ms (0.6%)
  • candidate results
Variant Execution Time [CI 0.99] Δ no_agent
no_agent 15.462 s [15.462 s, 15.462 s] -
appsec 14.854 s [14.854 s, 14.854 s] -608.0 ms (-3.9%)
iast 19.008 s [19.008 s, 19.008 s] 3.546 s (22.9%)
iast_GLOBAL 18.266 s [18.266 s, 18.266 s] 2.804 s (18.1%)
profiling 15.24 s [15.24 s, 15.24 s] -222.0 ms (-1.4%)
tracing 14.752 s [14.752 s, 14.752 s] -710.0 ms (-4.6%)
Execution time for tomcat
gantt
    title tomcat - execution time [CI 0.99] : candidate=1.50.0-SNAPSHOT~3268b48ae9, baseline=1.50.0-SNAPSHOT~80eed8ea1f
    dateFormat X
    axisFormat %s
section baseline
no_agent (1.467 ms) : 1455, 1478
.   : milestone, 1467,
appsec (2.401 ms) : 2352, 2451
.   : milestone, 2401,
iast (2.18 ms) : 2118, 2242
.   : milestone, 2180,
iast_GLOBAL (2.229 ms) : 2167, 2292
.   : milestone, 2229,
profiling (2.027 ms) : 1978, 2077
.   : milestone, 2027,
tracing (2.004 ms) : 1956, 2052
.   : milestone, 2004,
section candidate
no_agent (1.471 ms) : 1460, 1483
.   : milestone, 1471,
appsec (2.398 ms) : 2349, 2447
.   : milestone, 2398,
iast (2.183 ms) : 2121, 2244
.   : milestone, 2183,
iast_GLOBAL (2.222 ms) : 2159, 2284
.   : milestone, 2222,
profiling (2.489 ms) : 2316, 2661
.   : milestone, 2489,
tracing (2.007 ms) : 1958, 2055
.   : milestone, 2007,
Loading
  • baseline results
Variant Execution Time [CI 0.99] Δ no_agent
no_agent 1.467 ms [1.455 ms, 1.478 ms] -
appsec 2.401 ms [2.352 ms, 2.451 ms] 934.829 µs (63.7%)
iast 2.18 ms [2.118 ms, 2.242 ms] 713.153 µs (48.6%)
iast_GLOBAL 2.229 ms [2.167 ms, 2.292 ms] 762.848 µs (52.0%)
profiling 2.027 ms [1.978 ms, 2.077 ms] 560.84 µs (38.2%)
tracing 2.004 ms [1.956 ms, 2.052 ms] 537.644 µs (36.7%)
  • candidate results
Variant Execution Time [CI 0.99] Δ no_agent
no_agent 1.471 ms [1.46 ms, 1.483 ms] -
appsec 2.398 ms [2.349 ms, 2.447 ms] 926.873 µs (63.0%)
iast 2.183 ms [2.121 ms, 2.244 ms] 711.339 µs (48.4%)
iast_GLOBAL 2.222 ms [2.159 ms, 2.284 ms] 750.306 µs (51.0%)
profiling 2.489 ms [2.316 ms, 2.661 ms] 1.017 ms (69.2%)
tracing 2.007 ms [1.958 ms, 2.055 ms] 535.424 µs (36.4%)

Copy link
Contributor

@nikita-tkachenko-datadog nikita-tkachenko-datadog left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, I like that it actually simplified the instrumentations (apart from Test NG), I was not expecting that!
Do I understand correctly, that Test NG was the only framework where we found problems, which is why test fixtures for the other instrumentations are not being updated?

@@ -3,6 +3,7 @@
import static datadog.trace.civisibility.Constants.CI_VISIBILITY_INSTRUMENTATION_NAME;

import datadog.trace.api.Config;
import datadog.trace.api.civisibility.execution.TestStatus;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unused import?

Copy link
Contributor Author

@daniel-mohedano daniel-mohedano May 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The line change is due to a refactor. I moved the TestStatus class to the API package to use it in TestExecutionHistory, previously it was only internal

@@ -65,6 +68,9 @@ public class TestImpl implements DDTest {
private final TestContext context;
private final TestIdentifier identifier;

private Long startMicros;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can be a primitive, I think, to avoid unboxing

public void registerExecution(TestStatus status, long durationMillis) {
++executions;
successfulExecutionSeen |= (status != TestStatus.fail);
if (executions > 1) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we not count the first execution here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From my understanding of datadog.trace.civisibility.domain.buildsystem.ProxyTestModuleTest (and the equivalent test for headless modules), the global counter for ATR retries only takes into account actual retries, so we want to avoid increasing it after the first execution of a test. Should it take into account the first execution too?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. I'd say let's rename totalExecutions to totalRetries then and maybe leave a short comment

}

private boolean currentExecutionIsLast() {
// this could give false negatives if maxExecutions is updated after the current execution is
// registered
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will it solve the false negative if we change the condition to executions >= maxExecutions - 1?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The issue is the order in which the check is done. Because this is performed before the execution is registered, we don't have a way of knowing if EFD will update the number of maximum retries, decreasing it under the value of executions (and therefore making it a false negative after the fact)

Copy link
Contributor Author

@daniel-mohedano daniel-mohedano May 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On second thought, the method is only called inside of datadog.trace.civisibility.execution.RunNTimes#applicable. Taking into account that now we do want the history to be available for every test execution, in order to correctly register them, it may not make sense anymore to consider the policy as "not applicable" on the last test execution. It might be better to always consider a policy as applicable if it involves retries then, at least while there are retries left. So the method could be removed all together

@@ -162,18 +162,6 @@ public TestExecutionHistory getExecutionHistory(TestIdentifier testIdentifier) {
return executionPolicies.get(testIdentifier);
}

@Nullable
public TestExecutionHistory popExecutionHistory(TestIdentifier testIdentifier) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The intention with this method was to make the object GC-able when the corresponding test finished. Without this we'll keep filling the map with each new test which can create some GC pressure. What is it that prevents us from using this method with the new approach?

Copy link
Contributor Author

@daniel-mohedano daniel-mohedano May 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I didn't fully understand how the method was useful, as it was the only framework in which the applicable method was being called repeatedly for every test execution. But we wouldn't have any issue with adding it back, taking into account the changes to considering a policy as applicable I mentioned in the previous comment. I'll do that then

if (retryAnalyzer instanceof RetryAnalyzer) {
// If DD's retry analyzer is used, create the execution history and report the test result
// beforehand to the tracing listener to align execution ordering with other frameworks.
// Also avoids TestNG marking retried tests as skipped
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you also add a small note about the idempotency of invoking the listener multiple times, i.e. explaining that the framework will call our listener again after the retrier is invoked, and why this second call is going to be a no-op

@daniel-mohedano
Copy link
Contributor Author

daniel-mohedano commented May 13, 2025

Do I understand correctly, that Test NG was the only framework where we found problems, which is why test fixtures for the other instrumentations are not being updated?

Exactly, TestNG was the only framework which wasn't working as intended, and incorrectly tagging tests. The changes to the other frameworks were mainly done to simplify their logic and align them with the new ordering of retry actions.

@datadog-datadog-prod-us1
Copy link
Contributor

Datadog Summary

✅ Code Quality    ✅ Code Security


Was this helpful? Give us feedback!

@daniel-mohedano daniel-mohedano marked this pull request as ready for review May 14, 2025 13:26
@daniel-mohedano daniel-mohedano requested review from a team as code owners May 14, 2025 13:26
@daniel-mohedano daniel-mohedano merged commit 7c80dbe into master May 14, 2025
591 of 592 checks passed
@daniel-mohedano daniel-mohedano deleted the daniel.mohedano/unify-retry-detection branch May 14, 2025 13:27
@github-actions github-actions bot added this to the 1.50.0 milestone May 14, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
comp: ci visibility Continuous Integration Visibility type: refactoring
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants