ci: Add test retry logic for flaky tests #9218
Conversation
Thanks for opening this pull request!
Codecov Report: All modified and coverable lines are covered by tests ✅

@@ Coverage Diff @@
##            alpha    #9218   +/-   ##
=======================================
  Coverage   93.76%   93.76%
=======================================
  Files         184      184
  Lines       14715    14715
=======================================
  Hits        13797    13797
  Misses        918      918

☔ View full report in Codecov by Sentry.
This reverts commit f221852.
Could you explain how this works?
Override the Jasmine Spec class and replace the original test function with one that retries the original test function on failure.
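The retry idea described in that comment can be sketched as a standalone helper. This is a hypothetical illustration of the mechanism, not the PR's actual code; the name `retryOnFailure` and the attempt limit are made up:

```javascript
// Minimal sketch of the retry idea: wrap a test function so that a failure
// triggers a re-run, up to a fixed number of attempts. In the PR this
// wrapping happens inside the overridden Jasmine Spec class, so individual
// specs don't need to change.
async function retryOnFailure(testFn, maxAttempts = 3) {
  let lastError;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      // If the test resolves, return immediately; no further attempts.
      return await testFn();
    } catch (e) {
      lastError = e; // remember the most recent failure
    }
  }
  // All attempts failed; surface the last error as the spec failure.
  throw lastError;
}
```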
How are flaky tests identified, so they get fixed at some point? What's the purpose of […]? What's the implication of "Identify test by name"? What happens if 2 tests have the same name?
From this point forward, any test that randomly fails for no reason is a flaky test. If you find such a test, add it to flakyTest.json.
Any test with that name will get retried if it fails. The only implication is if the name gets changed, and if the name gets changed, hopefully they are fixing the flaky test. I could set it up to auto-retry any failed test; the only problem with that is local development, but I can set it up to retry on CI only. @mtrezza You are the main reviewer, so what's easier for you? Do you currently keep track of randomly failing tests, or just rerun every time?
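Since tests are identified by name, the list file plausibly holds a flat array of spec names. A hypothetical example of what a flakyTest.json entry could look like (the spec names below are invented for illustration, not taken from the repository):

```json
[
  "ParseLiveQuery can handle a flaky reconnect",
  "ParseFile download range requests are served correctly"
]
```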
If we undertake a targeted effort to fix flaky tests, we track them via an issue, like we did in #7180. There is currently no such open issue where we actively track them, but I don't think that's necessary, as we have significantly fewer flaky tests today than we had back then, also thanks to your efforts. Why don't we use test IDs to identify flaky tests? If we see a test is flaky, we assign an ID via […]
I tried that already. I can't get the ID, as it happens before the test spec is run, and there is no way to pass it through.
For example, in https://github.com/parse-community/parse-server/actions/runs/9983595053/job/27591447103 I would identify this flaky test:
So I'll assign ID […] and I would add that ID to the flaky list JSON, right? -- Edit: just read your previous comment; that's too bad. Maybe we can modify the […]
You would add […]
It's idempotency where we simulate a TTL index:
Haven't looked at this for a while, but I vaguely remember that I've set it to that value on purpose; it has something to do with the comment above.
At what point should we turn on the test randomizer again?
Jasmine 5.0.0 released running specs in parallel, which would require the randomizer to be turned on. To upgrade Jasmine we would have to give the test suite some TLC, as we are mixing async and done().
Looks good
But we had the randomizer turned off for the flakiness investigation, right? So if we have now fixed a lot of these flaky tests, plus we have this retry logic, shouldn't we turn it back on?
Yeah, we can turn it back on in a separate PR.
Isn't turning on the randomizer part of this retry strategy? It would also be interesting to see how this tool behaves with the randomizer turned on. Before we merge this, we should find a process for how to use this tool: what are the criteria for adding a test to the list, and how do we deal with flaky tests once they are obscured by this tool? For example, there is flakiness that isn't related to a specific test but is a result of previous tests. So we may end up adding more and more tests to the flaky list without really solving anything.
Is there any way of knowing how many retries are required for the flaky tests to pass? Some stats would be interesting, to see how efficient the approach is.
@mtrezza It now outputs how many times a flaky test was retried before passing.
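Such stats can be produced by recording the attempt count per test name and printing a summary at the end of the run. The sketch below is hypothetical; `runWithRetries`, `retryCounts`, and `reportRetries` are illustrative names, not the PR's actual output format:

```javascript
// Hypothetical sketch: count how many retries each flaky test needed
// before passing, and print a summary after the suite finishes.
const retryCounts = new Map();

async function runWithRetries(name, testFn, maxAttempts = 3) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      await testFn();
      // Record only tests that actually needed at least one retry.
      if (attempt > 1) retryCounts.set(name, attempt - 1);
      return;
    } catch (e) {
      if (attempt === maxAttempts) throw e; // out of attempts: real failure
    }
  }
}

function reportRetries() {
  for (const [name, retries] of retryCounts) {
    console.log(`Flaky test "${name}" passed after ${retries} retry(ies)`);
  }
}
```

A reporter like this makes it easy to spot tests that should be removed from the list (zero retries for many runs) or escalated to a real fix (frequent retries).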
Looks good to me. As I mentioned earlier, I'm still somewhat unsure how this tool is intended to be used. A flaky test may well indicate a bug and not just be a test issue, so we'd need to define when to add a test to the list. Adding a test to the list for retrying only makes sense if we work on fixing it once it's added; otherwise we are effectively weakening our CI quality. The annoyance of a failing CI is what drove fixing flaky tests in the past. This tool creates the convenience of hiding the flakiness, but that may undermine the drive to fix it. So roughly, the rules could be: […]
@dplewis How should we treat this test issue? It seems that after shutting down LiveQuery, the subsequent tests all failed. This looks like an issue with the test logic, so we wouldn't add anything to the flaky list, right?
🎉 This change has been released in version 7.3.0-alpha.7 |
🎉 This change has been released in version 7.3.0-beta.1 |
🎉 This change has been released in version 7.3.0 |
Pull Request
Issue
Flaky tests require a lot more effort to merge PRs. Tests would have to be re-run until the flaky tests pass, or they could be ignored if you know which tests are flaky. Ideally, every flaky test should be fixed as it is found, which hasn't been the case.
Closes: #8654
Approach
[…] `this` […] from test suite `RegexVulnerabilities.spec`