Elixir tests are flaky, even with ELIXIR_ASSERT_TIMEOUT=1000 #14464

Closed
wejn opened this issue May 2, 2025 · 5 comments

Comments

@wejn
Contributor

wejn commented May 2, 2025

Elixir and Erlang/OTP versions

Erlang 26.2.5.11-r0
Elixir 1.18.3

Operating system

Linux (alpine linux CI)

Current behavior

After applying the fix from #14463, I'm still getting flaky tests:

  1) test recompiles files when lock changes (Mix.Tasks.Compile.ElixirTest)
     test/mix/tasks/compile.elixir_test.exs:453
     Assertion with == failed
     code:  assert recompile.() == {:ok, []}
     left:  {:noop, []}
     right: {:ok, []}
     stacktrace:
       test/mix/tasks/compile.elixir_test.exs:506: anonymous fn/0 in Mix.Tasks.Compile.ElixirTest."test recompiles files when lock changes"/1
       (elixir 1.18.3) lib/file.ex:1665: File.cd!/2
       test/test_helper.exs:169: MixTest.Case.in_fixture/3
       test/mix/tasks/compile.elixir_test.exs:454: (test)

(via https://gitlab.alpinelinux.org/wejn/aports/-/jobs/1832588)

And:

  1) test in/2 too large list in guards (KernelTest)
     test/elixir/kernel_test.exs:485
     ** (ExUnit.TimeoutError) test timed out after 60000ms. You can change the timeout:
       1. per test by setting "@tag timeout: x" (accepts :infinity)
       2. per test module by setting "@moduletag timeout: x" (accepts :infinity)
       3. globally via "ExUnit.start(timeout: x)" configuration
       4. by running "mix test --timeout x" which sets timeout
       5. or by running "mix test --trace" which sets timeout to infinity
          (useful when using IEx.pry/0)
     where "x" is the timeout given as integer in milliseconds (defaults to 60_000).
     
     code: defmodule TooLargeList do
     stacktrace:
       (elixir 1.18.3) src/elixir_erl_compiler.erl:15: :elixir_erl_compiler.spawn/1
       (elixir 1.18.3) src/elixir_module.erl:160: :elixir_module.compile/7
       (elixir 1.18.3) src/elixir_lexical.erl:13: :elixir_lexical.run/3
       test/elixir/kernel_test.exs:486: (test)
       (ex_unit 1.18.3) lib/ex_unit/runner.ex:511: ExUnit.Runner.exec_test/2
       (stdlib 5.2.3.3) timer.erl:270: :timer.tc/2
       (ex_unit 1.18.3) lib/ex_unit/runner.ex:433: anonymous fn/6 in ExUnit.Runner.spawn_test_monitor/4

(via https://gitlab.alpinelinux.org/wejn/aports/-/jobs/1832578)

And:

==> mix (ex_unit)
Running ExUnit with seed: 178593, max_cases: 128
Excluding tags: [windows: true]
.
  1) test releases lock on exit (Mix.Sync.LockTest)
     /builds/wejn/aports/community/elixir/src/elixir-1.18.3/lib/mix/test/mix/sync/lock_test.exs:23
     Assertion failed, no matching message after 100ms
     The following variables were pinned:
       ref = #Reference<0.4245923222.1459093535.135810>
     The process mailbox is empty.
     code: assert_receive {:DOWN, ^ref, _, _, _}
     stacktrace:
       test/mix/sync/lock_test.exs:29: (test)
........
  2) test delivers broadcast to subscribers for different keys (Mix.Sync.PubSubTest)
     test/mix/sync/pubsub_test.exs:39
     Assertion failed, no matching message after 100ms
     The process mailbox is empty.
     code: assert_receive :subscribed2
     stacktrace:
       test/mix/sync/pubsub_test.exs:61: (test)
.....
08:29:46.458 [error] Process #PID<0.758.0> raised an exception
** (ExUnit.AssertionError) 
Assertion failed, no matching message after 100ms
     The process mailbox is empty.
code: assert_receive %{event: "event1"}
    (ex_unit 1.18.3) lib/ex_unit/assertions.ex:599: ExUnit.Assertions.__timeout__/5
    test/mix/sync/pubsub_test.exs:45: anonymous fn/1 in Mix.Sync.PubSubTest."test delivers broadcast to subscribers for different keys"/1

(via https://gitlab.alpinelinux.org/wejn/aports/-/jobs/1832591)

Even bumping ELIXIR_ASSERT_TIMEOUT to 10000 (instead of the proposed 1000) doesn't fix the issue.

Expected behavior

No intermittent failures.

@josevalim
Member

Can you try bumping the machine specs to see if it yields better results? Many of those tests exercise concurrency features or have high compilation times, which would explain the failures.

@josevalim
Member

Oh, it seems you are still using the 1.18.3 branch. The fix I mentioned was applied to main.

@wejn
Contributor Author

wejn commented May 2, 2025

Unfortunately I have no control over the machine specs. :-(

As for the fix on main: Hm, is there any way I can cherry-pick it to 1.18.3? Otherwise I can wait until the next release that includes this fix lands...

@josevalim
Member

You can try backporting it: 920a6a0

Or manually changing those configurations. But I can fix the remaining ones you identified and then we can reopen if it persists on v1.19.0+. :)
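
For readers wondering what "manually changing those configurations" might look like, here is a minimal sketch. It is not the code from the Elixir repository or from commit 920a6a0. The module name `TimeoutConfig`, the 120_000 ms test timeout, and the `max_cases: 4` value are hypothetical choices for illustration. The only things taken as given are the ExUnit options `:assert_receive_timeout`, `:timeout` and `:max_cases`, and the 100 ms default that appears in the failure output above.

```elixir
# Hypothetical helper, not the upstream implementation.
defmodule TimeoutConfig do
  # ExUnit's default assert_receive timeout, as seen in the logs ("after 100ms").
  @default_assert_timeout 100

  # Builds ExUnit options from an environment map (string keys and values).
  # Falls back to the default when ELIXIR_ASSERT_TIMEOUT is unset or invalid.
  def exunit_opts(env \\ System.get_env()) do
    assert_timeout =
      case Integer.parse(Map.get(env, "ELIXIR_ASSERT_TIMEOUT", "")) do
        {ms, ""} when ms > 0 -> ms
        _ -> @default_assert_timeout
      end

    [
      # How long assert_receive waits for a matching message, in milliseconds.
      assert_receive_timeout: assert_timeout,
      # Per-test timeout (ExUnit defaults to 60_000); value here is an assumption.
      timeout: 120_000,
      # Fewer concurrent test cases on a small CI runner; value is an assumption.
      max_cases: 4
    ]
  end
end

# In a test_helper.exs this could be used as:
#   ExUnit.start(TimeoutConfig.exunit_opts())
```

Whether raising these values is enough depends on the runner; on a heavily loaded machine, lowering `max_cases` may matter more than the timeouts.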

@wejn
Contributor Author

wejn commented May 2, 2025

Sounds like a plan :-)
