Mix unit tests are flaky #14463

Closed
wejn opened this issue May 2, 2025 · 4 comments

@wejn
Contributor

wejn commented May 2, 2025

Elixir and Erlang/OTP versions

Erlang 26.2.5.11-r0
Elixir 1.18.3

Operating system

Linux (alpine linux CI)

Current behavior

The Mix test suite intermittently fails with:

==> mix (ex_unit)
Running ExUnit with seed: 791439, max_cases: 96
Excluding tags: [windows: true]
  1) test delivers broadcast to subscribers for different keys (Mix.Sync.PubSubTest)
     /builds/wejn/aports/community/elixir/src/elixir-1.18.3/lib/mix/test/mix/sync/pubsub_test.exs:39
     Assertion failed, no matching message after 100ms
     The process mailbox is empty.
     code: assert_receive :subscribed1
     stacktrace:
       test/mix/sync/pubsub_test.exs:60: (test)
  2) test lock can be acquired multiple times by the same process (Mix.Sync.LockTest)
     /builds/wejn/aports/community/elixir/src/elixir-1.18.3/lib/mix/test/mix/sync/lock_test.exs:137
     Assertion failed, no matching message after 100ms
     The following variables were pinned:
       ref = #Reference<0.2615586780.213909512.46567>
     The process mailbox is empty.
     code: assert_receive {:DOWN, ^ref, _, _, _}
     stacktrace:
       test/mix/sync/lock_test.exs:147: (test)

or:

  1) test listening to concurrent compilations (Mix.Tasks.CompileTest)
     test/mix/tasks/compile_test.exs:356
     ** (EXIT from #PID<0.10394.0>) an exception was raised:
          Assertion failed, no matching message after 2000ms
          The following variables were pinned:
            port = #Port<0.807>
          The process mailbox is empty.
          code: assert_receive {^port, {:data, "ok\n"}}
          stacktrace:
            (ex_unit 1.18.3) lib/ex_unit/assertions.ex:599: ExUnit.Assertions.__timeout__/5
            test/mix/tasks/compile_test.exs:411: anonymous fn/1 in Mix.Tasks.CompileTest."test listening to concurrent compilations"/1

on some platforms:

https://gitlab.alpinelinux.org/alpine/aports/-/merge_requests/83550/pipelines

(namely https://gitlab.alpinelinux.org/wejn/aports/-/pipelines/321039, https://gitlab.alpinelinux.org/wejn/aports/-/pipelines/321033, https://gitlab.alpinelinux.org/wejn/aports/-/pipelines/321031)

For now, the workaround I've added keeps the intermittent failures to a minimum (the lib/mix/test/mix/sync/*_test.exs files are the main offenders).

Expected behavior

No intermittent failures, even on slower systems (heavily loaded CI).

@josevalim
Member

On main you can set ELIXIR_ASSERT_TIMEOUT=1000 to give more time for those messages to be received. It is especially important when running on overloaded machines or with few resources.
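
For readers unfamiliar with the mechanism, here is a minimal sketch of how an environment variable such as ELIXIR_ASSERT_TIMEOUT could be wired into ExUnit's `assert_receive_timeout` option. The `AssertTimeout` helper and the test module are hypothetical names used for illustration, and this is not necessarily how Elixir's own test helpers read the variable.

```elixir
defmodule AssertTimeout do
  # Hypothetical helper, not part of Elixir or ExUnit.
  # Reads a timeout in milliseconds from ELIXIR_ASSERT_TIMEOUT and falls
  # back to the given default (ExUnit's own default is 100ms) when unset.
  def from_env(default \\ 100) do
    case System.get_env("ELIXIR_ASSERT_TIMEOUT") do
      nil -> default
      value -> String.to_integer(value)
    end
  end
end

# assert_receive_timeout controls how long assert_receive waits
# when no explicit timeout is passed to it.
ExUnit.start(assert_receive_timeout: AssertTimeout.from_env())

defmodule DelayedMessageTest do
  use ExUnit.Case, async: true

  test "waits for a message that arrives late" do
    Process.send_after(self(), :subscribed1, 50)
    # Uses the globally configured timeout.
    assert_receive :subscribed1

    Process.send_after(self(), :subscribed2, 50)
    # A per-assertion timeout overrides the global setting.
    assert_receive :subscribed2, 1_000
  end
end
```

Saved as a standalone `.exs` script, this can be run with `ELIXIR_ASSERT_TIMEOUT=1000 elixir script.exs`. Raising the timeout only helps with `assert_receive` failures; it does not affect other kinds of failures.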

@wejn
Contributor Author

wejn commented May 2, 2025

That solves only some of the problems:

  1) test recompiles files when lock changes (Mix.Tasks.Compile.ElixirTest)
     test/mix/tasks/compile.elixir_test.exs:453
     Assertion with == failed
     code:  assert recompile.() == {:ok, []}
     left:  {:noop, []}
     right: {:ok, []}
     stacktrace:
       test/mix/tasks/compile.elixir_test.exs:494: anonymous fn/0 in Mix.Tasks.Compile.ElixirTest."test recompiles files when lock changes"/1
       (elixir 1.18.3) lib/file.ex:1665: File.cd!/2
       test/test_helper.exs:169: MixTest.Case.in_fixture/3
       test/mix/tasks/compile.elixir_test.exs:454: (test)

(via https://gitlab.alpinelinux.org/wejn/aports/-/jobs/1832574)

But it looks like I don't have permission to reopen this issue...

@wejn
Contributor Author

wejn commented May 2, 2025

And:

  1) test in/2 too large list in guards (KernelTest)
     test/elixir/kernel_test.exs:485
     ** (ExUnit.TimeoutError) test timed out after 60000ms. You can change the timeout:
       1. per test by setting "@tag timeout: x" (accepts :infinity)
       2. per test module by setting "@moduletag timeout: x" (accepts :infinity)
       3. globally via "ExUnit.start(timeout: x)" configuration
       4. by running "mix test --timeout x" which sets timeout
       5. or by running "mix test --trace" which sets timeout to infinity
          (useful when using IEx.pry/0)
     where "x" is the timeout given as integer in milliseconds (defaults to 60_000).
     
     code: defmodule TooLargeList do
     stacktrace:
       (elixir 1.18.3) src/elixir_erl_compiler.erl:15: :elixir_erl_compiler.spawn/1
       (elixir 1.18.3) src/elixir_module.erl:160: :elixir_module.compile/7
       (elixir 1.18.3) src/elixir_lexical.erl:13: :elixir_lexical.run/3
       test/elixir/kernel_test.exs:486: (test)
       (ex_unit 1.18.3) lib/ex_unit/runner.ex:511: ExUnit.Runner.exec_test/2
       (stdlib 5.2.3.3) timer.erl:270: :timer.tc/2
       (ex_unit 1.18.3) lib/ex_unit/runner.ex:433: anonymous fn/6 in ExUnit.Runner.spawn_test_monitor/4

(via https://gitlab.alpinelinux.org/wejn/aports/-/jobs/1832578)
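
The timeout options listed in the error message above can be sketched as follows. The module name, test names, and timeout values are hypothetical and chosen only for illustration; this shows how the options are set, not a fix for the underlying slowness.

```elixir
# Option 3 from the message: a global per-test timeout in milliseconds.
ExUnit.start(timeout: 120_000)

defmodule SlowSuiteTest do
  use ExUnit.Case, async: true

  # Option 2: applies to every test in this module.
  @moduletag timeout: 180_000

  # Option 1: applies only to the next test; :infinity disables the timeout.
  @tag timeout: :infinity
  test "a test that may compile for a long time" do
    assert Enum.sum(1..10) == 55
  end

  test "a test that uses the module-level timeout" do
    assert String.upcase("ok") == "OK"
  end
end
```

The more specific setting wins: a `@tag` overrides a `@moduletag`, which in turn overrides the global value passed to `ExUnit.start/1`.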

@josevalim
Member

I was going to reopen it, but you opening a new one also works. I will look into those two :)
