Skip to content

Stack overflow on large doctests in edition 2024 on Windows #138248

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
ehuss opened this issue Mar 9, 2025 · 5 comments · Fixed by #138281
Closed

Stack overflow on large doctests in edition 2024 on Windows #138248

ehuss opened this issue Mar 9, 2025 · 5 comments · Fixed by #138281
Assignees
Labels
A-doctests Area: Documentation tests, run by rustdoc A-edition-2024 Area: The 2024 edition C-bug Category: This is a bug. T-rustdoc Relevant to the rustdoc team, which will review and decide on the PR/issue.

Comments

@ehuss
Copy link
Contributor

ehuss commented Mar 9, 2025

Over in #138162 I am trying to update the standard library to Rust 2024. I'm running into a problem where core's doctests are failing on x86_64-pc-windows-msvc.

The behavior is:

  1. Rustdoc prints thread 'main' has overflowed its stack
  2. It then starts running the tests: running 1672 tests
  3. After they are all done, it prints test result: ok. 1671 passed; 0 failed; 1 ignored; 0 measured; 0 filtered out; finished in 109.68s
  4. After all the tests complete, it exits 101.

From what I can tell, there is a const TESTS: [test::TestDescAndFn; 3641] array in the combined test code. When calling test::test_main, it fails somewhere in there. I'm struggling a bit figuring out a good way to debug exactly where the stack is overflowing.

There are a few issues here:

  1. Obviously it shouldn't overflow its stack.
  2. The "has overflowed its stack" message has no context, and doesn't explain where it is coming from (it is coming from executing the combined suite, but for some reason I guess rustdoc continues to run the uncombined tests?).
  3. The error is at the beginning of the output, but there is no summary at the end to explain what has happened (like the normal test runner would do if an individual test would fail). In fact, it prints a summary that says "everything was okay" when it was not.

I haven't put together an isolated repro. I wonder if just creating something with 4,000 doctests would suffice? So far, I've been using --persist-doctests to get a copy of core's combined suite, and then running rustc directly on that.

cc @notriddle @GuillaumeGomez

@ehuss ehuss added A-doctests Area: Documentation tests, run by rustdoc A-edition-2024 Area: The 2024 edition C-bug Category: This is a bug. T-rustdoc Relevant to the rustdoc team, which will review and decide on the PR/issue. labels Mar 9, 2025
@rustbot rustbot added the needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. label Mar 9, 2025
@saethlin
Copy link
Member

saethlin commented Mar 9, 2025

From what I can tell, there is a const TESTS: [test::TestDescAndFn; 3641] array in the combined test code. When calling test::test_main, it fails somewhere in there. I'm struggling a bit figuring out a good way to debug exactly where the stack is overflowing.

Just by squinting at the code, I would bet on this line, which is passing that array by value:

return std::process::Termination::report(test::test_main(test_args, Vec::from(TESTS), None));

I'd at least try passing &TESTS, you might also need to turn that const into a static.

Oh actually there's another Vec::from(TESTS) in that function. Ouch. I strongly suspect that copies of this array are what's causing the stack size explosion.

@fmease fmease removed the needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. label Mar 9, 2025
@saethlin
Copy link
Member

saethlin commented Mar 9, 2025

I thought I would be able to reproduce this crash on Linux by shrinking my stack size. No luck. I can indeed get stack overflows when building the core doctests by making my stack size 1 MB, but also that stack size is too small to build the compiler so that can't be it.

So I think this is somehow actually Windows-specific, or it depends on the very specific CI build configuration.

@ChrisDenton
Copy link
Member

ChrisDenton commented Mar 9, 2025

For MSVC you increase the default stack limit by adding:

compiler.arg(format!("-Clink_arg=/STACK:{}", 8 * 1024 * 1024));

Somewhere in

fn run_test(

Or use --stack,{} for mingw.

This could be justified for rustdoc as making the test framework more consistent cross-platform. Though fixing the underlying reason why we use so much stack space would be a good idea.

@ChrisDenton
Copy link
Member

Alternatively it could start a thread instead of running on the main thread. That way it could control the stack size directly on all platforms. However, that may be more hazardous for compatibility if something relies on being run on the main thread for some reason (I'm not sure what reason there would be for a doc test but I guess it's possible on some platforms).

@saethlin
Copy link
Member

saethlin commented Mar 9, 2025

Ah! Of course this is the main thread so RUST_MIN_STACK doesn't do anything.

I can reproduce the problem on Linux with ulimit -s 1024.

@saethlin saethlin self-assigned this Mar 9, 2025
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Mar 10, 2025
…ize, r=GuillaumeGomez

Fix O(tests) stack usage in edition 2024 mergeable doctests

Fixes rust-lang#138248

The important change here is that we are not passing a potentially-large array by value. Between the fact that `TestFn` cannot be `Clone` and `test_main` takes a `Vec<TestDescAndFn>`, the only way to call `test::test_main` without O(tests) stack use is to call `Vec::push` many times.

The normal test harness does not have this problem because it calls `test_main_static` or `test_main_static_abort`, which take `&[TestDescAndFn]`. Changing `test::test_main` to take a slice is not a simple change, so I'm avoiding doing it here.
rust-timer added a commit to rust-lang-ci/rust that referenced this issue Mar 10, 2025
Rollup merge of rust-lang#138281 - saethlin:mergeable-doctests-stacksize, r=GuillaumeGomez

Fix O(tests) stack usage in edition 2024 mergeable doctests

Fixes rust-lang#138248

The important change here is that we are not passing a potentially-large array by value. Between the fact that `TestFn` cannot be `Clone` and `test_main` takes a `Vec<TestDescAndFn>`, the only way to call `test::test_main` without O(tests) stack use is to call `Vec::push` many times.

The normal test harness does not have this problem because it calls `test_main_static` or `test_main_static_abort`, which take `&[TestDescAndFn]`. Changing `test::test_main` to take a slice is not a simple change, so I'm avoiding doing it here.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-doctests Area: Documentation tests, run by rustdoc A-edition-2024 Area: The 2024 edition C-bug Category: This is a bug. T-rustdoc Relevant to the rustdoc team, which will review and decide on the PR/issue.
Projects
Development

Successfully merging a pull request may close this issue.

5 participants