cmd/go: ScriptTest consistently timing out on dragonfly-amd64-5_8 builder #38797
Looks similar to #38824.
@jirfag, could you expand on that a bit? The goroutine dump linked above doesn't seem to show anything stuck in |
You are right, sorry, I misread it.
This builder seems to be fairly slow overall. I logged in using I made some tweaks to the However, the builder does not seem to have @tuxillo, is there a reason this builder should be particularly slow, or something else about it that would cause it to run long tests instead of short ones? @andybons, is there some way to confirm that the tests are indeed being run with |
Change https://golang.org/cl/233526 mentions this issue: |
@bcmills that is strange indeed. The builder is running as a VM in a Linux host, the same host as the other dragonfly builder. |
… times out

- Avoid starting subprocesses when the test is already very close to timing out. The overhead of starting and stopping processes may cause the test to exceed its deadline even if each individual process is signaled soon after it is started.
- If a command does not shut down quickly enough after receiving os.Interrupt, send it os.Kill using the same style of grace period as in CL 228438.
- Fail the test if a background command whose exit status is not ignored is left running at the end of the test. We have no reliable way to distinguish a failure due to the termination signal from an unexpected failure, and the termination signal varies across platforms (so may cause failure on one platform but success on another).

For #38797

Change-Id: I767898cf551dca45579bf01a9d1bb312e12d6193
Reviewed-on: https://go-review.googlesource.com/c/go/+/233526
Run-TryBot: Bryan C. Mills <[email protected]>
Reviewed-by: Jay Conrod <[email protected]>
This symptom is no longer occurring, although I don't know what would have fixed it. (My CL was only improving the diagnostics on failure.) I still suspect that there may be something wrong with this builder, but I don't know what and don't plan to investigate further. We can refer back to this issue if there is another regression on this builder. Since the specific symptom reported in this issue is no longer present, closing as non-reproducible. |
@bcmills Thanks a lot for taking care of this. If the builder starts failing again, I'll reinstall it.
As of CL 231223, the dragonfly-amd64-5_8 builder (and only that builder) is consistently timing out on the cmd/go tests, during one of the script tests: https://build.golang.org/log/fb00e74ea4d50113498d60db4d6b30c09ee0a4ea
The builder goes unresponsive enough that the test's usual timeout behavior doesn't halt the test in time, so all we get is a goroutine dump from the test process (which does not reveal the source of the hang).
I tried to use gomote ssh to investigate, but it failed due to a configuration error (#38796).

I don't see how CL 231223 could be causing deadlocks, since it is mostly a straight refactor, but I tried a revert (in CL 231557) and it passed as a SlowBot. So I'm not sure what to do about that: I hate to roll back based on a seemingly-unrelated failure on a builder I can't even access, especially given that the mainline dragonfly-amd64 builder is still passing.

If I could at least figure out which test is deadlocking, I could add a skip for that test on the theory that it's likely a bad interaction with a kernel bug...

CC @tuxillo @andybons