runtime: preemption-related deadlock when calling Go from C #35294
Comments
I ran it a bunch of times at master to see how long it takes to hang. Here's a record of the number of fuzz iterations before hang: 118784. I would guess the count is exponentially distributed. Running until 2M iterations seems like a safe threshold for bisection. Edit: I'm using a 1.5M iteration threshold because I'm impatient; estimating the exponential distribution from the above samples suggests the chance of reaching 1.5M iterations without a failure is about 0.25%. (But I'm no statistician.)
/cc @aclements Still bisecting, but it seems to be narrowing down on one of the CLs related to #10958 / #24543.
Honestly I wouldn't be surprised if the culprit CL turns out to be https://golang.org/cl/171883, which turns on the new timers. I suspect there may be some case we are failing to handle. It would be nice (for me) to hear that it was a different CL, though.
@ianlancetaylor You look safe for now. My current bisect interval is from 316fb95 (good) to 6058603 (bad), and that CL looks like it was merged outside of that window.
Git bisect identified 3f83411. I double-checked that it does hang at that commit, and I'm up to 2M iterations without a hang while re-checking the previous commit (which, assuming I didn't mix up "git bisect bad" and "git bisect good" earlier, should make at least 3.5M iterations in total).
I've been able to minimize the failure below, though it seems to reproduce the issue somewhat less reliably than actually using libfuzzer. The program should run forever printing increasing numbers, but occasionally it hangs.
(Edit: See better repro below.)
Simpler, much more reliable repro:
I'm not able to make any more progress on this. The last repro is very reliable, but I don't sufficiently understand the runtime preemption logic, and my naive debugging skills aren't sufficient here.
I found the problem. Working on adding a test.
Change https://golang.org/cl/204957 mentions this issue:
While comparing dvyukov/go-fuzz with mdempsky/go114-fuzz-build, I noticed that fuzzing k8s.io/kubernetes/test/fuzz/yaml.FuzzSigYaml with libFuzzer seems to periodically run into timeouts. I haven't been able to reproduce this with Go 1.13, so this seems like a regression.
Tentatively marking release blocker, since this seems like it could be a subtle runtime and/or cgo regression.
I'm going to try bisecting to see if I can figure out what commit caused the problem.