Flaky test: TestQueueConcurrency triggered race condition #6109

Closed
yeya24 opened this issue Jul 22, 2024 · 3 comments · Fixed by #6160

@yeya24
Contributor

yeya24 commented Jul 22, 2024

Describe the bug

https://github.com/cortexproject/cortex/actions/runs/10049567917/job/27775854567?pr=6096#step:6:139

==================
WARNING: DATA RACE
Read at 0x00c00032e6a0 by goroutine 61:
  github.com/cortexproject/cortex/pkg/scheduler/queue.(*queues).getNextQueueForQuerier()
      /__w/cortex/cortex/pkg/scheduler/queue/user_queues.go:234 +0x9c
  github.com/cortexproject/cortex/pkg/scheduler/queue.TestQueueConcurrency.func1()
      /__w/cortex/cortex/pkg/scheduler/queue/user_queues_test.go:482 +0x1a4
  github.com/cortexproject/cortex/pkg/scheduler/queue.TestQueueConcurrency.gowrap1()
      /__w/cortex/cortex/pkg/scheduler/queue/user_queues_test.go:488 +0x41

Previous write at 0x00c00032e6a0 by goroutine 66:
  github.com/cortexproject/cortex/pkg/scheduler/queue.(*queues).deleteQueue()
      /__w/cortex/cortex/pkg/scheduler/queue/user_queues.go:121 +0x24a
  github.com/cortexproject/cortex/pkg/scheduler/queue.TestQueueConcurrency.func1()
      /__w/cortex/cortex/pkg/scheduler/queue/user_queues_test.go:486 +0x12a
  github.com/cortexproject/cortex/pkg/scheduler/queue.TestQueueConcurrency.gowrap1()
      /__w/cortex/cortex/pkg/scheduler/queue/user_queues_test.go:488 +0x41

Goroutine 61 (running) created at:
  github.com/cortexproject/cortex/pkg/scheduler/queue.TestQueueConcurrency()
      /__w/cortex/cortex/pkg/scheduler/queue/user_queues_test.go:477 +0x337
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:1689 +0x21e
  testing.(*T).Run.gowrap1()
      /usr/local/go/src/testing/testing.go:1742 +0x44

Goroutine 66 (finished) created at:
  github.com/cortexproject/cortex/pkg/scheduler/queue.TestQueueConcurrency()
      /__w/cortex/cortex/pkg/scheduler/queue/user_queues_test.go:477 +0x337
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:1689 +0x21e
  testing.(*T).Run.gowrap1()
      /usr/local/go/src/testing/testing.go:1742 +0x44
==================
--- FAIL: TestQueueConcurrency (0.01s)
    testing.go:1398: race detected during execution of test
FAIL
FAIL	github.com/cortexproject/cortex/pkg/scheduler/queue	10.226s
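
For anyone following along: the report above shows `getNextQueueForQuerier` reading memory that `deleteQueue` wrote from another goroutine, with both calls coming from the test's worker function. Below is a minimal sketch of one way to serialize that kind of access behind a single mutex. `tenantQueues`, `getOrCreate`, `remove`, `size` and `hammer` are hypothetical names made up for illustration; this is not the Cortex code and not necessarily what the eventual fix does.

```go
package main

import (
	"fmt"
	"sync"
)

// tenantQueues is a hypothetical stand-in for a per-user queue registry.
// It is NOT the Cortex `queues` type.
type tenantQueues struct {
	mu     sync.Mutex
	queues map[string]chan int
}

func newTenantQueues() *tenantQueues {
	return &tenantQueues{queues: make(map[string]chan int)}
}

// getOrCreate returns the queue for a tenant, creating it if needed.
// The lookup happens under the same lock that remove takes.
func (q *tenantQueues) getOrCreate(tenant string) chan int {
	q.mu.Lock()
	defer q.mu.Unlock()
	ch, ok := q.queues[tenant]
	if !ok {
		ch = make(chan int, 16)
		q.queues[tenant] = ch
	}
	return ch
}

// remove deletes a tenant's queue while holding the lock.
func (q *tenantQueues) remove(tenant string) {
	q.mu.Lock()
	defer q.mu.Unlock()
	delete(q.queues, tenant)
}

// size reports how many queues currently exist.
func (q *tenantQueues) size() int {
	q.mu.Lock()
	defer q.mu.Unlock()
	return len(q.queues)
}

// hammer starts the given number of goroutines; each one looks up a shared
// tenant's queue and then deletes it, mimicking concurrent readers and writers.
func hammer(q *tenantQueues, workers int) {
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			tenant := fmt.Sprintf("tenant-%d", id%4)
			q.getOrCreate(tenant)
			q.remove(tenant)
		}(i)
	}
	wg.Wait()
}

func main() {
	q := newTenantQueues()
	hammer(q, 30)
	fmt.Println("queues left:", q.size())
}
```

Running a sketch like this with `go run -race` should stay quiet, because every map access goes through `mu`. Dropping the lock from either `getOrCreate` or `remove` would leave the map open to the same read/write pattern the detector reported here.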
@yeya24
Contributor Author

yeya24 commented Jul 24, 2024

This seems to be another occurrence: https://github.com/cortexproject/cortex/actions/runs/10070390850/job/27838844384

panic: test timed out after 30m0s
running tests:
	TestQueueConcurrency (29m50s)

goroutine 48 [running]:
testing.(*M).startAlarm.func1()
	/usr/local/go/src/testing/testing.go:2366 +0x265
created by time.goFunc
	/usr/local/go/src/time/sleep.go:177 +0x45

goroutine 1 [chan receive, 29 minutes]:
testing.(*T).Run(0xc0000c0b60, {0x10c995c, 0x14}, 0x110e748)
	/usr/local/go/src/testing/testing.go:1750 +0x851
testing.runTests.func1(0xc0000c0b60)
	/usr/local/go/src/testing/testing.go:2161 +0x86
testing.tRunner(0xc0000c0b60, 0xc000297b10)
	/usr/local/go/src/testing/testing.go:1689 +0x21f
testing.runTests(0xc0000a8cc0, {0x17a23a0, 0x11, 0x11}, {0xc000297bb8?, 0xc000297c00?, 0x17b0f00?})
	/usr/local/go/src/testing/testing.go:2159 +0x8bf
testing.(*M).Run(0xc0000d6b40)
	/usr/local/go/src/testing/testing.go:2027 +0xf18
main.main()
	_testmain.go:87 +0x2be

goroutine 66 [semacquire, 29 minutes]:
sync.runtime_Semacquire(0xc00018ee58?)
	/usr/local/go/src/runtime/sema.go:62 +0x25
sync.(*WaitGroup).Wait(0xc00018ee50)
	/usr/local/go/src/sync/waitgroup.go:116 +0xa5
github.com/cortexproject/cortex/pkg/scheduler/queue.TestQueueConcurrency(0xc0003c4000?)
	/__w/cortex/cortex/pkg/scheduler/queue/user_queues_test.go:491 +0x475
testing.tRunner(0xc0003c4000, 0x110e748)
	/usr/local/go/src/testing/testing.go:1689 +0x21f
created by testing.(*T).Run in goroutine 1
	/usr/local/go/src/testing/testing.go:1742 +0x826

goroutine 84 [chan receive, 29 minutes]:
github.com/cortexproject/cortex/pkg/scheduler/queue.(*FIFORequestQueue).dequeueRequest(0xc000050020, 0x10bccf1?, 0x6?)
	/__w/cortex/cortex/pkg/scheduler/queue/user_request_queue.go:35 +0x5f
github.com/cortexproject/cortex/pkg/scheduler/queue.TestQueueConcurrency.func1(0xf)
	/__w/cortex/cortex/pkg/scheduler/queue/user_queues_test.go:484 +0x13a
created by github.com/cortexproject/cortex/pkg/scheduler/queue.TestQueueConcurrency in goroutine 66
	/__w/cortex/cortex/pkg/scheduler/queue/user_queues_test.go:477 +0x338
FAIL	github.com/cortexproject/cortex/pkg/scheduler/queue	1800.136s
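
In the dump above, goroutine 84 is parked in a channel receive inside `dequeueRequest` while the test goroutine waits on the `WaitGroup`, so the test can only end when the 30m timeout fires. As a rough illustration of how a worker can avoid blocking forever on an empty queue, here is a sketch of a non-blocking receive. `tryDequeue` is a hypothetical helper, not part of `FIFORequestQueue`, and whether non-blocking semantics are right for the real queue is an open question.

```go
package main

import "fmt"

// tryDequeue is a hypothetical helper (not a Cortex API): it takes an item
// only if one is already buffered, and returns immediately otherwise.
func tryDequeue(queue chan string) (string, bool) {
	select {
	case item := <-queue:
		return item, true
	default:
		// Nothing buffered: report that instead of blocking the caller.
		return "", false
	}
}

func main() {
	queue := make(chan string, 1)
	queue <- "request-1"

	first, ok1 := tryDequeue(queue) // one item is buffered
	_, ok2 := tryDequeue(queue)     // queue is now empty
	fmt.Println(first, ok1, ok2)    // prints "request-1 true false"
}
```

A worker built on something like this could call `wg.Done()` and return when the queue turns out to be empty, instead of leaving `Wait` hanging.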

@justinjung04
Contributor

It seems to be related to my change. I'll take a look.

@yeya24
Contributor Author

yeya24 commented Aug 13, 2024

This one is a little concerning because the last goroutine panicked: https://github.com/cortexproject/cortex/actions/runs/10362510413/job/28684608492

==================
WARNING: DATA RACE
Write at 0x00c000590000 by goroutine 52:
  github.com/cortexproject/cortex/pkg/scheduler/queue.(*queues).deleteQueue()
      /__w/cortex/cortex/pkg/scheduler/queue/user_queues.go:114 +0x1c4
  github.com/cortexproject/cortex/pkg/scheduler/queue.TestQueueConcurrency.func1()
      /__w/cortex/cortex/pkg/scheduler/queue/user_queues_test.go:486 +0x12a
  github.com/cortexproject/cortex/pkg/scheduler/queue.TestQueueConcurrency.gowrap1()
      /__w/cortex/cortex/pkg/scheduler/queue/user_queues_test.go:488 +0x41

Previous read at 0x00c000590000 by goroutine 51:
  github.com/cortexproject/cortex/pkg/scheduler/queue.(*queues).getNextQueueForQuerier()
      /__w/cortex/cortex/pkg/scheduler/queue/user_queues.go:234 +0x129
  github.com/cortexproject/cortex/pkg/scheduler/queue.TestQueueConcurrency.func1()
      /__w/cortex/cortex/pkg/scheduler/queue/user_queues_test.go:482 +0x1a4
  github.com/cortexproject/cortex/pkg/scheduler/queue.TestQueueConcurrency.gowrap1()
      /__w/cortex/cortex/pkg/scheduler/queue/user_queues_test.go:488 +0x41

Goroutine 52 (running) created at:
  github.com/cortexproject/cortex/pkg/scheduler/queue.TestQueueConcurrency()
      /__w/cortex/cortex/pkg/scheduler/queue/user_queues_test.go:477 +0x324
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:1689 +0x21e
  testing.(*T).Run.gowrap1()
      /usr/local/go/src/testing/testing.go:1742 +0x44

Goroutine 51 (running) created at:
  github.com/cortexproject/cortex/pkg/scheduler/queue.TestQueueConcurrency()
      /__w/cortex/cortex/pkg/scheduler/queue/user_queues_test.go:477 +0x324
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:1689 +0x21e
  testing.(*T).Run.gowrap1()
      /usr/local/go/src/testing/testing.go:1742 +0x44
==================
==================
WARNING: DATA RACE
Write at 0x00c00040bea0 by goroutine 52:
  github.com/cortexproject/cortex/pkg/scheduler/queue.(*queues).deleteQueue()
      /__w/cortex/cortex/pkg/scheduler/queue/user_queues.go:118 +0x24a
  github.com/cortexproject/cortex/pkg/scheduler/queue.TestQueueConcurrency.func1()
      /__w/cortex/cortex/pkg/scheduler/queue/user_queues_test.go:486 +0x12a
  github.com/cortexproject/cortex/pkg/scheduler/queue.TestQueueConcurrency.gowrap1()
      /__w/cortex/cortex/pkg/scheduler/queue/user_queues_test.go:488 +0x41

Previous read at 0x00c00040bea0 by goroutine 51:
  github.com/cortexproject/cortex/pkg/scheduler/queue.(*queues).getNextQueueForQuerier()
      /__w/cortex/cortex/pkg/scheduler/queue/user_queues.go:225 +0x9c
  github.com/cortexproject/cortex/pkg/scheduler/queue.TestQueueConcurrency.func1()
      /__w/cortex/cortex/pkg/scheduler/queue/user_queues_test.go:482 +0x1a4
  github.com/cortexproject/cortex/pkg/scheduler/queue.TestQueueConcurrency.gowrap1()
      /__w/cortex/cortex/pkg/scheduler/queue/user_queues_test.go:488 +0x41

Goroutine 52 (running) created at:
  github.com/cortexproject/cortex/pkg/scheduler/queue.TestQueueConcurrency()
      /__w/cortex/cortex/pkg/scheduler/queue/user_queues_test.go:477 +0x324
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:1689 +0x21e
  testing.(*T).Run.gowrap1()
      /usr/local/go/src/testing/testing.go:1742 +0x44

Goroutine 51 (running) created at:
  github.com/cortexproject/cortex/pkg/scheduler/queue.TestQueueConcurrency()
      /__w/cortex/cortex/pkg/scheduler/queue/user_queues_test.go:477 +0x324
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:1689 +0x21e
  testing.(*T).Run.gowrap1()
      /usr/local/go/src/testing/testing.go:1742 +0x44
==================
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0xf73926]

goroutine 47 [running]:
github.com/cortexproject/cortex/pkg/scheduler/queue.(*queues).getNextQueueForQuerier(0xc00040be80, 0x0, {0x111a931, 0x3})
	/__w/cortex/cortex/pkg/scheduler/queue/user_queues.go:244 +0x246
github.com/cortexproject/cortex/pkg/scheduler/queue.TestQueueConcurrency.func1(0x6)
	/__w/cortex/cortex/pkg/scheduler/queue/user_queues_test.go:482 +0x1a5
created by github.com/cortexproject/cortex/pkg/scheduler/queue.TestQueueConcurrency in goroutine 40
	/__w/cortex/cortex/pkg/scheduler/queue/user_queues_test.go:477 +0x325
FAIL	github.com/cortexproject/cortex/pkg/scheduler/queue	9.537s
