Gitea stops responding on HTTP #15826
Comments
OK. The two longest-blocked goroutines are both blocked on writing the HTML and on acquiring a lock in the rendering code. That code worries me, as it looks like anything that holds up rendering blocks it for everyone: we can essentially only render one HTML template at a time?! The goroutine actually holding the lock, meanwhile, appears to be stuck writing to the response, which is weird. This is very odd, and frankly very concerning.
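To make the concern concrete, here is a minimal sketch of that pattern (illustrative names only, not Gitea's actual rendering code): a single mutex serializes all template rendering, and the render writes directly to the ResponseWriter, so one stalled client write holds the lock and stalls every other request that needs a template.

```go
package main

import (
	"html/template"
	"log"
	"net/http"
	"sync"
)

// renderer serializes all template rendering behind one mutex and writes
// straight to the ResponseWriter -- the pattern described above.
type renderer struct {
	mu   sync.Mutex
	tmpl *template.Template
}

func (r *renderer) HTML(w http.ResponseWriter, name string, data any) error {
	r.mu.Lock()
	defer r.mu.Unlock()
	// If this client has stopped reading and the connection has no write
	// deadline, ExecuteTemplate can block here while holding the lock, and
	// every other request that needs to render a template waits behind it.
	return r.tmpl.ExecuteTemplate(w, name, data)
}

func main() {
	r := &renderer{tmpl: template.Must(template.New("page").Parse("<h1>{{.}}</h1>"))}
	http.HandleFunc("/", func(w http.ResponseWriter, req *http.Request) {
		_ = r.HTML(w, "page", "hello")
	})
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```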
OK, we can add a write deadline (and read deadline) to the connection here: gitea/modules/graceful/server.go, lines 202 to 231 at d86d123. Around line 222 would likely be best. Agh, no, it doesn't work like that! It's a deadline, not a timeout, which means we need to change the write function to refresh the deadline on each write. Damn. The question is what the timeout should be, but I suspect certainly nothing more than 5s, and likely even 500ms is too much.
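As a rough illustration of that idea (a sketch with my own naming, not the code from the eventual PR): wrap each accepted net.Conn so that every Write pushes the write deadline forward by a fixed interval, which makes the absolute deadline behave like a per-write timeout.

```go
package main

import (
	"log"
	"net"
	"net/http"
	"time"
)

// timeoutConn refreshes the write deadline on every Write, so the deadline
// behaves like a per-write timeout rather than one absolute cutoff.
type timeoutConn struct {
	net.Conn
	writeTimeout time.Duration
}

func (c timeoutConn) Write(p []byte) (int, error) {
	// Push the deadline forward before each write; a peer that has stopped
	// reading now fails the write after writeTimeout instead of blocking forever.
	if err := c.Conn.SetWriteDeadline(time.Now().Add(c.writeTimeout)); err != nil {
		return 0, err
	}
	return c.Conn.Write(p)
}

// timeoutListener wraps every accepted connection in a timeoutConn.
type timeoutListener struct {
	net.Listener
	writeTimeout time.Duration
}

func (l timeoutListener) Accept() (net.Conn, error) {
	conn, err := l.Listener.Accept()
	if err != nil {
		return nil, err
	}
	return timeoutConn{Conn: conn, writeTimeout: l.writeTimeout}, nil
}

func main() {
	ln, err := net.Listen("tcp", "127.0.0.1:8080")
	if err != nil {
		log.Fatal(err)
	}
	srv := &http.Server{Handler: http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Write([]byte("ok"))
	})}
	log.Fatal(srv.Serve(timeoutListener{Listener: ln, writeTimeout: 5 * time.Second}))
}
```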
In go-gitea#15826 it has become apparent that there are a few occasions when a response can hang during writing, and because there is no timeout, Go will happily block interminably. This PR adds a fixed 5-second timeout to all writes to a connection. Fix go-gitea#15826. Signed-off-by: Andrew Thornton <[email protected]>
I've opened up a second issue to deal with the render-lock issue, because I think your real problem here is the blocked ResponseWriter.
OK, we've merged the unrolled/render change in #15845, preventing the render-lock problem.
That's great! I've pulled an updated image built from ffbd0fe and am running it. What should I do with this issue? If you'd like, I will close it in a few days when it's clear it is not hitting the same problem anymore. My instance had locked up again in the two days since this was opened, so it shouldn't be very long.
I still think #15831 is needed, as I suspect you're likely to find that although the rendering pipeline no longer gets blocked, there are zombie network connections.
Backport of #15831. In #15826 it has become apparent that there are a few occasions when a response can hang during writing, and because there is no timeout, Go will happily block interminably. This PR adds a fixed 5-second timeout to all writes to a connection. Fix #15826. Signed-off-by: Andrew Thornton <[email protected]>
I've had this happen a few times, but not yet with debug logging enabled. It's on now; will update. I didn't see anything out of the ordinary in the default log that I had.
I did, however, get a trace of all the goroutines: https://gist.github.com/dpedu/86b729acd51d2132950328a4040e0092
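For reference, a full goroutine dump like the one in that gist can be produced from inside any Go program with the standard runtime/pprof package. This is just the generic standard-library way of getting such a dump, not necessarily how it was captured here:

```go
package main

import (
	"os"
	"runtime/pprof"
)

func main() {
	// debug=2 prints every goroutine with its full stack trace,
	// which is the kind of dump shown in the linked gist.
	pprof.Lookup("goroutine").WriteTo(os.Stderr, 2)
}
```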
Description
I run a web-facing copy of Gitea (not directly exposed; it's behind Nginx and Varnish) of which I am the only user. After 2-3 days of uptime, the Gitea instance stops responding to HTTP requests, and the reverse proxy in front of it shows a 504 timeout error.
I can connect to the Gitea instance directly with my browser, and while it accepts my connection it never responds to HTTP requests; the problem appears the same with the reverse proxies between Gitea and me removed.
The ssh interface still works fine - I can clone, push, pull, etc.
Looking at the log, Gitea is logging as if it is still serving normal traffic. It logs lines like `Started GET` or `Completed GET` as normal, but inspecting the traffic in my reverse proxy shows it is actually not replying.
Looking at the goroutine trace above, it looks like there are very many goroutines trying to acquire some lock in the template layer. More than 300! My instance was not using much CPU at all when viewed in `htop` in the bad state; it seemed like something was locked up rather than overloaded.
In `ps`, there were several zombie `git` processes that were children of `gitea`. Not sure if this is a cause or a result of the other problem.
Here's `lsof -p` output for Gitea as well. I snipped out the "normal" stuff like my sqlite database. What was left was about 400 of the TYPE=`sock` lines, with about 10x fewer TYPE=`FIFO` lines sprinkled in.
When I restart my Gitea container, everything is back to normal afterwards.
I first remember encountering this problem in mid-April. I pull the latest Docker image each time I've had to manually restart it, so it has appeared in more than one version.
There are other http-based services running on the same machine that don't show similar issues.
I have lots and lots of code checked into public repos, so the instance attracts quite a bit of bot traffic scraping the HTML. Sometimes I have to user-agent ban crawlers that are too aggressive and push the CPU usage up too much.
Screenshots
N/A