Stream stops reading from kernel halfway through in http.IncomingMessage #7910
Comments
v0.10 is pretty close to its end of life. Can you verify that this is still a problem with Node v4.x or v6.x?
I can't. Upgrading isn't really an option at present and, unfortunately, I cannot reproduce locally :/
Is upgrading to v0.10.46 an option for you? v0.10.36 is already a year and a half old.
What does 'remain stuck' mean? It's not unusual for sockets to stay in CLOSE_WAIT for a few minutes. Do you see sockets in TIME_WAIT state?
Upgrading to v0.10.46 is an option for us. I see no sockets in TIME_WAIT. Regarding CLOSE_WAIT, just now for example, by cross-referencing netstat with application logs, I see a socket that's been in CLOSE_WAIT for over 3 hours. In tcpdump, I'll see a FIN being sent from the remote host, but we never send our own FIN. We also just stop reading from the buffer; I see [...] Any ideas where to look next? I absolutely cannot reproduce locally, but this happens in production all the time. My next idea is to add some custom logging to net.js.
I'm finally able to reproduce locally. This may actually just be an application-level issue, not a node issue; I'll update when I find out more.
Ended up being an application-level issue. There was a stream that we were not consuming, so everything makes sense now.
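For anyone who lands here with the same symptom: the Node http docs say that once you add a 'response' handler you must consume the response data (a 'data' handler, read() on 'readable', or resume()), otherwise it stays buffered. Below is a minimal sketch of discarding a body you don't need. It is an illustration only: a plain stream.Readable stands in for the response, and the name ignored and its contents are hypothetical, not taken from the application discussed above.

```javascript
const { Readable } = require('stream');

// Hypothetical stand-in for a response body that the application never reads.
const ignored = new Readable({ read() {} });
ignored.push(Buffer.from('body we never look at'));
ignored.push(null); // signals the end of the body

// null means no consumer is attached yet, so the data just sits in the buffer.
const flowingBefore = ignored.readableFlowing;

ignored.on('end', () => console.log('drained'));

// resume() switches the stream to flowing mode and throws the data away,
// which lets 'end' fire even though nobody handles 'data'.
ignored.resume();
const flowingAfter = ignored.readableFlowing;

console.log(flowingBefore, flowingAfter);
```

The same idea applies to a real response: if the body is not needed, calling resume() on it lets the stream finish instead of stalling with data still queued.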
Version: v0.10.36
Platform: Linux hd1app1 3.13.0-83-generic #127-Ubuntu SMP Fri Mar 11 00:25:37 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
Subsystem: net.js, http.js, _stream_readable.js
We've been investigating a memory leak issue where we have sockets that remain stuck in CLOSE_WAIT. The sockets are being used to pull blob files (~250KB - 1MB in size) from Amazon S3. The code we use to pull the data uses the Knox library [https://github.com/Automattic/knox], but it's really just a wrapper around http.ClientRequest. The code is straightforward and essentially boils down to:
I cannot reproduce this issue locally, but in production I'll have sockets stuck in this state even 10 minutes after a restart. Also, what I thought was just a memory leak appears to result in us silently not completing client requests for the S3 data.
When turning on debug logging of the net.js module and cross-referencing it against a tcpdump, I was able to find that handle.readStop() is being called [https://github.com/nodejs/node/blob/v0.10.29-release/lib/net.js#L533] during the data transfer. After this, we never end up reading from that socket any further. The amount of data left in the kernel's Recv-Q for that socket (via netstat) is equal to the remainder shown in the tcpdump output. That is, the amount of data that node processed (calculated via https://github.com/nodejs/node/blob/v0.10.29-release/lib/net.js#L504) plus the remainder in the kernel equals the total sent from the remote host.
[...] readStop is being called, but readStart is not, given that there is still data to read?
[...] data events to using the readable event with the read() method (streams2); would that even make any difference?
[...] highWaterMark to get readStop() to fire, but even if readStop gets called in these tests, the stream always ends up resuming. I've tried hitting S3 with low and high load but still cannot reproduce. Any suggestions?
Thanks,
Dave
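To make the readStop/readStart interplay described above concrete, here is a rough sketch of the pattern as I understand it, not the actual net.js code. A plain stream.Readable stands in for the socket; fakeHandle, onread, and the 8-byte highWaterMark are hypothetical names and values chosen for the demo. The producer stops reading when push() returns false, and only starts again when the stream asks for more data because someone consumed it.

```javascript
const { Readable } = require('stream');

// Hypothetical stand-in for the TCP handle; it only records the calls made to it.
const fakeHandle = {
  reading: true,
  calls: [],
  readStart() { this.reading = true; this.calls.push('readStart'); },
  readStop() { this.reading = false; this.calls.push('readStop'); }
};

const socketLike = new Readable({
  highWaterMark: 8, // tiny on purpose so the buffer fills after two chunks
  // Runs when the stream wants more data; resume reading from the handle.
  read() {
    if (!fakeHandle.reading) fakeHandle.readStart();
  }
});

// Mimics the data callback: buffer the chunk, stop reading once the buffer is full.
function onread(chunk) {
  const ok = socketLike.push(chunk);
  if (fakeHandle.reading && !ok) fakeHandle.readStop();
}

onread(Buffer.from('1234')); // 4 of 8 bytes buffered, keep reading
onread(Buffer.from('5678')); // 8 of 8 bytes buffered, readStop fires

// With no consumer attached, nothing ever asks the handle to start again.
const callsWithoutConsumer = fakeHandle.calls.slice();

// A consumer draining the buffer triggers read() above, and with it readStart.
socketLike.read();
const callsWithConsumer = fakeHandle.calls.slice();

console.log(callsWithoutConsumer, callsWithConsumer);
```

In this sketch a readStop with no following readStart simply means nothing downstream has consumed the buffered data yet.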