
Unable to keep bi-directional streaming connection open #948


Closed
npehrsson opened this issue Jun 12, 2020 · 7 comments
Labels
question Further information is requested

Comments

@npehrsson

Hi,

Not sure if it should be a question or bug.

We are using ASP.NET gRPC to create a client-server connection. Our clients run in an Azure VMSS that we scale up, and they connect to an ASP.NET gRPC server outside the VMSS.

The clients connect and data streams between the server and the clients. Everything works fine at first, but after a couple of minutes the server starts to report disconnects (it says it can't read the message and that the remote party has closed the connection).

The weird part is that the client still thinks that the connection is alive and is still subscribing to messages.

I'm pretty sure data is being transferred almost all the time. I've also changed the idle timeout on the load balancer in Azure to 30 minutes (the maximum).

I have no idea what the issue could be. We are talking about maybe 50 streaming connections at most, and all 50 work fine until they randomly start to shut down.

What should my next step be in my investigation?

@npehrsson npehrsson added the question Further information is requested label Jun 12, 2020
@JuliusSweetland

@npehrsson If the server is streaming to the client at that time, could it be a half-open connection? gRPC uses TCP, so it is possible to lose the connection between client and server (e.g. a router crashes). The server attempts to send a message and realises that the connection is broken, but the client is waiting for a message which will never arrive, and that wait does not time out.

This article is good: https://blog.stephencleary.com/2009/05/detection-of-half-open-dropped.html

To help detect this you could send pings from client to server at a regular interval, and pings from server to client. If the connection is broken, the next failed ping (or message) will reveal the lost connection. You could work back from there.
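As a rough illustration of the ping idea, here is a minimal, language-agnostic sketch in Python (not the ASP.NET gRPC API). It only tracks when the peer was last heard from and flags the connection as suspect once nothing has arrived within a timeout. The class name `HeartbeatMonitor`, its methods, and the 30-second timeout are all hypothetical and chosen for illustration.

```python
import time
from typing import Callable


class HeartbeatMonitor:
    """Tracks when the peer was last heard from and flags a silent connection.

    Hypothetical helper for illustration; not part of any gRPC library.
    """

    def __init__(self, timeout_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._timeout = timeout_seconds
        self._clock = clock
        self._last_seen = clock()

    def record_message(self) -> None:
        """Call on every message or ping reply received from the peer."""
        self._last_seen = self._clock()

    def is_suspect(self) -> bool:
        """True when nothing has arrived within the timeout window."""
        return self._clock() - self._last_seen > self._timeout


# Demonstration with a fake clock so no real waiting is needed.
now = [0.0]
monitor = HeartbeatMonitor(timeout_seconds=30.0, clock=lambda: now[0])

now[0] = 20.0
monitor.record_message()            # a ping reply arrives at t=20
now[0] = 45.0
still_ok = monitor.is_suspect()     # 25 s of silence, within the window
now[0] = 51.0
gone_quiet = monitor.is_suspect()   # 31 s of silence, beyond the window
```

A client that sees `is_suspect()` turn true could tear down the stream and reconnect instead of waiting forever on a half-open connection.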

@JunTaoLuo
Contributor

This looks like an issue related to #770. Let us know if there is something here that's not covered in the linked issue.

@JuliusSweetland

@JunTaoLuo Hi John - do you know when there might be a reply on issue #770 from the gRPC team? Thx

@npehrsson
Author

Thanks. I read that thread before and thought that it didn't apply because there was some traffic on the channel.

I will try some things out from that thread, thanks for the responses.

I just want to add that this works fine locally.
So I believe it's Azure's load balancer that kills it, even though I raised the idle timeout for TCP connections from 4 minutes to 30 minutes and there should have been a lot of traffic during that time.

@JamesNK
Member

JamesNK commented Jun 12, 2020

Perhaps the Azure Load Balancer has a maximum allowed connection time, and kills connections even if they still have traffic. You will need to ask them under what situations a connection is killed.

@npehrsson
Author

npehrsson commented Jun 12, 2020 via email

@npehrsson
Author

Adding a timer that sends a keep-alive message helped.
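The thread does not show how the timer was implemented, so the following is only a sketch of the general pattern, written in Python rather than against the ASP.NET gRPC API. It sends a keep-alive message at a fixed interval until told to stop. The names `keepalive_loop`, `fake_send`, and `demo`, the `"keep-alive"` payload, and the 10 ms interval are assumptions for illustration. In a real deployment the interval would presumably be kept well below the load balancer's idle timeout.

```python
import asyncio
from typing import Awaitable, Callable, List


async def keepalive_loop(
    send: Callable[[str], Awaitable[None]],
    interval_seconds: float,
    stop: asyncio.Event,
) -> None:
    """Send a keep-alive message every interval until `stop` is set."""
    while not stop.is_set():
        # If the connection is broken, this call is where the error surfaces.
        await send("keep-alive")
        try:
            # Wake up early if the stream is being shut down.
            await asyncio.wait_for(stop.wait(), timeout=interval_seconds)
        except asyncio.TimeoutError:
            pass  # Interval elapsed; loop round and send again.


async def demo() -> List[str]:
    sent: List[str] = []

    async def fake_send(message: str) -> None:
        # Stand-in for writing to the real request stream.
        sent.append(message)

    stop = asyncio.Event()
    task = asyncio.create_task(keepalive_loop(fake_send, 0.01, stop))
    await asyncio.sleep(0.1)   # let a few keep-alives go out
    stop.set()
    await task
    return sent


messages = asyncio.run(demo())
```

Because an exception from `send` propagates out of the loop, a failed keep-alive also doubles as the signal that the connection has been lost.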
