DefaultConfig has dangerous defaults causing hung Reads #142

Closed as not planned
@lattwood

Description

The default values for Session.config.KeepAliveInterval (30s) and Session.config.ConnectionWriteTimeout (10s) make it possible for writes to time out in a way that most consumers (HashiCorp Nomad included) aren't handling in their readers.

A call to Stream.Read on one side of a connection will hang until the underlying Session is closed if the Stream.Write call it's waiting for on the other side returns ErrConnectionWriteTimeout. This can happen when there is network congestion between the two sides.

If Session.sendCh (fixed capacity of 64) stays full for at least ConnectionWriteTimeout, but for less than KeepAliveInterval + ConnectionWriteTimeout (which would kill the Session), Stream.Write will return ErrConnectionWriteTimeout. The state of the underlying Session and Stream is not modified. When this happens (or doesn't, heh), the other side's Stream.Read call that's waiting for that write will never return, because there's no timeout for this edge case.
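To make the window concrete, here's a rough sketch of the timing arithmetic using the default values quoted above. It's a simplified model of the bounds described in this issue, not yamux code: the `timeouts` struct, `issueDefaults`, and `stallOutcome` are hypothetical names made up for illustration.

```go
package main

import (
	"fmt"
	"time"
)

// timeouts is a hypothetical stand-in holding the two values discussed
// in this issue. It is not yamux.Config.
type timeouts struct {
	keepAliveInterval      time.Duration
	connectionWriteTimeout time.Duration
}

// issueDefaults mirrors the defaults quoted in the issue (30s and 10s).
var issueDefaults = timeouts{
	keepAliveInterval:      30 * time.Second,
	connectionWriteTimeout: 10 * time.Second,
}

// stallOutcome classifies how long the send queue stays full, using the
// bounds described in the issue. Simplified model, for illustration only.
func stallOutcome(stall time.Duration, t timeouts) string {
	switch {
	case stall < t.connectionWriteTimeout:
		return "write succeeds"
	case stall < t.keepAliveInterval+t.connectionWriteTimeout:
		// The problem window: the write is dropped but nothing is closed.
		return "write times out, session survives"
	default:
		return "session closed"
	}
}

func main() {
	for _, stall := range []time.Duration{5 * time.Second, 20 * time.Second, 45 * time.Second} {
		fmt.Printf("%v stall: %s\n", stall, stallOutcome(stall, issueDefaults))
	}
}
```

With these defaults, the problem window in this model is any stall of at least 10s and under 40s.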

Since no keep alive timed out, you can continue to use the Session once the network congestion is resolved, but that Stream.Read call will only return when the Session closes or the response shows up. Since the write call on the other side timed out... well, that's a problem.
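Until something changes in the library, one reader-side mitigation (my assumption, not one of the fixes proposed here) is to bound each read with a deadline so a lost write can't block the reader forever. The sketch below uses only the standard `net.Conn` interface and demonstrates the idea with `net.Pipe` rather than a real yamux stream; `readWithDeadline`, `isTimeout`, and `silentPeerTimesOut` are hypothetical helper names.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"time"
)

// readWithDeadline bounds a single Read on any net.Conn so that a peer
// whose write was silently dropped cannot block the reader forever.
func readWithDeadline(conn net.Conn, buf []byte, limit time.Duration) (int, error) {
	if err := conn.SetReadDeadline(time.Now().Add(limit)); err != nil {
		return 0, err
	}
	// Clear the deadline afterwards so later reads are unaffected.
	defer conn.SetReadDeadline(time.Time{})
	return conn.Read(buf)
}

// isTimeout reports whether err signals an expired deadline.
func isTimeout(err error) bool {
	var ne net.Error
	return errors.As(err, &ne) && ne.Timeout()
}

// silentPeerTimesOut simulates a peer that never writes, using an
// in-memory pipe in place of a real stream.
func silentPeerTimesOut() bool {
	local, remote := net.Pipe()
	defer local.Close()
	defer remote.Close()

	_, err := readWithDeadline(local, make([]byte, 16), 50*time.Millisecond)
	return isTimeout(err)
}

func main() {
	fmt.Println("read timed out:", silentPeerTimesOut())
}
```

The caller still has to decide what a timeout means (retry, close the stream, or tear down the session), and picking a limit is application-specific.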

I can see three possible fixes, some heavier-handed than others:

  1. If Stream.Write times out, it should implicitly call Stream.Close while returning, so that any blocked calls to Stream.Read on the other side return.
    • If the call to Stream.Close fails, then Session.Close should get called, which doesn't look like it happens now either.
  2. A call to Stream.Write that times out should just close the entire Session.
  3. (The hacky fix) Make the default keep alive interval less than the write timeout
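For illustration, fix 1 (including the Session.Close fallback) could be approximated on the caller's side with a wrapper like the one below. This is a sketch under assumptions, not a patch to the library: `errWriteTimeout` is a local stand-in for ErrConnectionWriteTimeout, and `closeOnTimeout` and `fakeStream` are hypothetical types so the example runs without a real Session.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// errWriteTimeout is a hypothetical stand-in for ErrConnectionWriteTimeout.
var errWriteTimeout = errors.New("connection write timeout")

// closeOnTimeout wraps a stream-like writer. If a write times out, it closes
// the stream; if that close fails, it falls back to closing the session.
type closeOnTimeout struct {
	stream       io.WriteCloser
	closeSession func() error // optional fallback, e.g. the session's Close
}

func (c *closeOnTimeout) Write(p []byte) (int, error) {
	n, err := c.stream.Write(p)
	if errors.Is(err, errWriteTimeout) {
		if cerr := c.stream.Close(); cerr != nil && c.closeSession != nil {
			// The stream could not be closed cleanly, so escalate.
			_ = c.closeSession()
		}
	}
	// Always hand the original write result back to the caller.
	return n, err
}

// fakeStream is a test double that returns preconfigured errors.
type fakeStream struct {
	writeErr error
	closeErr error
	closed   bool
}

func (f *fakeStream) Write(p []byte) (int, error) {
	if f.writeErr != nil {
		return 0, f.writeErr
	}
	return len(p), nil
}

func (f *fakeStream) Close() error {
	f.closed = true
	return f.closeErr
}

func main() {
	fs := &fakeStream{writeErr: errWriteTimeout}
	w := &closeOnTimeout{stream: fs}
	_, err := w.Write([]byte("payload"))
	fmt.Println("write error:", err, "| stream closed:", fs.closed)
}
```

Doing this inside Stream.Write itself would be cleaner, since a wrapper only helps callers who remember to use it.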
