
In Distributor, pre-allocate buffer for reading protobufs #1719


Merged: bboreham merged 3 commits into master from dist-content-length on Oct 23, 2019

Conversation

bboreham (Contributor) commented Oct 7, 2019:

ReadAll() starts with a 512-byte buffer and then successively doubles it, copying the contents each time. If the caller supplies a size we can be much more efficient. It's not a massive win, but every little helps.

(However, doing this creates a new DoS vector: supplying a fake size that is far too big. I'm thinking we should have a max input size, similar to gRPC.)

ReadAll() will start at 512 bytes then successively double the buffer,
copying the contents each time. If the caller supplies a size we can
be much more efficient.

Signed-off-by: Bryan Boreham <[email protected]>
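Not part of this PR's diff, just a minimal sketch of the idea: size the buffer from the declared Content-Length instead of letting ReadAll() grow it from 512 bytes, and cap it to address the DoS concern above. Package and function names are illustrative.

```go
package example

import (
	"bytes"
	"fmt"
	"io"
	"net/http"
)

// readBodyPreallocated sizes the buffer from the declared Content-Length
// instead of letting ioutil.ReadAll grow it from 512 bytes, and rejects
// bodies that claim to be larger than maxSize.
func readBodyPreallocated(r *http.Request, maxSize int) ([]byte, error) {
	expected := int(r.ContentLength) // -1 when the client did not say
	if expected > maxSize {
		return nil, fmt.Errorf("received message larger than max (%d vs %d)", expected, maxSize)
	}
	var buf bytes.Buffer
	if expected > 0 {
		// One allocation of (roughly) the right size; the extra bytes.MinRead
		// lets ReadFrom hit EOF without a final growth step.
		buf.Grow(expected + bytes.MinRead)
	}
	// Never trust Content-Length alone: cap the actual read as well.
	if _, err := buf.ReadFrom(io.LimitReader(r.Body, int64(maxSize)+1)); err != nil {
		return nil, err
	}
	if buf.Len() > maxSize {
		return nil, fmt.Errorf("received message larger than max (%d vs %d)", buf.Len(), maxSize)
	}
	return buf.Bytes(), nil
}
```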
bboreham changed the title from "Pre-allocate buffer for reading protobufs, for performance" to "In Distributor, pre-allocate buffer for reading protobufs" on Oct 7, 2019
tomwilkie (Contributor) commented:
> I'm thinking we should have a max input size, similar to gRPC

+1

Do we have a benchmark that shows an improvement?

bboreham (Contributor, Author) commented Oct 7, 2019:

Nope, no benchmark. In my production distributors this ReadAll() accounts for 5.5% of all garbage, and this change brings that down by over half, so I'd expect roughly a 3% improvement overall?
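There is no benchmark in the PR, but the kind of micro-benchmark the question asks for might look like this sketch (payload size and names are made up); b.ReportAllocs() is what makes the allocation difference visible.

```go
package example

import (
	"bytes"
	"io/ioutil"
	"testing"
)

// payload stands in for a compressed remote-write request body.
var payload = bytes.Repeat([]byte("x"), 64*1024)

func BenchmarkReadAll(b *testing.B) {
	b.ReportAllocs()
	for i := 0; i < b.N; i++ {
		if _, err := ioutil.ReadAll(bytes.NewReader(payload)); err != nil {
			b.Fatal(err)
		}
	}
}

func BenchmarkPreallocated(b *testing.B) {
	b.ReportAllocs()
	for i := 0; i < b.N; i++ {
		var buf bytes.Buffer
		buf.Grow(len(payload) + bytes.MinRead) // size known up front, one allocation
		if _, err := buf.ReadFrom(bytes.NewReader(payload)); err != nil {
			b.Fatal(err)
		}
	}
}
```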

bboreham (Contributor, Author) commented Oct 7, 2019:

This is the compressed data; if we could use a pool of buffers for the uncompressed bytes, that would have a much bigger impact, but I'm not sure whether Unmarshal() retains any of the memory.
(Also, the distributor currently hands the buffer to the "billing" code, but I want to remove that anyway.)
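A rough sketch of that buffer-pool idea, assuming (which the comment above deliberately leaves open) that Unmarshal() keeps no references into the buffer after the request is handled; names and sizes are illustrative.

```go
package example

import "sync"

// uncompressedPool recycles scratch buffers for the decompressed bytes.
// This is only safe if nothing (Unmarshal, billing, ...) holds on to the
// slice after the request has been handled.
var uncompressedPool = sync.Pool{
	New: func() interface{} { return make([]byte, 0, 1<<20) }, // 1 MiB starting capacity, arbitrary
}

func getUncompressedBuf() []byte  { return uncompressedPool.Get().([]byte)[:0] }
func putUncompressedBuf(b []byte) { uncompressedPool.Put(b) }
```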

khaines (Contributor) commented Oct 7, 2019:

> I'm thinking we should have a max input size, similar to gRPC

also +1 on this.

And extend the test to check it fires.

Signed-off-by: Bryan Boreham <[email protected]>
bboreham force-pushed the dist-content-length branch from 3703686 to 99fd4d3 on October 18, 2019
bboreham (Contributor, Author) commented:
I added a size limit.
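"Extend the test to check it fires" suggests a test of roughly this shape; the size-limiting handler below is a stand-in built on http.MaxBytesReader, not the distributor handler from this change, and all names are illustrative.

```go
package example

import (
	"bytes"
	"io/ioutil"
	"net/http"
	"net/http/httptest"
	"testing"
)

// limitedHandler is an illustrative stand-in for the push handler: it
// refuses to read more than maxSize bytes from the request body.
func limitedHandler(maxSize int64) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		r.Body = http.MaxBytesReader(w, r.Body, maxSize)
		if _, err := ioutil.ReadAll(r.Body); err != nil {
			http.Error(w, err.Error(), http.StatusRequestEntityTooLarge)
			return
		}
		w.WriteHeader(http.StatusOK)
	})
}

func TestSizeLimitFires(t *testing.T) {
	const maxRecvMsgSize = 1024
	body := bytes.Repeat([]byte("a"), maxRecvMsgSize+1)
	req := httptest.NewRequest(http.MethodPost, "/push", bytes.NewReader(body))
	rec := httptest.NewRecorder()

	limitedHandler(maxRecvMsgSize).ServeHTTP(rec, req)

	if rec.Code == http.StatusOK {
		t.Fatalf("expected a body over %d bytes to be rejected", maxRecvMsgSize)
	}
}
```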

khaines (Contributor) left a comment:

LGTM

bboreham merged commit bc76736 into master on Oct 23, 2019
bboreham deleted the dist-content-length branch on October 23, 2019
@@ -125,6 +125,7 @@ type Config struct {

HATrackerConfig HATrackerConfig `yaml:"ha_tracker,omitempty"`

MaxRecvMsgSize int `yaml:"max_send_msg_size"`
Contributor:

The yaml key here is max_send_msg_size, while the CLI flag is max-recv-msg-size. Isn't that contradictory?
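Presumably the consistent version names both after "recv", along the lines of the sketch below; the flag prefix and default value are illustrative, not taken from the tree.

```go
package example

import "flag"

// Config is trimmed to the one field under discussion.
type Config struct {
	MaxRecvMsgSize int `yaml:"max_recv_msg_size"`
}

func (cfg *Config) RegisterFlags(f *flag.FlagSet) {
	// Hypothetical registration: yaml key and CLI flag both named after "recv".
	f.IntVar(&cfg.MaxRecvMsgSize, "distributor.max-recv-msg-size", 100<<20,
		"Maximum size (bytes) of a request body the distributor will accept.")
}
```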

Contributor:

Good catch, could you open a PR/issue so that we don't lose track of this?

Contributor:

Sure: #1755

bboreham (Contributor, Author):

oops, sorry!
