Ingester flush should balance across tables and hashes #724

Closed
bboreham opened this issue Feb 26, 2018 · 2 comments

Comments

@bboreham
Contributor

Currently, flushing is driven by a priority queue ordered by the start time of the first unflushed chunk in each series.

This ordering takes no account of which table a chunk will be written to, which is especially bad when we’ve scaled down write capacity on the previous table and someone sends samples from that period.

It’s also bad when many timeseries map to the same index hash key (i.e. the same metric name for the same user on the same day), because that creates a hot spot in the data store.

It would be better to have a separate queue for each table, and to send a spread of data that is known to have different hash keys. Even better would be to batch the writes to reduce network overhead to the back end.
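
A minimal sketch of that idea in Go, under stated assumptions: the names `chunkRef`, `flushQueues`, `hashKey` and `nextBatch` are hypothetical and are not existing ingester APIs, and the hash key is assumed to be built from user, day and metric name as described above. Each table gets its own queue, and a batch is filled round-robin across hash keys, so a batch only repeats a key once every other pending key has had a turn.

```go
package main

import "fmt"

// chunkRef is a hypothetical stand-in for a chunk waiting to be flushed.
type chunkRef struct {
	Table   string // destination table (assumed: one per time period)
	HashKey string // index hash key the chunk's entries will land on
	Series  string // series identifier, for illustration only
}

// hashKey is an assumed key layout: same user, day and metric name
// always produce the same key, which is what causes the hot spot.
func hashKey(user, day, metric string) string {
	return fmt.Sprintf("%s:%s:%s", user, day, metric)
}

// tableQueue holds the pending chunks for one table, grouped by hash key.
type tableQueue struct {
	byHash map[string][]chunkRef
	order  []string // hash keys with pending chunks, in round-robin order
}

// flushQueues keeps one independent queue per table.
type flushQueues struct {
	tables map[string]*tableQueue
}

func newFlushQueues() *flushQueues {
	return &flushQueues{tables: map[string]*tableQueue{}}
}

// enqueue files a chunk under its table and hash key.
func (f *flushQueues) enqueue(c chunkRef) {
	q, ok := f.tables[c.Table]
	if !ok {
		q = &tableQueue{byHash: map[string][]chunkRef{}}
		f.tables[c.Table] = q
	}
	if _, seen := q.byHash[c.HashKey]; !seen {
		q.order = append(q.order, c.HashKey)
	}
	q.byHash[c.HashKey] = append(q.byHash[c.HashKey], c)
}

// nextBatch returns up to max chunks for one table. It takes one chunk
// from the hash key at the front of the rotation, then moves that key to
// the back if it still has pending chunks.
func (f *flushQueues) nextBatch(table string, max int) []chunkRef {
	q, ok := f.tables[table]
	if !ok {
		return nil
	}
	var batch []chunkRef
	for len(batch) < max && len(q.order) > 0 {
		key := q.order[0]
		q.order = q.order[1:]
		pending := q.byHash[key]
		batch = append(batch, pending[0])
		if len(pending) > 1 {
			q.byHash[key] = pending[1:]
			q.order = append(q.order, key)
		} else {
			delete(q.byHash, key)
		}
	}
	return batch
}

func main() {
	f := newFlushQueues()
	hot := hashKey("user1", "d17588", "http_requests_total")
	f.enqueue(chunkRef{Table: "week_2513", HashKey: hot, Series: "a1"})
	f.enqueue(chunkRef{Table: "week_2513", HashKey: hot, Series: "a2"})
	f.enqueue(chunkRef{Table: "week_2513", HashKey: hashKey("user1", "d17588", "up"), Series: "b1"})
	f.enqueue(chunkRef{Table: "week_2512", HashKey: hashKey("user2", "d17581", "up"), Series: "c1"})

	// Each table is drained separately, so a throttled older table
	// does not hold up writes to the current one.
	for _, table := range []string{"week_2513", "week_2512"} {
		for _, c := range f.nextBatch(table, 2) {
			fmt.Println(table, c.HashKey, c.Series)
		}
	}
}
```

The table names and day buckets in `main` are made up for the example. A real implementation would also need locking, a per-table rate limit matched to provisioned write capacity, and some ordering by age within each hash key; none of that is shown here.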

@gouthamve
Contributor

related #684

@bboreham added the storage/chunks (Chunks storage engine) label on Jun 10, 2021
@alanprot
Member

alanprot commented Aug 9, 2022

Closing as chunks is deprecated.

@alanprot closed this as completed on Aug 9, 2022