
Cache Expended Posting on ingesters #6296


Merged
merged 10 commits into from
Nov 5, 2024

Conversation

alanprot
Member

@alanprot alanprot commented Oct 30, 2024

What this PR does:
This PR introduces caching of expanded postings, similar to the approach used in SG, as seen in this change.

  • Key Updates:
    1. Compacted Block Caching:
      • Expanded postings are cached for all queries against compacted blocks. This is safe because compacted blocks are immutable.
    2. Head Block Caching:
      • Caching is applied only to select queries that contain an equality matcher on the metric name (__name__).
        For such cases, a seed is added to the cache key, allowing invalidation whenever a new series is added or removed for a given metric name.
      • Example:
        • Suppose the seed for cpu_utilization is 21. Queries would generate cache keys as follows:
          • sum(cpu_utilization{service="A"}) => Key: 21|head|__name__=cpu_utilization|service=A
          • sum(cpu_utilization{service="A", op="GET"}) => Key: 21|head|__name__=cpu_utilization|service=A|op=GET
    3. Head Cache Invalidation:
      On data ingestion, if a new series is added with cpu_utilization as the metric name, the seed changes, invalidating all cached entries for that metric.
    4. Memory Bound for Seeds:
      To control memory usage, a fixed-size seed slice is used. Metric names are hashed to determine their respective seed.
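The seed-based keying and invalidation described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the actual Cortex implementation; the names `seedByHash`, `headCacheKey`, and the FNV hash choice are hypothetical:

```go
package main

import (
	"fmt"
	"hash/fnv"
	"strings"
	"sync/atomic"
)

// seedByHash holds a fixed-size slice of seeds to bound memory usage.
// Metric names are hashed into a slot, so distinct metrics may share a
// seed (bumping one then also invalidates the other's entries).
type seedByHash struct {
	seeds []int64
}

func newSeedByHash(size int) *seedByHash {
	return &seedByHash{seeds: make([]int64, size)}
}

func (s *seedByHash) index(metricName string) int {
	h := fnv.New32a()
	h.Write([]byte(metricName))
	return int(h.Sum32()) % len(s.seeds)
}

// Seed returns the current seed for a metric name.
func (s *seedByHash) Seed(metricName string) int64 {
	return atomic.LoadInt64(&s.seeds[s.index(metricName)])
}

// Bump changes the seed for a metric name, invalidating every cached
// head-block key that embedded the old seed.
func (s *seedByHash) Bump(metricName string) {
	atomic.AddInt64(&s.seeds[s.index(metricName)], 1)
}

// headCacheKey builds a key like "21|head|__name__=cpu_utilization|service=A".
func headCacheKey(seed int64, matchers []string) string {
	return fmt.Sprintf("%d|head|%s", seed, strings.Join(matchers, "|"))
}

func main() {
	seeds := newSeedByHash(1024)
	matchers := []string{"__name__=cpu_utilization", "service=A"}

	fmt.Println(headCacheKey(seeds.Seed("cpu_utilization"), matchers))
	// → 0|head|__name__=cpu_utilization|service=A

	// A new series for cpu_utilization is ingested: bump the seed, so
	// subsequent lookups produce a different key and miss the cache.
	seeds.Bump("cpu_utilization")
	fmt.Println(headCacheKey(seeds.Seed("cpu_utilization"), matchers))
	// → 1|head|__name__=cpu_utilization|service=A
}
```

The fixed-size slice trades occasional spurious invalidations (two metric names hashing to the same slot) for a hard memory bound, which is the design choice point 4 describes.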

In load testing, this change demonstrated significant CPU and latency improvements on ingesters experiencing heavy query loads from rulers. In such cases, the same queries are evaluated constantly (every evaluation interval), and the postings are expanded for every iteration, leading to unnecessarily high CPU usage.

Here are some graphs:
Screenshot 2024-11-01 at 2 40 32 PM

Flame graphs before and after (we can see that after the change, CPU is spent mostly in the Push method, whereas before it was spent mostly in the QueryStream method):

Screenshot 2024-11-01 at 2 44 28 PM

Screenshot 2024-11-01 at 2 42 57 PM

The idea was inspired by thanos-io/thanos#6420 and by the promise-based cache implementation in grafana/mimir-prometheus#14.
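A promise-based cache in the spirit of that implementation can be sketched as below. This is a simplified, hypothetical illustration (the real Thanos/mimir-prometheus code differs, handles errors, eviction, and context cancellation): concurrent queries for the same key expand postings only once, and latecomers wait on the first caller's result.

```go
package main

import (
	"fmt"
	"sync"
)

// promise represents an in-flight or completed computation for one key.
// Waiters block on done until the first caller stores val and closes it.
type promise struct {
	done chan struct{}
	val  []int // hypothetical stand-in for expanded postings (series refs)
}

type promiseCache struct {
	mu       sync.Mutex
	promises map[string]*promise
}

func newPromiseCache() *promiseCache {
	return &promiseCache{promises: map[string]*promise{}}
}

// Get returns the value for key, computing it via fn only for the first
// caller; all concurrent callers for the same key share one computation.
func (c *promiseCache) Get(key string, fn func() []int) []int {
	c.mu.Lock()
	p, ok := c.promises[key]
	if !ok {
		p = &promise{done: make(chan struct{})}
		c.promises[key] = p
		c.mu.Unlock()
		p.val = fn() // only this goroutine runs the expensive expansion
		close(p.done)
		return p.val
	}
	c.mu.Unlock()
	<-p.done // later callers block until the first one finishes
	return p.val
}

func main() {
	cache := newPromiseCache()
	calls := 0
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			// e.g. four ruler queries hitting the same cache key at once
			cache.Get("21|head|__name__=cpu_utilization", func() []int {
				calls++ // safe: only the promise creator runs fn
				return []int{1, 2, 3}
			})
		}()
	}
	wg.Wait()
	fmt.Println(calls) // → 1: postings were expanded only once
}
```

This matches the ruler workload described above: the same queries arrive every evaluation interval, so deduplicating the expansion work per key removes most of the repeated CPU cost.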

Which issue(s) this PR fixes:
Fixes #

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

@alanprot alanprot force-pushed the cache-postings branch 14 times, most recently from d33a2a8 to 9cdddd1 Compare November 2, 2024 00:01
@alanprot alanprot changed the title [Wip] Cache Expended Posting on ingesters Cache Expended Posting on ingesters Nov 2, 2024
@alanprot alanprot marked this pull request as ready for review November 2, 2024 00:19
@alanprot alanprot force-pushed the cache-postings branch 2 times, most recently from a5e3830 to 6937829 Compare November 2, 2024 01:05
@justinjung04
Contributor

I love the test results! Thanks for doing this!!

@yeya24 yeya24 mentioned this pull request Nov 2, 2024
3 tasks
@alanprot alanprot force-pushed the cache-postings branch 2 times, most recently from d443d80 to 42e2235 Compare November 4, 2024 22:48
alanprot and others added 3 commits November 5, 2024 10:33
…ready found keys. (cortexproject#6312)

* Creating a test to show the race on the multilevel cache

Signed-off-by: alanprot <[email protected]>

* fix the race problem

* Only fetch keys that were not found on the previous cache

Signed-off-by: alanprot <[email protected]>

---------

Signed-off-by: alanprot <[email protected]>
Signed-off-by: alanprot <[email protected]>
@dosubot dosubot bot added the lgtm This PR has been approved by a maintainer label Nov 5, 2024
@alanprot alanprot merged commit 3267515 into cortexproject:master Nov 5, 2024
16 checks passed
CharlieTLe pushed a commit to CharlieTLe/cortex that referenced this pull request Dec 3, 2024
* Implementing Expanded Postings Cache

Signed-off-by: alanprot <[email protected]>

* small nit

Signed-off-by: alanprot <[email protected]>

* refactoring the cache so we dont need to call expire on every request

Signed-off-by: alanprot <[email protected]>

* Update total cache size when updating the item

Signed-off-by: alanprot <[email protected]>

* Fix fuzzy test after change the flag name

Signed-off-by: alanprot <[email protected]>

* remove max item config + create a new test case with only head cache enabled

Signed-off-by: alanprot <[email protected]>

* Documenting enabled as first field on the config

Signed-off-by: alanprot <[email protected]>

* Fix race on chunks multilevel cache + Optimize to avoid refetching already found keys.  (cortexproject#6312)

* Creating a test to show the race on the multilevel cache

Signed-off-by: alanprot <[email protected]>

* fix the race problem

* Only fetch keys that were not found on the previous cache

Signed-off-by: alanprot <[email protected]>

---------

Signed-off-by: alanprot <[email protected]>

* Improve Doc

Signed-off-by: alanprot <[email protected]>

* create new cortex_ingester_expanded_postings_non_cacheable_queries metric

Signed-off-by: alanprot <[email protected]>

---------

Signed-off-by: alanprot <[email protected]>
Labels
component/ingester lgtm This PR has been approved by a maintainer size/XXL type/performance
3 participants