
Add planner filter #4318


Closed · wants to merge 3 commits

Conversation

@ac1214 (Contributor) commented Jun 24, 2021

Signed-off-by: Albert <[email protected]>

What this PR does:

Implements generation of parallelizable plans for the proposal outlined in #4272. Currently, the parallelizable plans are generated, but only the first plan in the list is selected to run.
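
As a point of reference, a minimal sketch of that behaviour (generate every parallelizable plan, but execute only the first) could look like the following Go snippet. All names here (plannerFilter, blockPlan, generateParallelizablePlans) are hypothetical and not taken from this PR:

```go
// Sketch only: illustrates "generate all parallelizable plans, run the first".
package compactor

import "context"

// blockPlan is one set of blocks to compact together; plans in the same slice
// are independent of each other and could be executed in parallel.
type blockPlan []string // block ULIDs

type plannerFilter struct {
	// generateParallelizablePlans stands in for the PR's plan-generation logic.
	generateParallelizablePlans func(ctx context.Context) ([]blockPlan, error)
}

// nextPlan generates every parallelizable plan but returns only the first one
// for execution, matching the current behaviour described above.
func (f *plannerFilter) nextPlan(ctx context.Context) (blockPlan, error) {
	plans, err := f.generateParallelizablePlans(ctx)
	if err != nil {
		return nil, err
	}
	if len(plans) == 0 {
		return nil, nil // nothing to compact
	}
	return plans[0], nil
}
```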

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

Signed-off-by: Albert <[email protected]>
@ac1214 (Contributor, Author) commented Jun 28, 2021

Testing done for these changes

To test these changes, I collected Prometheus metrics for a couple of days without running any compaction. After generating some uncompacted blocks, I saved them so that the same compaction could be replicated multiple times, and I downloaded the blocks from S3 so that testing could be repeated without deleting the blocks in S3.
The testing I completed to verify that this compaction method works is described below.

  • Queried the same Prometheus metrics before and after compaction and compared the query results to ensure that the data remained the same.
  • Ran Prometheus’ tsdb analyze tool to compare the data in the blocks before and after compaction.
  • Compared the blocks produced by the current compaction method with those produced by the compaction/grouping method introduced in this PR. I matched blocks between the two compactions by their minTime, maxTime, and stats attributes, then compared the md5 sums of the chunks of the matching blocks to ensure the data was identical (see the sketch below).
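
To illustrate that last step, a small standalone helper (not part of this PR; directory names are placeholders) can hash every file under a block's chunks/ directory so that two matched blocks can be compared:

```go
// Sketch only: compare md5 sums of chunk files between two TSDB blocks.
package main

import (
	"crypto/md5"
	"fmt"
	"io"
	"os"
	"path/filepath"
)

// chunkHashes returns the md5 sum of every file under <blockDir>/chunks.
func chunkHashes(blockDir string) (map[string]string, error) {
	sums := map[string]string{}
	chunksDir := filepath.Join(blockDir, "chunks")
	err := filepath.Walk(chunksDir, func(path string, info os.FileInfo, err error) error {
		if err != nil || info.IsDir() {
			return err
		}
		f, err := os.Open(path)
		if err != nil {
			return err
		}
		defer f.Close()
		h := md5.New()
		if _, err := io.Copy(h, f); err != nil {
			return err
		}
		rel, _ := filepath.Rel(chunksDir, path)
		sums[rel] = fmt.Sprintf("%x", h.Sum(nil))
		return nil
	})
	return sums, err
}

func main() {
	// Placeholder paths: blocks matched on minTime, maxTime, and stats.
	oldSums, err := chunkHashes("old-compaction/block-a")
	if err != nil {
		panic(err)
	}
	newSums, err := chunkHashes("new-compaction/block-b")
	if err != nil {
		panic(err)
	}
	for name, sum := range oldSums {
		if newSums[name] != sum {
			fmt.Printf("chunk %s differs\n", name)
		}
	}
}
```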

Signed-off-by: Albert <[email protected]>
@bboreham (Contributor) commented

Sorry, but we have begun the process of cutting a new release; please rebase from master and move your CHANGELOG entry to the top under ## master / unreleased

Signed-off-by: Albert <[email protected]>
@@ -146,6 +149,7 @@ func (cfg *Config) RegisterFlags(f *flag.FlagSet) {
"If 0, blocks will be deleted straight away. Note that deleting blocks immediately can cause query failures.")
f.DurationVar(&cfg.TenantCleanupDelay, "compactor.tenant-cleanup-delay", 6*time.Hour, "For tenants marked for deletion, this is time between deleting of last block, and doing final cleanup (marker files, debug files) of the tenant.")
f.BoolVar(&cfg.BlockDeletionMarksMigrationEnabled, "compactor.block-deletion-marks-migration-enabled", true, "When enabled, at compactor startup the bucket will be scanned and all found deletion marks inside the block location will be copied to the markers global location too. This option can (and should) be safely disabled as soon as the compactor has successfully run at least once.")
f.BoolVar(&cfg.PlannerFilterEnabled, "compactor.planner-filter-enabled", false, "Filter and plan blocks within PlannerFilter instead of through Thanos planner and grouper.")
Review comment (Contributor):

I don't think the config option should be that specific (what this CLI flag describes is an internal implementation detail). The whole purpose of the #4272 proposal is to introduce a different sharding strategy for the compactor. To keep it consistent with other Cortex services, the config option could be compactor.sharding-strategy with values default (the current one) and shuffle-sharding (the new one you're working on).
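
As an illustration of that suggestion, the option could be wired up roughly like the existing flags in this file. This is only a sketch of the reviewer's idea; the ShardingStrategy field and the constants are assumptions, not code from this PR:

```go
// Sketch only: a compactor.sharding-strategy flag as suggested above.
package compactor

import (
	"flag"
	"fmt"
)

const (
	shardingStrategyDefault         = "default"
	shardingStrategyShuffleSharding = "shuffle-sharding"
)

type Config struct {
	ShardingStrategy string
}

func (cfg *Config) RegisterFlags(f *flag.FlagSet) {
	f.StringVar(&cfg.ShardingStrategy, "compactor.sharding-strategy", shardingStrategyDefault,
		fmt.Sprintf("The sharding strategy to use. Supported values are: %s, %s.",
			shardingStrategyDefault, shardingStrategyShuffleSharding))
}

// Validate rejects unknown strategies at startup.
func (cfg *Config) Validate() error {
	switch cfg.ShardingStrategy {
	case shardingStrategyDefault, shardingStrategyShuffleSharding:
		return nil
	default:
		return fmt.Errorf("unsupported sharding strategy: %q", cfg.ShardingStrategy)
	}
}
```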

level.Info(c.logger).Log("msg", "Compactor using planner filter")

// Create a new planner filter
f, err := NewPlannerFilter(
Review comment (Contributor):

I don't think this is the right way to build it. It's not the responsibility of the metadata fetcher to run the planning and filter out blocks belonging to other shards; that's not how the compactor was designed. You should build this feature on the compactor grouper and planner.
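
A minimal sketch of that direction, doing the shard filtering in a planner wrapper rather than in the metadata fetcher. The Planner interface below is only an approximation (the real Thanos/Cortex interfaces differ), and shardAwarePlanner, ownsBlock, and blockMeta are hypothetical names:

```go
// Sketch only: filter out blocks owned by other shards before planning.
package compactor

import "context"

// blockMeta stands in for the block metadata type used by the compactor.
type blockMeta struct {
	ULID string
}

// Planner approximates the planner abstraction used by the compactor.
type Planner interface {
	Plan(ctx context.Context, metasByMinTime []*blockMeta) ([]*blockMeta, error)
}

// shardAwarePlanner drops blocks owned by other compactor instances and then
// delegates to the wrapped planner.
type shardAwarePlanner struct {
	wrapped   Planner
	ownsBlock func(ulid string) (bool, error)
}

func (p *shardAwarePlanner) Plan(ctx context.Context, metas []*blockMeta) ([]*blockMeta, error) {
	owned := make([]*blockMeta, 0, len(metas))
	for _, m := range metas {
		ok, err := p.ownsBlock(m.ULID)
		if err != nil {
			return nil, err
		}
		if ok {
			owned = append(owned, m)
		}
	}
	return p.wrapped.Plan(ctx, owned)
}
```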

@ac1214 closed this Jul 10, 2021
@ac1214 mentioned this pull request Jul 14, 2021