Distributing sum queries #1878
Merged
pracucci
merged 26 commits into
cortexproject:master
from
owen-d:feature/query-sharding-squashed
Feb 21, 2020
de8f330
querier.sum-shards
owen-d 5fb7c29
addresses pr comments
owen-d afdd99d
instruments frontend sharding, splitby
owen-d 6bf5f20
LabelsSeriesID unexported again
owen-d 3e28014
removes unnecessary codec interface in astmapping
owen-d 5aa58bc
simplifies VectorSquasher as we never use matrices
owen-d 26e488b
combines queryrange series & value files
owen-d ba07d5b
removes noops struct embedding strategy in schema, provides noop impl…
owen-d 025f87d
NewSubtreeFolder no longer can return an error as it inlines the json…
owen-d 09ac713
account for QueryIngestersWithin renaming
owen-d dc629c1
fixes rebase import collision
owen-d b9a2b67
fixes rebase conflicts
owen-d 5ac6309
-marks absent as non parallelizable
owen-d ddd47b4
upstream promql compatibility changes
owen-d 07e894c
addresses pr comments
owen-d 4c6f40b
import collisions
owen-d 42851ff
linting - fixes goimports -local requirement
owen-d a4dcfff
Merge remote-tracking branch 'upstream/master' into feature/query-sha…
owen-d 7347d45
fixes merge conflicts
owen-d 10dd597
addresses pr comments
owen-d a309f62
stylistic changes
owen-d bc3f7f5
s/downstream/sharded/
owen-d 1a159f0
s/sum_shards/parallelise_shardable_queries/
owen-d 65718db
query-audit docs
owen-d 7e56489
notes sharded parallelizations are only supported by chunk store
owen-d e6cb0b1
doc suggestions
owen-d
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,140 @@ | ||
--- | ||
title: "Query Auditor (tool)" | ||
linkTitle: "Query Auditor (tool)" | ||
weight: 2 | ||
slug: query-auditor | ||
--- | ||
|
||
The query auditor is a tool bundled in the Cortex repository but **not** included in the Docker images; it must be built from source. It's primarily useful for those _developing_ Cortex, but it can also help operators in certain scenarios, such as backend migrations. | ||
|
||
## How it works | ||
|
||
The `query-audit` tool runs a set of queries against two backends that expose the Prometheus read API, generally the `query-frontend` components of two Cortex deployments. It then compares the responses to determine the average difference for each query. It does this by: | ||
|
||
- Ensuring the resulting label sets match. | ||
- For each label set, ensuring it contains the same number of samples as its pair from the other backend. | ||
- For each sample, calculating its difference against its pair from the other backend's matching label set. | ||
- Calculating the average diff per query from the above diffs. | ||
|
||
|
||
### Limitations | ||
|
||
It currently only supports queries with `Matrix` response types. | ||
|
||
### Use cases | ||
|
||
- Correctness testing when working on the read path. | ||
- Comparing results from different backends. | ||
|
||
### Example Configuration | ||
|
||
```yaml | ||
control: | ||
  host: http://localhost:8080/api/prom | ||
  headers: | ||
    "X-Scope-OrgID": 1234 | ||
|
||
test: | ||
  host: http://localhost:8081/api/prom | ||
  headers: | ||
    "X-Scope-OrgID": 1234 | ||
|
||
queries: | ||
  - query: 'sum(rate(container_cpu_usage_seconds_total[5m]))' | ||
    start: 2019-11-25T00:00:00Z | ||
    end: 2019-11-28T00:00:00Z | ||
    step_size: 15m | ||
  - query: 'sum(rate(container_cpu_usage_seconds_total[5m])) by (container_name)' | ||
    start: 2019-11-25T00:00:00Z | ||
    end: 2019-11-28T00:00:00Z | ||
    step_size: 15m | ||
  - query: 'sum(rate(container_cpu_usage_seconds_total[5m])) without (container_name)' | ||
    start: 2019-11-25T00:00:00Z | ||
    end: 2019-11-26T00:00:00Z | ||
    step_size: 15m | ||
  - query: 'histogram_quantile(0.9, sum(rate(cortex_cache_value_size_bytes_bucket[5m])) by (le, job))' | ||
    start: 2019-11-25T00:00:00Z | ||
    end: 2019-11-25T06:00:00Z | ||
    step_size: 15m | ||
  # two shardable legs | ||
  - query: 'sum without (instance, job) (rate(cortex_query_frontend_queue_length[5m])) or sum by (job) (rate(cortex_query_frontend_queue_length[5m]))' | ||
    start: 2019-11-25T00:00:00Z | ||
    end: 2019-11-25T06:00:00Z | ||
    step_size: 15m | ||
  # one shardable leg | ||
  - query: 'sum without (instance, job) (rate(cortex_cache_request_duration_seconds_count[5m])) or rate(cortex_cache_request_duration_seconds_count[5m])' | ||
    start: 2019-11-25T00:00:00Z | ||
    end: 2019-11-25T06:00:00Z | ||
    step_size: 15m | ||
``` | ||
|
||
### Example Output | ||
|
||
Under ideal circumstances, you'll see output like the following: | ||
|
||
``` | ||
$ go run ./tools/query-audit/ -f config.yaml | ||
|
||
0.000000% avg diff for: | ||
query: sum(rate(container_cpu_usage_seconds_total[5m])) | ||
series: 1 | ||
samples: 289 | ||
start: 2019-11-25 00:00:00 +0000 UTC | ||
end: 2019-11-28 00:00:00 +0000 UTC | ||
step: 15m0s | ||
|
||
0.000000% avg diff for: | ||
query: sum(rate(container_cpu_usage_seconds_total[5m])) by (container_name) | ||
series: 95 | ||
samples: 25877 | ||
start: 2019-11-25 00:00:00 +0000 UTC | ||
end: 2019-11-28 00:00:00 +0000 UTC | ||
step: 15m0s | ||
|
||
0.000000% avg diff for: | ||
query: sum(rate(container_cpu_usage_seconds_total[5m])) without (container_name) | ||
series: 4308 | ||
samples: 374989 | ||
start: 2019-11-25 00:00:00 +0000 UTC | ||
end: 2019-11-26 00:00:00 +0000 UTC | ||
step: 15m0s | ||
|
||
0.000000% avg diff for: | ||
query: histogram_quantile(0.9, sum(rate(cortex_cache_value_size_bytes_bucket[5m])) by (le, job)) | ||
series: 13 | ||
samples: 325 | ||
start: 2019-11-25 00:00:00 +0000 UTC | ||
end: 2019-11-25 06:00:00 +0000 UTC | ||
step: 15m0s | ||
|
||
0.000000% avg diff for: | ||
query: sum without (instance, job) (rate(cortex_query_frontend_queue_length[5m])) or sum by (job) (rate(cortex_query_frontend_queue_length[5m])) | ||
series: 21 | ||
samples: 525 | ||
start: 2019-11-25 00:00:00 +0000 UTC | ||
end: 2019-11-25 06:00:00 +0000 UTC | ||
step: 15m0s | ||
|
||
0.000000% avg diff for: | ||
query: sum without (instance, job) (rate(cortex_cache_request_duration_seconds_count[5m])) or rate(cortex_cache_request_duration_seconds_count[5m]) | ||
series: 942 | ||
samples: 23550 | ||
start: 2019-11-25 00:00:00 +0000 UTC | ||
end: 2019-11-25 06:00:00 +0000 UTC | ||
step: 15m0s | ||
|
||
0.000000% avg diff for: | ||
query: sum by (namespace) (predict_linear(container_cpu_usage_seconds_total[5m], 10)) | ||
series: 16 | ||
samples: 400 | ||
start: 2019-11-25 00:00:00 +0000 UTC | ||
end: 2019-11-25 06:00:00 +0000 UTC | ||
step: 15m0s | ||
|
||
0.000000% avg diff for: | ||
query: sum by (namespace) (avg_over_time((rate(container_cpu_usage_seconds_total[5m]))[10m:]) > 1) | ||
series: 4 | ||
samples: 52 | ||
start: 2019-11-25 00:00:00 +0000 UTC | ||
end: 2019-11-25 01:00:00 +0000 UTC | ||
step: 5m0s | ||
``` |