fix regression of result cache unable to parse cached results #6196

Merged
1 commit merged into cortexproject:master from fix-query-range-cache
Sep 8, 2024

yeya24
Contributor

@yeya24 yeya24 commented Sep 5, 2024

What this PR does:

This PR fixes a regression introduced in #6180.

ts=2024-09-05T06:02:52.857844346Z caller=handler.go:409 level=error  msg="query stats" component=query-frontend method=GET path=/prometheus/api/v1/query_range error="any: message type \"queryrange.PrometheusResponse\" isn't linked in" param_stats=all param_step=60 param_end=2024-09-04T06:08:00Z param_start=2024-09-04T06:02:00Z param_query=some_query

The error `any: message type "queryrange.PrometheusResponse" isn't linked in` comes from https://github.com/cortexproject/cortex/blob/v1.18.0/pkg/querier/tripperware/queryrange/results_cache.go#L733. When a query response is serialized into cacheable content, the protobuf type name is encoded along with it. When the cached response is later fetched and deserialized, the queryrange.PrometheusResponse type must still exist so the decoder knows how to unmarshal it.
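
To illustrate the mechanism, here is a minimal sketch using only the standard library. It is not the protobuf or Cortex code: `envelope`, `registry`, `pack`, and `unpack` are hypothetical stand-ins for a protobuf `Any` and its type registry, JSON replaces the protobuf wire format, and the error text is copied from the log above only for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// envelope is a simplified stand-in for a protobuf Any:
// the name of the payload type plus the encoded payload.
type envelope struct {
	TypeName string `json:"type_name"`
	Value    []byte `json:"value"`
}

// PrometheusResponse is a hypothetical, trimmed-down response type.
type PrometheusResponse struct {
	Status string `json:"status"`
}

// registry maps type names to constructors, mimicking a type registry.
var registry = map[string]func() interface{}{}

// pack stores the encoded payload together with the name of its type.
func pack(typeName string, msg interface{}) (envelope, error) {
	b, err := json.Marshal(msg)
	if err != nil {
		return envelope{}, err
	}
	return envelope{TypeName: typeName, Value: b}, nil
}

// unpack resolves the type name in the registry before decoding.
// An unknown name fails even though the payload bytes are intact.
func unpack(e envelope) (interface{}, error) {
	newMsg, ok := registry[e.TypeName]
	if !ok {
		return nil, fmt.Errorf("any: message type %q isn't linked in", e.TypeName)
	}
	msg := newMsg()
	if err := json.Unmarshal(e.Value, msg); err != nil {
		return nil, err
	}
	return msg, nil
}

func main() {
	cached, _ := pack("queryrange.PrometheusResponse", &PrometheusResponse{Status: "success"})

	// The type is not registered, as after its removal: decoding fails.
	_, err := unpack(cached)
	fmt.Println(err)

	// Registering the type again makes the same cached entry readable.
	registry["queryrange.PrometheusResponse"] = func() interface{} { return &PrometheusResponse{} }
	msg, err := unpack(cached)
	fmt.Println(msg.(*PrometheusResponse).Status, err)
}
```

The point of the sketch is that the cached bytes are not the problem; the lookup by type name is what fails once the type is gone.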

The type was removed in the above PR. To fix this issue, we have to reintroduce the queryrange.PrometheusResponse type for compatibility. The cache should be able to handle the different cached result types and always convert the result to the new type before forwarding the response downstream.

Besides, I also make sure that we always write the old response format to the cache for backward compatibility, so that a rollback does not break because the old image cannot parse the cached results.
The idea is to introduce a flag later so that users can migrate with a two-phase deployment.

Which issue(s) this PR fixes:
Fixes #

The changelog entry is omitted because the regression has not been released yet.

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

@yeya24 yeya24 force-pushed the fix-query-range-cache branch from 670658d to 286bb23 Compare September 6, 2024 01:06
@yeya24 yeya24 force-pushed the fix-query-range-cache branch from 286bb23 to 4273be3 Compare September 6, 2024 01:32
@CharlieTLe
Member

Thanks @yeya24! Could you please update the change log as well?

@yeya24
Contributor Author

yeya24 commented Sep 8, 2024

I think a changelog entry is not needed in this case, as the issue was introduced in master and has not been released.

@yeya24
Contributor Author

yeya24 commented Sep 8, 2024

Will merge as it is

@yeya24 yeya24 merged commit bc69e73 into cortexproject:master Sep 8, 2024
16 checks passed
@yeya24 yeya24 deleted the fix-query-range-cache branch September 9, 2024 00:41
3 participants