@alanprot alanprot commented May 28, 2025

What this PR does:

Introduces several improvements to the Parquet implementation:

  • Adds fallback to the blocks store queryable when vertical sharding is enabled, as this is currently unsupported in the Parquet queryable.
  • Sorts blocks by timestamp in the converter to prioritize recent blocks during conversion.
  • Adds support for configuring maxRowGroupSize and a flag to enable or disable the on-disk buffer in the converter.
  • Randomizes tenant order during conversion to improve fairness.
  • Pulls in the latest parquet common.

Which issue(s) this PR fixes:
Part of #6712

Checklist

  • Tests updated
  • [NA] Documentation added
  • [NA] CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

@alanprot alanprot marked this pull request as ready for review May 28, 2025 00:06
@dosubot dosubot bot added component/querier storage/blocks Blocks storage engine labels May 28, 2025
@alanprot alanprot merged commit 469ae45 into cortexproject:master May 29, 2025
17 checks passed
@alanprot alanprot deleted the fix-parquet branch May 29, 2025 01:39