Update the CAGGS docs with the end_offset within the current bucket nuance #3991

Merged
merged 10 commits into from
Apr 15, 2025

Conversation

atovpeko
Contributor

@atovpeko atovpeko commented Apr 4, 2025

No description provided.

github-actions bot commented Apr 4, 2025

Allow 10 minutes from last push for the staging site to build. If the link doesn't work, try using incognito mode instead. For internal reviewers, check web-documentation repo actions for staging build status. Link to build for this PR: http://docs-dev.timescale.com/docs-285-docs-rfc-update-caggs-docs


In addition, materializing the most recent bucket might interfere with
[real-time aggregation][future-watermark].
and extends to the beginning or end of time. If you set `end_offset` within the current time bucket, and [real-time aggregation][future-watermark] is disabled, the current time bucket is excluded. This is to improve performance: for time-series data that mostly contains writes that occur in the time stamp order, the time buckets that see lots of writes quickly have out-of-date aggregates. You get better performance by excluding the time buckets that are getting a lot of writes.
Member

FWIW, another reason is that you cannot really refresh a partial bucket: you either compute the whole bucket or not at all.

Contributor Author

added, thank you

Contributor

@billy-the-fish billy-the-fish left a comment

I've had a go, but that sentence is still tricky.

@@ -31,15 +31,7 @@ Among others, `add_continuous_aggregate_policy` takes the following arguments:
24 hours.

If you set the `start_offset` or `end_offset` to `NULL`, the range is open-ended
Contributor

Suggested change
If you set the `start_offset` or `end_offset` to `NULL`, the range is open-ended
If you set `start_offset` or `end_offset` to `NULL`, the range is open-ended


In addition, materializing the most recent bucket might interfere with
[real-time aggregation][future-watermark].
and extends to the beginning or end of time. If you set `end_offset` within the current time bucket, and [real-time aggregation][future-watermark] is disabled, the current time bucket is excluded. This is because the current bucket is incomplete and can't be refreshed. Excluding the current bucket also improves performance: for time-series data that mostly contains writes that occur in the time stamp order, the time buckets that see lots of writes quickly have out-of-date aggregates. You get better performance by excluding the time buckets that are getting a lot of writes.
Contributor

Suggested change
and extends to the beginning or end of time. If you set `end_offset` within the current time bucket, and [real-time aggregation][future-watermark] is disabled, the current time bucket is excluded. This is because the current bucket is incomplete and can't be refreshed. Excluding the current bucket also improves performance: for time-series data that mostly contains writes that occur in the time stamp order, the time buckets that see lots of writes quickly have out-of-date aggregates. You get better performance by excluding the time buckets that are getting a lot of writes.
and extends to the beginning or end of time. If you set `end_offset` within the current time bucket while [real-time aggregation][future-watermark] is disabled, the current time bucket is excluded. This is because the current bucket is incomplete, and incomplete time buckets cannot be refreshed. Excluding the current bucket also improves performance: time-series data mostly contains writes that occur in timestamp order, so the time buckets that see lots of writes quickly have out-of-date aggregates, and you get better performance by excluding them.
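
For readers of this thread, the exclusion rule discussed above can be illustrated with a small model. This is only a sketch under stated assumptions, not TimescaleDB's implementation: it assumes the policy refreshes the window from `now - start_offset` up to `now - end_offset` and materializes only buckets that lie fully inside that window, and that buckets are aligned to the Unix epoch. The helper names `bucket_start` and `refreshed_buckets` are hypothetical, and `NULL` (open-ended) offsets are not modeled.

```python
from datetime import datetime, timedelta, timezone

# Assumption for this sketch: bucket boundaries are aligned to the Unix epoch.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def bucket_start(ts, width):
    """Floor a timestamp to the start of the bucket that contains it."""
    return ts - (ts - EPOCH) % width


def refreshed_buckets(now, start_offset, end_offset, width):
    """Return the start times of buckets lying fully inside
    [now - start_offset, now - end_offset). Partial buckets are skipped."""
    window_start = now - start_offset
    window_end = now - end_offset

    # First bucket boundary at or after the start of the window.
    first = bucket_start(window_start, width)
    if first < window_start:
        first += width

    buckets = []
    current = first
    # A bucket qualifies only if its end does not pass the window end.
    while current + width <= window_end:
        buckets.append(current)
        current += width
    return buckets


# Hourly buckets, policy runs at 08:20 with end_offset = 10 minutes.
# The window ends at 08:10, inside the current 08:00 bucket.
now = datetime(2025, 4, 15, 8, 20, tzinfo=timezone.utc)
result = refreshed_buckets(
    now,
    start_offset=timedelta(hours=3),
    end_offset=timedelta(minutes=10),
    width=timedelta(hours=1),
)
print([b.strftime("%H:%M") for b in result])  # ['06:00', '07:00']
```

In this model the 08:00 bucket is left out because the window ends at 08:10, before the bucket is complete. The 05:00 bucket is left out for the same reason at the other end of the window, which starts at 05:20.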

@atovpeko atovpeko merged commit bb0d5e2 into latest Apr 15, 2025
3 checks passed
@atovpeko atovpeko deleted the 285-docs-rfc-update-caggs-docs branch April 15, 2025 08:01