DOC-734 | AQL optimization: COLLECT ... AGGREGATE can utilize persistent index #732

Simran-B · 2025-07-01T15:43:21Z

Description

index collect aggregations arangodb#21617

TODO:

Is the performance benefit higher the fewer distinct values there are?
Is the optimization skipped if there are too many different values (low selectivity)?
Other limitations, like sparse indexes or pre/post sort?

Upstream PRs

3.10:
3.11:
3.12:
3.13:

arangodb-docs-automation · 2025-07-01T15:43:24Z

Deploy Preview Available Via
https://deploy-preview-732--docs-hugo.netlify.app

jvolmer · 2025-07-09T08:47:32Z

To your questions:

Is the performance benefit higher the fewer distinct values there are?

We saw a speedup of roughly a factor of two for different numbers of values (n) when the number of distinct values (k) is low. The gain stays constant over a wide range of k, decreases as k approaches n, and is zero when k = n. In a cluster, the gain starts to decrease at lower k than on a single server. (This behavior can be seen in the diagrams in arangodb/arangodb#21617.)

Is the optimization skipped if there are too many different values (low selectivity)?

No, this optimization is not skipped based on the selectivity value, unlike use-index-for-collect when it is applied to a COLLECT without an aggregation.

Other limitations, like sparse indexes or pre/post sort?

Yes, the optimization does not support sparse indexes, aggregation expressions that use variables other than the document variable, or aggregation expressions without an in-variable. I'm not sure what you mean by pre/post sort.
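
To make the sparse index limitation more concrete, here is a minimal conceptual sketch in Python. It is not ArangoDB's implementation: the documents, the attribute names `status` and `amount`, and the helper functions are all hypothetical, and it assumes an index that contains both the grouping attribute and the aggregated attribute. It models the index as a sorted list of entries and shows that aggregating from those entries gives the same result as reading the full documents, while a sparse index (which omits entries with a null indexed attribute) would lose a group.

```python
from collections import defaultdict
from itertools import groupby

# Hypothetical documents: "status" is the group key, "amount" is aggregated.
docs = [
    {"status": "open", "amount": 5},
    {"status": "closed", "amount": 7},
    {"status": "open", "amount": 3},
    {"status": None, "amount": 4},
    {"status": "closed", "amount": 1},
]


def aggregate_from_documents(documents):
    """Baseline: read every full document and sum per group in a hash table."""
    totals = defaultdict(int)
    for doc in documents:
        totals[doc["status"]] += doc["amount"]
    return dict(totals)


def build_index(documents, sparse=False):
    """Model an index over (status, amount) as a sorted list of tuples.

    Assumption for this sketch: a sparse index leaves out every entry in
    which an indexed attribute is null.
    """
    entries = [(d["status"], d["amount"]) for d in documents]
    if sparse:
        entries = [e for e in entries if all(v is not None for v in e)]
    # Sort null keys first so that mixed None/str keys stay comparable.
    return sorted(entries, key=lambda e: (e[0] is not None, e[0] or "", e[1] or 0))


def aggregate_from_index(index):
    """Stream over the sorted entries and sum each run of equal keys."""
    return {
        key: sum(amount for _, amount in group)
        for key, group in groupby(index, key=lambda e: e[0])
    }


from_docs = aggregate_from_documents(docs)
from_index = aggregate_from_index(build_index(docs))
from_sparse = aggregate_from_index(build_index(docs, sparse=True))

print(from_index == from_docs)   # a full index yields the same groups and sums
print(None in from_sparse)       # a sparse index has no entry for the null group
```

In this toy model, the sparse index cannot produce the group for the null key, which is one plausible reason why such indexes are not eligible. The actual reasoning in the server may differ.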

jvolmer · 2025-07-09T07:55:41Z

site/content/3.12/release-notes/version-3.12/whats-new-in-3-12.md

+Reading the data from the index instead of the stored documents for aggregations
+can significantly increase the perform if the there are few different values.


Suggested change

-Reading the data from the index instead of the stored documents for aggregations
-can significantly increase the perform if the there are few different values.
+Reading the data from the index instead of the stored documents for aggregations
+can increase the performance by a factor of two.

AQL optimization: COLLECT ... AGGREGATE can utilize persistent index

2035995

Simran-B self-assigned this Jul 1, 2025

Simran-B added this to the 3.12.5 milestone Jul 1, 2025

cla-bot bot added the cla-signed label Jul 1, 2025

Simran-B changed the title ~~AQL optimization: COLLECT ... AGGREGATE can utilize persistent index~~ DOC-734 | AQL optimization: COLLECT ... AGGREGATE can utilize persistent index Jul 1, 2025

Simran-B requested a review from jvolmer July 1, 2025 16:10

jvolmer reviewed Jul 9, 2025
