release-22.1: colexechash: improve memory accounting in the hash table #86321

Merged: 1 commit merged into cockroachdb:release-22.1 from backport22.1-84229 on Aug 17, 2022

Conversation

yuzefovich (Member)

Backport 1/1 commits from #84229.
Backport 1/1 commits from #85438.

/cc @cockroachdb/release


This commit improves the memory accounting in the hash table to be more
precise in the case when the `distsql_workmem` limit is exhausted.
Previously, we would allocate large slices first only to perform the
memory accounting after the fact, possibly running out of budget, which
would result in an error being thrown. We'd end up in a situation where
the hash table is still referencing the larger newly-allocated slices
while only the previous memory usage is accounted for. This commit makes
it so that we account for the needed capacity upfront, then perform the
allocation, and then reconcile the accounting if necessary. This way
we're much more likely to encounter the budget error before making the
large allocations.
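
As a rough illustration of the account-upfront-then-reconcile pattern described above, here is a minimal Go sketch; the `budgetAccount` type and `growUint64Slice` helper are hypothetical stand-ins for illustration, not the actual colexec/colmem API:

```go
package main

import (
	"errors"
	"fmt"
)

// budgetAccount is a hypothetical stand-in for a memory account with a
// fixed budget; it is not the real colexec/colmem API.
type budgetAccount struct {
	used, limit int64
}

func (a *budgetAccount) Grow(n int64) error {
	if a.used+n > a.limit {
		return errors.New("memory budget exceeded")
	}
	a.used += n
	return nil
}

func (a *budgetAccount) Shrink(n int64) {
	a.used -= n
}

// growUint64Slice grows s to hold at least `needed` elements, registering
// the estimated footprint with acc *before* allocating and reconciling
// with the actual capacity afterwards, so a budget error fires before the
// large allocation is made.
func growUint64Slice(acc *budgetAccount, s []uint64, needed int) ([]uint64, error) {
	const sizeOfUint64 = 8
	estimated := int64(needed-cap(s)) * sizeOfUint64
	if err := acc.Grow(estimated); err != nil {
		// The budget is exhausted; importantly, the large allocation has
		// not been made yet, so s is unchanged and fully accounted for.
		return s, err
	}
	newSlice := make([]uint64, needed)
	copy(newSlice, s)
	// Reconcile: make gives exactly `needed` capacity here, but other
	// growth strategies (e.g. append) can over-allocate.
	actual := int64(cap(newSlice)-cap(s)) * sizeOfUint64
	if actual > estimated {
		if err := acc.Grow(actual - estimated); err != nil {
			return newSlice, err
		}
	} else if actual < estimated {
		acc.Shrink(estimated - actual)
	}
	return newSlice, nil
}

func main() {
	acc := &budgetAccount{limit: 1 << 10} // 1 KiB budget
	s, err := growUint64Slice(acc, nil, 64)
	fmt.Println(len(s), acc.used, err) // 64 512 <nil>
}
```

The key point is that the `Grow` call happens before the allocation, so a budget error is returned while the hash table still references only its old, fully-accounted-for slice.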

Additionally, this commit accounts for some internal slices in the hash
table used only in the hash joiner case.

This required a minor change to the way the unordered distinct spills to
disk. Previously, the memory error could only occur in two spots (and
one of those would leave the hash table in an inconsistent state, which
we were "smart" about repairing). Now, however, the memory error can
occur in more spots (leaving several different possible inconsistent
states), so this commit accepts a slight performance regression and
simply rebuilds the hash table from scratch, once, when the unordered
distinct spills to disk.
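
A rough sketch of that trade-off (all types and method names here are hypothetical stand-ins, not the real colexec operators): on a budget error, the operator discards the in-memory hash table and rebuilds it once from the input it has already buffered, rather than repairing each possible partially-built state.

```go
package main

import "fmt"

// hashTable is a hypothetical stand-in for the vectorized hash table.
type hashTable struct{ keys map[int]struct{} }

func newHashTable() *hashTable { return &hashTable{keys: map[int]struct{}{}} }

func (ht *hashTable) reset() { ht.keys = map[int]struct{}{} }

func (ht *hashTable) insert(k int) { ht.keys[k] = struct{}{} }

// diskBackedDistinct retains its input so that, on a budget error, it can
// switch to the disk-backed strategy and rebuild the hash table from
// scratch exactly once, whatever inconsistent state the error left behind.
type diskBackedDistinct struct {
	ht       *hashTable
	buffered []int
}

func (d *diskBackedDistinct) onBudgetError() {
	d.ht.reset()
	for _, k := range d.buffered {
		d.ht.insert(k)
	}
	// ...from here on, continue with the disk-backed strategy.
}

func main() {
	d := &diskBackedDistinct{ht: newHashTable(), buffered: []int{1, 2, 2, 3}}
	d.onBudgetError()
	fmt.Println(len(d.ht.keys)) // 3 distinct keys after the single rebuild
}
```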

Addresses: #64906.

Release justification: bug fix.

Release note: None

blathers-crl bot commented Aug 17, 2022

Thanks for opening a backport.

Please check the backport criteria before merging:

  • Patches should only be created for serious issues or test-only changes.
  • Patches should not break backwards-compatibility.
  • Patches should change as little code as possible.
  • Patches should not change on-disk formats or node communication protocols.
  • Patches should not add new functionality.
  • Patches must not add, edit, or otherwise modify cluster versions; or add version gates.
If some of the basic criteria cannot be satisfied, ensure that the exceptional criteria are satisfied within.
  • There is a high priority need for the functionality that cannot wait until the next release and is difficult to address in another way.
  • The new functionality is additive-only and only runs for clusters which have specifically “opted in” to it (e.g. by a cluster setting).
  • New code is protected by a conditional check that is trivial to verify and ensures that it only runs for opt-in clusters.
  • The PM and TL on the team that owns the changed code have signed off that the change obeys the above rules.

Add a brief release justification to the body of your PR to justify this backport.

Some other things to consider:

  • What did we do to ensure that a user that doesn’t know & care about this backport, has no idea that it happened?
  • Will this work in a cluster of mixed patch versions? Did we test that?
  • If a user upgrades a patch version, uses this feature, and then downgrades, what happens?

@cockroach-teamcity (Member)

This change is Reviewable

@yuzefovich (Member, Author)

Note that this PR includes a reduced version of #84229 (it doesn't eagerly release the hash table when the hash aggregator and the hash joiner spill to disk) and only includes the memory accounting fixes. It also includes the fix from #85438.

I believe this fix would partially help with https://github.com/cockroachlabs/support/issues/1762, so it's worthy of a backport.

@DrewKimball (Collaborator) left a comment

:lgtm:

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @michae2)

@yuzefovich yuzefovich merged commit 760a825 into cockroachdb:release-22.1 Aug 17, 2022
@yuzefovich yuzefovich deleted the backport22.1-84229 branch August 17, 2022 19:02