colexec: improve handling of the metadata #85285

Merged
merged 2 commits on Jul 29, 2022

Conversation

yuzefovich
Member

@yuzefovich yuzefovich commented Jul 28, 2022

colmem: introduce a helper to release all reservations from allocator

Release note: None

colexec: improve handling of the metadata

This commit audits all of the places where we're operating on the
producer metadata and improves things a bit. This was prompted by the
fact that some types of metadata (in particular, the
LeafTxnFinalState) can have a non-trivial footprint, so the sooner we
lose the reference to the metadata objects, the more stable CRDB will
be.

This commit adds memory accounting for the LeafTxnFinalState
metadata objects in most (all?) places that buffer metadata (inbox,
columnarizer, materializer, parallel unordered synchronizer). The
reservations are released whenever the buffered component is drained;
however, the drainer - which takes over the metadata - might not
perform the accounting. The difficulty there is that the lifecycle of
the metadata objects is not entirely clear: it's possible that we're
at the end of the execution and the whole plan is being drained - in
such a scenario, the metadata is pushed into the DistSQLReceiver, then
imported into the root txn and discarded (modulo the increased root
txn footprint); it's also possible that the metadata from the drained
component gets buffered somewhere up the tree. Still, this commit adds
the accounting in most such places.

Addresses: #64906.
Addresses: #81451.

Release note: None
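
For illustration, a minimal, self-contained Go sketch of the buffer-then-release pattern described above. This is not CRDB code: memAccount, producerMetadata, and bufferingOp are hypothetical stand-ins for colmem.Allocator, the producer metadata object, and a buffering component such as the inbox or the parallel unordered synchronizer.

	package main

	import "fmt"

	type memAccount struct{ used int64 }

	func (a *memAccount) Grow(n int64)          { a.used += n }
	func (a *memAccount) Used() int64           { return a.used }
	func (a *memAccount) ReleaseMemory(n int64) { a.used -= n }

	// producerMetadata stands in for the real metadata object; per the commit
	// message, only the LeafTxnFinalState part has a non-trivial footprint.
	type producerMetadata struct{ leafTxnFinalStateSize int64 }

	type bufferingOp struct {
		acc      *memAccount
		buffered []producerMetadata
	}

	// bufferMeta accounts for the metadata's footprint as soon as it is buffered.
	func (o *bufferingOp) bufferMeta(m producerMetadata) {
		o.acc.Grow(m.leafTxnFinalStateSize)
		o.buffered = append(o.buffered, m)
	}

	// drainMeta hands the buffered metadata to the drainer and releases the
	// whole reservation, mirroring the defer in Inbox.DrainMeta discussed below.
	func (o *bufferingOp) drainMeta() []producerMetadata {
		defer o.acc.ReleaseMemory(o.acc.Used())
		meta := o.buffered
		o.buffered = nil
		return meta
	}

	func main() {
		op := &bufferingOp{acc: &memAccount{}}
		op.bufferMeta(producerMetadata{leafTxnFinalStateSize: 1 << 20})
		fmt.Println(op.acc.Used()) // 1048576
		_ = op.drainMeta()
		fmt.Println(op.acc.Used()) // 0
	}

Once drainMeta returns, responsibility for the metadata's footprint shifts to whichever component buffers it next, which is the part the commit message notes is only partially covered.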

@yuzefovich yuzefovich requested review from michae2, DrewKimball and a team July 28, 2022 22:59
@cockroach-teamcity
Member

This change is Reviewable

Collaborator

@DrewKimball DrewKimball left a comment

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @michae2 and @yuzefovich)


pkg/sql/colflow/colrpc/inbox.go line 502 at r1 (raw file):

	// as well as the metadata), and when this function returns, we no longer
	// reference any of those, so we can release all of the allocations.
	defer i.allocator.ReleaseMemory(i.allocator.Used())

This will release memory allocated earlier in Next - is this ok? Alternatively, should we be releasing memory elsewhere as well, or does it not matter?

Member Author

@yuzefovich yuzefovich left a comment

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @DrewKimball and @michae2)


pkg/sql/colflow/colrpc/inbox.go line 502 at r1 (raw file):

Previously, DrewKimball (Drew Kimball) wrote…

This will release memory allocated earlier in Next - is this ok? Alternatively, should we be releasing memory elsewhere as well, or does it not matter?

Yeah, that's the intention - the lifecycle of the Inbox is such that many calls of Next are followed by a single call to DrainMeta at which point the inbox is no longer used. At the end of DrainMeta the Inbox loses references to all objects it accounted for, so it can release all allocations.

I wonder whether your questions partially come from the comments on the DrainMeta that are now somewhat stale and misleading - Next and DrainMeta are expected to be called only from the same goroutine, so they won't ever be called "concurrently". We have two sources of concurrency - hash routers and parallel unordered synchronizers, and I think in the original implementation of the parallel unordered synchronizer Next was called from each input ("child") goroutine with DrainMeta called from the coordinator ("parent") goroutine, so we needed some protection. Since then we have changed the implementation so that each input goroutine calls both Next and DrainMeta on its input subtree. (In the case of the hash router I think we always called both functions from the main "coordinator" goroutine.)
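
For illustration, a rough sketch of that single-goroutine calling contract. The simpleSource interface below is a simplified, hypothetical stand-in, not the actual colexecop interface.

	package main

	import "fmt"

	// simpleSource is a hypothetical, simplified stand-in for a source such
	// as the Inbox.
	type simpleSource interface {
		Next() int           // returns the batch length; 0 means the source is exhausted
		DrainMeta() []string // returns the buffered metadata
	}

	// drainAfterExhaustion shows the contract: Next is called repeatedly from
	// one goroutine until the source is exhausted, then DrainMeta is called
	// exactly once on that same goroutine, after which the source is no
	// longer used.
	func drainAfterExhaustion(s simpleSource) []string {
		for s.Next() > 0 {
			// Process the batch.
		}
		return s.DrainMeta()
	}

	type fakeSource struct{ batches int }

	func (f *fakeSource) Next() int {
		if f.batches == 0 {
			return 0
		}
		f.batches--
		return 1024
	}

	func (f *fakeSource) DrainMeta() []string { return []string{"LeafTxnFinalState"} }

	func main() {
		fmt.Println(drainAfterExhaustion(&fakeSource{batches: 3})) // [LeafTxnFinalState]
	}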

Collaborator

@DrewKimball DrewKimball left a comment

:lgtm:

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @michae2 and @yuzefovich)


pkg/sql/colflow/colrpc/inbox.go line 502 at r1 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

Yeah, that's the intention - the lifecycle of the Inbox is such that many calls of Next are followed by a single call to DrainMeta at which point the inbox is no longer used. At the end of DrainMeta the Inbox loses references to all objects it accounted for, so it can release all allocations.

I wonder whether your questions partially come from the comments on the DrainMeta that are now somewhat stale and misleading - Next and DrainMeta are expected to be called only from the same goroutine, so they won't ever be called "concurrently". We have two sources of concurrency - hash routers and parallel unordered synchronizers, and I think in the original implementation of the parallel unordered synchronizer Next was called from each input ("child") goroutine with DrainMeta called from the coordinator ("parent") goroutine, so we needed some protection. Since then we have changed the implementation so that each input goroutine calls both Next and DrainMeta on its input subtree. (In the case of the hash router I think we always called both functions from the main "coordinator" goroutine.)

I guess I just wasn't sure whether DrainMeta is always called at the end, but you've answered that question.

Member Author

@yuzefovich yuzefovich left a comment

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @DrewKimball and @michae2)


pkg/sql/colflow/colrpc/inbox.go line 502 at r1 (raw file):

Previously, DrewKimball (Drew Kimball) wrote…

I guess I just wasn't sure whether DrainMeta is always called at the end, but you've answered that question.

(This comment was written before your reply.)

Note that it is definitely the case that once Inbox.DrainMeta returns, the memory usage of the metadata is not accounted for because no user of the Inbox does the accounting of it. I alluded to the difficulty there in the commit message - the Inbox doesn't know how long that metadata will be alive - it could be processed right away, or it could be buffered somewhere up the tree. It is still an improvement that we're now accounting for the metadata at least throughout the lifetime of the Inbox.

Typing this comment out made me realize that we can perform the accounting for the metadata in most (all?) operators that buffer the metadata, so I just added that. PTAL.

Collaborator

@DrewKimball DrewKimball left a comment

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @DrewKimball, @michae2, and @yuzefovich)


pkg/sql/colexec/colbuilder/execplan.go line 73 at r2 (raw file):

	inputs []colexecargs.OpWithMetaInfo,
	inputTypes [][]*types.T,
	testStreamingMemAcc *mon.BoundAccount,

[nit] could these just be combined, so in the testing case streamingMemAccFactory would just return testStreamingMemAcc?

Member Author

@yuzefovich yuzefovich left a comment

TFTR!

bors r+

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @DrewKimball and @michae2)


pkg/sql/colexec/colbuilder/execplan.go line 73 at r2 (raw file):

Previously, DrewKimball (Drew Kimball) wrote…

[nit] could these just be combined, so in the testing case streamingMemAccFactory would just return testStreamingMemAcc?

Good point, done.
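
For illustration only, a sketch of the kind of consolidation suggested in the nit: a single factory argument that, in tests, simply returns the fixed test account. The names and signatures below are hypothetical and do not reflect the actual execplan.go change.

	package main

	import "fmt"

	// boundAccount is a stand-in for mon.BoundAccount.
	type boundAccount struct{ name string }

	// buildOp sketches a constructor that takes one streamingMemAccFactory
	// instead of a separate, test-only account parameter.
	func buildOp(streamingMemAccFactory func() *boundAccount) {
		acc := streamingMemAccFactory()
		fmt.Println("using account:", acc.name)
	}

	func main() {
		// Production-style usage: the factory mints a fresh account per call.
		n := 0
		buildOp(func() *boundAccount {
			n++
			return &boundAccount{name: fmt.Sprintf("streaming-%d", n)}
		})

		// Test-style usage: the factory always returns the shared test
		// account, which is the consolidation suggested above.
		testStreamingMemAcc := &boundAccount{name: "test-streaming"}
		buildOp(func() *boundAccount { return testStreamingMemAcc })
	}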

@craig
Contributor

craig bot commented Jul 29, 2022

Build succeeded:
