[SPARK-49133][CORE] Make member MemoryConsumer#used atomic to avoid user code causing deadlock
#51849
+11 −10
What changes were proposed in this pull request?
Turn the field MemoryConsumer#used from long to AtomicLong.

Why are the changes needed?
MemoryConsumer doesn't provide internal thread-safety, so developers have to add their own locking for concurrent memory allocation within the same task. When multiple threads allocate memory in the same task (a special case with respect to Spark's memory model), the user has to lock the API invocations of MemoryConsumer to keep it thread-safe. In that case, if one memory consumer spills another concurrently, there is a risk of an ABBA deadlock. E.g., deadlock happens at the moment thread 1 locks consumer A and then tries to acquire the TaskMemoryManager's (TMM's) lock, while thread 2, spilling from consumer B, holds the TMM lock and tries to acquire consumer A's lock.
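The ABBA scenario can be emulated with plain locks. The sketch below is purely illustrative, not Spark code: the names consumerA and tmm stand in for the user's lock on consumer A and the TaskMemoryManager's monitor. Each thread takes its first lock, waits for the other thread, then tries to take the second lock in the opposite order; tryLock with a timeout makes the stall observable instead of hanging the JVM.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Illustrative ABBA sketch: `consumerA` stands for the user's lock on
// consumer A, `tmm` for the TaskMemoryManager's monitor. Thread 1 locks
// A then wants TMM; thread 2 (spilling from B) locks TMM then wants A.
class AbbaSketch {
    static boolean demonstrate() throws InterruptedException {
        ReentrantLock consumerA = new ReentrantLock();
        ReentrantLock tmm = new ReentrantLock();
        CountDownLatch bothHoldFirst = new CountDownLatch(2);
        CountDownLatch bothTried = new CountDownLatch(2);
        boolean[] stuck = new boolean[2];

        Thread t1 = new Thread(() -> {
            consumerA.lock();                       // step 1: lock consumer A
            try {
                bothHoldFirst.countDown();
                bothHoldFirst.await();              // wait until thread 2 holds TMM
                // step 2: try to enter the TaskMemoryManager
                stuck[0] = !tmm.tryLock(200, TimeUnit.MILLISECONDS);
                if (!stuck[0]) tmm.unlock();
                bothTried.countDown();
                bothTried.await();                  // keep A held until both attempts end
            } catch (InterruptedException ignored) {
            } finally {
                consumerA.unlock();
            }
        });

        Thread t2 = new Thread(() -> {
            tmm.lock();                             // step 1: lock TMM (spilling from B)
            try {
                bothHoldFirst.countDown();
                bothHoldFirst.await();              // wait until thread 1 holds A
                // step 2: spilling consumer A requires A's lock
                stuck[1] = !consumerA.tryLock(200, TimeUnit.MILLISECONDS);
                if (!stuck[1]) consumerA.unlock();
                bothTried.countDown();
                bothTried.await();
            } catch (InterruptedException ignored) {
            } finally {
                tmm.unlock();
            }
        });

        t1.start(); t2.start();
        t1.join();  t2.join();
        return stuck[0] && stuck[1];                // both blocked on the other's lock
    }
}
```

Both tryLock calls time out because each thread's second lock is still held by the other, which is exactly the circular wait that a real deadlock consists of.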
To fix this problem, Spark could ensure MemoryConsumer's thread-safety with an atomic MemoryConsumer#used, so users don't have to add a lock in most cases.

Does this PR introduce any user-facing change?
A developer-facing change: the field MemoryConsumer#used will change from long to AtomicLong. To address this, developers who need to read the value of used could call getUsed() instead, which works across all Spark versions, without having to maintain a shim layer for this change.

How was this patch tested?
No new test is needed in Spark itself, but it would be fine to add a case emulating a developer's calls if preferred.
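For illustration, such a case might look like the sketch below. SketchConsumer and its method names are hypothetical, standing in for a MemoryConsumer subclass; it only shows why AtomicLong bookkeeping is safe under concurrent calls without any external lock.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for a MemoryConsumer subclass (illustrative
// names, not Spark's actual API): `used` is tracked with an AtomicLong
// so concurrent bookkeeping needs no external lock on the consumer.
class SketchConsumer {
    private final AtomicLong used = new AtomicLong(0L);

    void acquired(long size) { used.addAndGet(size); }   // bookkeeping on allocation
    void released(long size) { used.addAndGet(-size); }  // bookkeeping on free/spill
    long getUsed() { return used.get(); }                // thread-safe read
}

class SketchConsumerDemo {
    // Two threads each record 1000 one-byte allocations. With a plain
    // long field, increments could be lost to races; with AtomicLong
    // the final count is always exact.
    static long run() throws InterruptedException {
        SketchConsumer c = new SketchConsumer();
        Runnable work = () -> { for (int i = 0; i < 1000; i++) c.acquired(1); };
        Thread t1 = new Thread(work), t2 = new Thread(work);
        t1.start(); t2.start();
        t1.join();  t2.join();
        return c.getUsed();
    }
}
```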
Was this patch authored or co-authored using generative AI tooling?
No.