gh-110481: Implement _Py_DECREF_NO_DEALLOC for free-threaded build #111560
Py_ssize_t refcount = _Py_atomic_load_ssize_relaxed(&op->ob_ref_shared);
Py_ssize_t new_shared;
// Shared refcount can be zero but we should consider local refcount.
int should_queue = (refcount == 0 || refcount == _Py_REF_MAYBE_WEAKREF);
@colesbury Out of curiosity, don't we have to consider that shared_refcount can be negative at this point due to imbalanced refcounting?
Same question for line 319 in 2445673:
should_queue = (shared == 0 || shared == _Py_REF_MAYBE_WEAKREF);
I'm not sure I fully understand your question. Yes, shared_refcount may be negative. should_queue is always false if shared is negative because:
- we only queue objects once
- we queue them the first time the shared refcount becomes negative
So if it's already negative then we must have already queued it.
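To make the "queue only once" argument concrete, here is a minimal single-threaded sketch. It is a simplified model, not the CPython implementation: the names `queue_model`, `QM_SHIFT`, `QM_MAYBE_WEAKREF`, `QM_QUEUED` and `qm_decref_shared` are hypothetical, the packing of flag bits below the count is an assumption of the sketch, and the real code performs the update with an atomic compare-exchange loop.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model constants: the low two bits of the packed field are
 * state flags, everything above them is a signed reference count. */
enum { QM_SHIFT = 2, QM_MAYBE_WEAKREF = 1, QM_QUEUED = 2 };

typedef struct {
    ptrdiff_t shared;   /* packed value: count in the high bits, flags below */
    int times_queued;   /* how many times the object was handed to its owner */
} queue_model;

/* Decrement performed by a non-owning thread (no atomics in this model).
 * Returns 1 if this call queued the object for the owning thread. */
static int qm_decref_shared(queue_model *op)
{
    ptrdiff_t shared = op->shared;
    int should_queue = (shared == 0 || shared == QM_MAYBE_WEAKREF);
    if (should_queue) {
        /* Count is zero and the object was never queued: mark it queued.
         * In this model only the flag changes on this path. */
        op->shared = QM_QUEUED;
        op->times_queued++;
    }
    else {
        /* Any other state, including an already negative count, is a plain
         * subtraction. A negative packed value can never compare equal to
         * 0 or QM_MAYBE_WEAKREF, so the queue branch is not taken again. */
        op->shared = shared - ((ptrdiff_t)1 << QM_SHIFT);
    }
    return should_queue;
}
```

Starting from a packed value of 0, the first call queues the object; every later call only subtracts, so the value goes negative while `times_queued` stays at 1.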
So if it's already negative then we must have already queued it.
Makes sense, thank you for explaining.
assert(refcount != 0);
refcount--;
_Py_atomic_store_uint32_relaxed(&op->ob_ref_local, refcount);
if (refcount == 0) {
Similar question: do we have to handle the zero local refcount case in _Py_DECREF_NO_DEALLOC, or can it be handled by deferred merging somewhere else?
Yes, you need _Py_MergeZeroLocalRefcount. You can't defer it -- that would break a bunch of invariants. For example, the same thread may try calling Py_DECREF again, leading to a negative local refcount.
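A rough sketch of why the merge has to happen right away. This is a toy model under stated assumptions, not the CPython code: `merge_model`, `mm_merge_zero_local` and `mm_decref` are hypothetical names, the shared count is a plain integer, and ownership is a simple flag. The idea it illustrates is that once the local count hits zero, the object stops using the local field, so a later decrement from the same thread goes to the shared count instead of pushing the unsigned local count below zero.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t local;     /* references held by the owning thread */
    ptrdiff_t shared;   /* references held elsewhere (plain count here) */
    int owned;          /* 1 while the local field is still in use */
    int deallocated;    /* set when the model would free the object */
} merge_model;

/* Hypothetical stand-in for merging a zero local refcount: give up
 * ownership so that all later decrements use the shared count. */
static void mm_merge_zero_local(merge_model *op)
{
    op->owned = 0;
    if (op->shared == 0) {
        op->deallocated = 1;
    }
}

/* Decrement as seen from the thread that created the object. */
static void mm_decref(merge_model *op)
{
    if (op->owned) {
        assert(op->local != 0);
        op->local--;
        if (op->local == 0) {
            /* Merge immediately. If this were deferred, the next call
             * would enter this branch with local == 0 and underflow. */
            mm_merge_zero_local(op);
        }
    }
    else {
        op->shared--;
        if (op->shared == 0) {
            op->deallocated = 1;
        }
    }
}
```

With `local == 1` and `shared == 1`, the first `mm_decref` merges and the second one drops the shared count to zero, without ever touching `local` again.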
I'm not sure this change will actually improve performance. In the default build, _Py_DECREF_NO_DEALLOC can avoid a comparison and avoids emitting a function call. With Py_NOGIL, we still need the comparison and call to _Py_MergeZeroLocalRefcount.
In the end, I think the Py_NOGIL version of _Py_DECREF_NO_DEALLOC may look exactly like Py_DECREF.
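For contrast, the saving in the default build can be sketched with a single plain counter. This is illustrative only: `plain_object`, `plain_decref` and `plain_decref_no_dealloc` are hypothetical names, and the sketch ignores details of the real macros such as immortal objects.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    ptrdiff_t refcnt;
    int deallocated;    /* set when the model would free the object */
} plain_object;

static void plain_dealloc(plain_object *op)
{
    op->deallocated = 1;
}

/* Regular decrement: one comparison plus a possible function call. */
static void plain_decref(plain_object *op)
{
    if (--op->refcnt == 0) {
        plain_dealloc(op);
    }
}

/* No-dealloc variant: the caller guarantees another reference remains,
 * so release builds need neither the branch nor the call. The assert
 * is compiled out when NDEBUG is defined. */
static void plain_decref_no_dealloc(plain_object *op)
{
    assert(op->refcnt > 1);
    op->refcnt--;
}
```

With a split local/shared count there is no equivalent shortcut, because the local count can reach zero while shared references remain, which is the case the merge call exists for.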
I totally agree once we consider the edge cases. I will close the PR, and if there is a way to improve the performance, I will re-investigate it.
I simply implemented _Py_DECREF_NO_DEALLOC for the free-threaded build. I assumed the refcount will not be zero even after calling _Py_DECREF_NO_DEALLOC. Please let me know if I missed something.
--disable-gil builds #110481