Fix for issue #21150, using a simple lock to prevent an issue with multiple threads accessing an Index #22006
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -6,6 +6,7 @@ dependencies: | |
- beautifulsoup4 | ||
- bottleneck | ||
- dateutil | ||
- futures | ||
- gcsfs | ||
- html5lib | ||
- jinja2=2.8 | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -8,6 +8,7 @@ dependencies: | |
- cython=0.28.2 | ||
- fastparquet | ||
- feather-format | ||
- futures | ||
- gcsfs | ||
- html5lib | ||
- ipython | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -23,6 +23,18 @@ from pandas._libs import algos, hashtable as _hash | |
from pandas._libs.tslibs import Timestamp, Timedelta, period as periodlib | ||
from pandas._libs.missing import checknull | ||
|
||
# Python 2 vs Python 3 | ||
try: | ||
from thread import allocate_lock as _thread_allocate_lock | ||
except ImportError: | ||
try: | ||
from _thread import allocate_lock as _thread_allocate_lock | ||
except ImportError: | ||
try: | ||
from dummy_thread import allocate_lock as _thread_allocate_lock | ||
except ImportError: | ||
from _dummy_thread import allocate_lock as _thread_allocate_lock | ||
|
||
Is it possible to move this into

why not just import it from tslibs.strptime where it is already?

In that case, we should move that import code to

Can you address this? We also recently fixed all our bare excepts.

@TomAugspurger fixed the bare excepts (had a look at recent commits and saw one where many bare excepts were fixed, kudos!!!). As for the part about moving to

Pushing a change with the fix for bare excepts. But let me know if I should update where it is imported. Thanks

That sounds fine, I didn't read through all the comments :)
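For readers following the thread, the nested try/except chain in the hunk above can be sketched as a single loop over candidate module names. This is only an illustration in plain Python, not the code the PR merged: `load_allocate_lock` is a hypothetical helper name, and the module list simply mirrors the four imports shown in the diff.

```python
import importlib


def load_allocate_lock(candidates=("thread", "_thread",
                                   "dummy_thread", "_dummy_thread")):
    """Return the first available allocate_lock function.

    Hypothetical helper: tries each module name in order and skips the
    ones that cannot be imported, catching ImportError explicitly
    rather than using a bare except.
    """
    for name in candidates:
        try:
            module = importlib.import_module(name)
        except ImportError:
            # Module does not exist on this interpreter; try the next one.
            continue
        return module.allocate_lock
    raise ImportError("no thread module provides allocate_lock")


# Usage: obtain the factory once, then create a module-level lock.
_thread_allocate_lock = load_allocate_lock()
_example_lock = _thread_allocate_lock()
```

Catching only `ImportError` keeps unrelated failures visible, which is the point of the bare-except remark in the review.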
||
cdef int64_t iNaT = util.get_nat() | ||
|
||
|
||
|
@@ -53,6 +65,9 @@ def get_value_box(arr: ndarray, loc: object) -> object: | |
# Don't populate hash tables in monotonic indexes larger than this | ||
_SIZE_CUTOFF = 1000000 | ||
|
||
# Used in _ensure_mapping_populated to ensure is_unique behaves correctly | ||
# in multi-threaded code, see gh-21150 | ||
_mapping_populated_lock = _thread_allocate_lock() | ||
|
||
cdef class IndexEngine: | ||
|
||
|
@@ -236,17 +251,17 @@ cdef class IndexEngine: | |
|
||
cdef inline _ensure_mapping_populated(self): | ||
# this populates the mapping | ||
# if its not already populated | ||
# if it is not already populated | ||
# also satisfies the need_unique_check | ||
|
||
if not self.is_mapping_populated: | ||
|
||
values = self._get_index_values() | ||
self.mapping = self._make_hash_table(len(values)) | ||
self._call_map_locations(values) | ||
with _mapping_populated_lock: | ||
if not self.is_mapping_populated: | ||
values = self._get_index_values() | ||
self.mapping = self._make_hash_table(len(values)) | ||
self._call_map_locations(values) | ||
|
||
if len(self.mapping) == len(values): | ||
self.unique = 1 | ||
if len(self.mapping) == len(values): | ||
self.unique = 1 | ||
|
||
self.need_unique_check = 0 | ||
|
||
|
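The change to `_ensure_mapping_populated` follows a double-checked locking pattern: test the flag, take the lock, test the flag again, and only then build the mapping. The sketch below shows that pattern in plain Python under stated assumptions. `LazyEngine`, `populate_count`, and `check_concurrently` are hypothetical names invented for illustration, a `dict` stands in for the hash table, and none of this is the actual Cython `IndexEngine`.

```python
import threading

# One module-level lock shared by all engines, as in the diff above.
_populate_lock = threading.Lock()


class LazyEngine:
    """Hypothetical stand-in for an engine with a lazily built mapping."""

    def __init__(self, values):
        self.values = list(values)
        self.mapping = None
        self.is_mapping_populated = False
        self.unique = False
        self.populate_count = 0  # illustration only: counts rebuilds

    def _ensure_mapping_populated(self):
        # First check without the lock keeps the common path cheap.
        if not self.is_mapping_populated:
            with _populate_lock:
                # Second check: another thread may have finished
                # populating while this one waited for the lock.
                if not self.is_mapping_populated:
                    mapping = {}
                    for pos, value in enumerate(self.values):
                        mapping[value] = pos
                    self.mapping = mapping
                    self.unique = len(mapping) == len(self.values)
                    self.populate_count += 1
                    # Set the flag last so readers never see a
                    # half-built state.
                    self.is_mapping_populated = True

    @property
    def is_unique(self):
        self._ensure_mapping_populated()
        return self.unique


def check_concurrently(engine, n_threads=8):
    """Read is_unique from several threads and collect the answers."""
    results = []

    def worker():
        results.append(engine.is_unique)

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

In this sketch every thread should report the same `is_unique` value and the mapping should be built once. A single module-level lock is simple but serializes population across all engines; a per-instance lock would be an alternative design that the PR does not use.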