pd.Series.reindex is not thread safe. #25870
Comments
Virtually no pandas functions are thread-safe, because .copy() is not; see #2728 |
Not very satisfactory, especially for non-mutating operations. Since the bug you referenced is still open, could we keep this one open? |
So we will have one more issue; what's the purpose? This is a duplicate issue. |
- It's not entirely obvious that it is the same bug. In the bug you referenced,
the issue is that .copy is not safe because someone might be writing to
the data during the copy (which I would not have assumed to be thread-safe). In
my bug, no one is writing to the data, and it's still not safe. I would have
assumed that operations which return new objects, and do not obviously change
the original, are safe unless otherwise noted. They're not. Is .copy the
cause? Maybe. I don't see why, though.
- When a bug is closed, the failure cases should become unit tests to prevent
regression.
|
You are welcome to submit a PR if you want to provide a test. This is a duplicate of an unfixed issue. We have 2900 issues and would welcome help here. Sure, reporting bugs is great, but pandas is all-volunteer for anything else. |
Would love to.
If I had any idea why reading from an object that no one is writing to
would not be safe, I'd think about fixing it.
… |
Code Sample, a copy-pastable example if possible
Problem description
pd.Series.reindex fails in a multi-threaded application.
This is a little surprising since I'm not asking for any writes.
The error also seems bogus: 'cannot reindex from a duplicate axis'. The series does not have a duplicate axis, and I was able to call
s.reindex(idx)
in the main thread before the same call failed in the pool's thread.
Expected Output
Program should output nothing.
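The original code sample did not survive in this copy of the issue. The pattern described above (one successful s.reindex(idx) call in the main thread, then the same call from pool threads on a shared series with nobody writing to it) might look roughly like the sketch below. This is not the reporter's code: the function name reindex_in_threads, the series contents, the index, and the pool sizes are all assumptions chosen for illustration, and the sketch is not guaranteed to trigger the error on any particular pandas version.

```python
from concurrent.futures import ThreadPoolExecutor

import pandas as pd


def reindex_in_threads(n_tasks=32, n_workers=4, size=1000):
    """Hypothetical sketch: call s.reindex(idx) concurrently on a shared Series.

    Returns the messages of any exceptions raised in the pool threads;
    an empty list means every call succeeded.
    """
    # A series with a unique integer index (assumed shape, not from the report).
    s = pd.Series(range(size), index=range(size))
    idx = list(range(0, size, 2))

    # The same call is made once in the main thread first, as in the report.
    s.reindex(idx)

    def task(_):
        # Read-only use of the shared series; nothing mutates s or idx here.
        try:
            s.reindex(idx)
            return None
        except Exception as exc:  # record any failure instead of crashing
            return str(exc)

    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        results = list(pool.map(task, range(n_tasks)))
    return [msg for msg in results if msg is not None]


if __name__ == "__main__":
    for message in reindex_in_threads():
        print(message)
```

Run as a script, the sketch prints nothing when every call succeeds, which matches the expected output stated above; any collected message would correspond to the kind of failure reported here.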
Output of pd.show_versions()