ENH: .read_pickle(...) from zip containing hidden OS X/macOS metadata files/folders #37101
I cannot comment on whether this PR is 'desirable' :) I think it should also work for
I wouldn't want to argue particularly hard for it myself! I guess it depends how prevalent this "problem" is, but I won't be too upset if this doesn't make it in 🙃
It does indeed work with the Python engine, the tests in I'll hold off until there's been a discussion about whether this is desirable (either here or in the issue #37098).
I wouldn't wait for my PR (I'm not sure how long it will take me to debug two failing tests on Windows). I will ping you when/if the PR gets merged. Including a test for the Python engine would be good.
Is this only an issue on read? Not write?
pandas/tests/io/test_pickle.py
    ) as p2, tm.ensure_clean(dummy) as d:
        df = tm.makeDataFrame()

        # write to uncompressed file
You can remove all of these comments
Really? I'm just following the style of other tests in this file. I've made the comments more succinct but wouldn't want to remove them altogether as this is a bit of a weird edge-case to test.
I think it's only a problem when the zip has been made inside the macOS file manager, rather than, say, at the command line (though I do not have a Mac to test this). As I said before, maybe this is a non-issue.
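For anyone else without a Mac, the situation can be approximated with the standard library alone. This is only a sketch: the entry names below (`__MACOSX/._data.pickle` and `.DS_Store`) are an assumption about what the macOS file manager adds, not something captured from a real archive, and `data.pickle` is a placeholder payload name.

```python
import io
import pickle
import zipfile

# Build an in-memory archive that imitates the assumed macOS layout.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, mode="w") as archive:
    archive.writestr("data.pickle", pickle.dumps({"a": [1, 2, 3]}))
    archive.writestr("__MACOSX/._data.pickle", b"")  # assumed metadata entry
    archive.writestr(".DS_Store", b"")  # assumed metadata entry

# Reopen the archive for reading and list what a reader would see.
with zipfile.ZipFile(buffer, mode="r") as archive:
    names = archive.namelist()

# Any reader that expects exactly one member now sees three.
print(names)
```

A zip built at the command line from the single pickle file would list only `data.pickle`, which is why the two cases behave differently.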
This seems to be the case, thanks @twoertwein!
Happy to close this if there's no interest; will re-ping the initial reviewers just in case.
I've tried, but have been unable to form an opinion
a more generic approach might be to use
Technically this can already be achieved with:

    from zipfile import ZipFile

    import pandas as pd

    # Open the archive, then hand the named member to read_pickle as a file object.
    with ZipFile("test.zip", mode="r") as archive:
        with archive.open("data.pickle", mode="r") as member:
            df = pd.read_pickle(member)
black pandas
git diff upstream/master -u -- "*.py" | flake8 --diff
This PR allows for .zip files created by OS X and macOS that contain __MACOSX and .DS_STORE metadata folders to be loaded by read_pickle/read_csv/read_table/read_json without error.
It does not work with pd.read_csv(...) as in that case the compression is handled in the C code. If this enhancement is deemed desirable then I'm willing to have a go at writing it (and the test is already written). (No longer true as of #36997)
Other similar folders may exist from other operating systems, in which case the list could be pulled out as a constant which could be used in the tests and in the module itself.
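A rough sketch of what such a constant and filter could look like, for discussion only. The names `HIDDEN_METADATA_NAMES` and `visible_members` are hypothetical and are not taken from the diff; the case-insensitive comparison is likewise an assumption, made so that `.DS_Store` and `.DS_STORE` are treated alike.

```python
# Hypothetical constant; the two names come from the PR description.
HIDDEN_METADATA_NAMES = ("__MACOSX", ".DS_STORE")


def visible_members(names):
    """Return the archive members whose path has no hidden-metadata component."""
    hidden = {name.upper() for name in HIDDEN_METADATA_NAMES}
    return [
        name
        for name in names
        # Zip member paths always use "/" as the separator.
        if not any(part.upper() in hidden for part in name.split("/"))
    ]


# Member names as they might appear in an archive made on macOS (assumed).
names = ["data.pickle", "__MACOSX/", "__MACOSX/._data.pickle", ".DS_Store"]
remaining = visible_members(names)
print(remaining)
```

With the metadata entries filtered out, a check for exactly one remaining member can proceed as before, and folders from other operating systems would only need a new entry in the tuple.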