Pandas can't handle zipfile.Path objects (ValueError: Invalid file path or buffer object type: <class 'zipfile.Path'>) #49906
Comments
Pandas works with all Since the python documentation of |
Thanks for your analysis. |
Here Line 254 in 61e0db2
I'm not familiar enough with what |
That is a good question. Let me explain. Let's say you have a zip-file containing three "data" files (2x csv, 1x excel). I would like to read them like this:
I have kind of a pandas-wrapper ( |
Thank you for describing the use case! In that case, it doesn't make sense to convert it to a The reason why
with fpa.open('r') as handle:
    df = pandas.read_csv(handle)  # might need to specify mode="rb"
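For the multi-file archive described earlier in the thread, the handle-based approach can be wrapped in a small helper. This is only a sketch under stated assumptions: `read_zip_members`, `reader`, `suffix`, and `rows_from_handle` are hypothetical names, the stand-in reader uses the standard-library `csv` module so the example runs without pandas, and opening a member with `"rb"` assumes Python 3.9 or later.

```python
import csv
import io
import zipfile


def read_zip_members(zip_file, reader, suffix=".csv"):
    """Apply `reader` to an open binary handle of each matching member.

    Hypothetical helper: `reader` is any callable that accepts a binary
    file handle and returns the parsed content.
    """
    results = {}
    with zipfile.ZipFile(zip_file) as archive:
        for member in zipfile.Path(archive).iterdir():
            if member.is_file() and member.name.endswith(suffix):
                # Binary mode leaves decoding to the reader (Python 3.9+).
                with member.open("rb") as handle:
                    results[member.name] = reader(handle)
    return results


def rows_from_handle(handle):
    """Stand-in reader: decode the handle and parse it as CSV rows."""
    text = io.TextIOWrapper(handle, encoding="utf-8", newline="")
    return list(csv.reader(text))


# Build a small archive in memory so the sketch is self-contained.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("a.csv", "x,y\n1,2\n")
    archive.writestr("b.csv", "x,y\n3,4\n")
    archive.writestr("notes.txt", "not a table\n")

tables = read_zip_members(buffer, rows_from_handle)
```

Since `pandas.read_csv` accepts an open binary handle, passing it as `reader` in place of `rows_from_handle` should give one DataFrame per CSV member; an Excel member would need a different reader and suffix.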
That is exactly what I'm doing in my workaround. But other file-reading libraries that I've tested don't have problems with
I think an option to accommodate that would be:
This is reproducible in the current latest pandas release, 1.5.2.

In Python, the zipfile.Path class is intended to behave similarly (but not identically) to pathlib.Path. pandas accepts the latter but not the former.

Steps to reproduce:
1. Create a zip file foo.zip with one CSV file in it named bar.csv.
2. Create a path object: zp = zipfile.Path('foo.zip', 'bar.csv')
3. Use that object (zp) in pandas.read_csv() as the path object.

Because of this part of the code (pandas/pandas/io/common.py, lines 446 to 452 in 3b09765), Python raises "ValueError: Invalid file path or buffer object type: <class 'zipfile.Path'>".
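A standard-library-only sketch of why the two path types behave differently may help here. It assumes (this is my reading, not confirmed in this thread) that pandas recognises path objects via `os.PathLike`, which `pathlib.Path` implements and `zipfile.Path` does not. The archive and member names follow the reproduction steps; the file content and variable names are made up for illustration.

```python
import os
import pathlib
import tempfile
import zipfile

with tempfile.TemporaryDirectory() as tmp:
    # Step 1: create foo.zip containing a single bar.csv.
    archive_path = os.path.join(tmp, "foo.zip")
    with zipfile.ZipFile(archive_path, "w") as archive:
        archive.writestr("bar.csv", "a,b\n1,2\n")

    with zipfile.ZipFile(archive_path) as archive:
        # Step 2: a path object pointing at the member inside the archive.
        zp = zipfile.Path(archive, "bar.csv")
        fs_path = pathlib.Path(archive_path)

        # pathlib.Path defines __fspath__; zipfile.Path does not.
        zip_is_pathlike = isinstance(zp, os.PathLike)
        fs_is_pathlike = isinstance(fs_path, os.PathLike)

        # zipfile.Path can still be opened and read directly.
        zip_has_open = hasattr(zp, "open")
        first_line = zp.read_text(encoding="utf-8").splitlines()[0]
```

Because a zip member has no real filesystem location, there is no string path it could return from `__fspath__`, so any support would presumably have to go through its `open()` method instead.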
EDIT:
I'm aware that pandas.read_csv() offers the compression argument and can read compressed CSV files on its own. But this doesn't help in my case. I'm using pandas as a backend for a higher-level API that reads data files; pandas is just one part of it. One shortcoming of pandas here is that it cannot deal with ZIP files containing multiple CSV files.

pathlib.Path and zipfile.Path are both standard Python, and IMHO pandas should be able to deal with both.