-
-
Notifications
You must be signed in to change notification settings - Fork 18.5k
DOC: Correct docstring formatting for excel related functions GH23494 #23505
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Hello @timdef! Thanks for updating the PR.
Comment last updated on November 06, 2018 at 14:58 Hours UTC |
Two things that the validation called out that I wasn't certain about:
|
Also, still currently have these errors:
Wanted to see if I should take a stab at correcting these or if this work was more about bringing things in line format-wise with the validator. |
Codecov Report
@@ Coverage Diff @@
## master #23505 +/- ##
=======================================
Coverage 92.23% 92.23%
=======================================
Files 161 161
Lines 51197 51197
=======================================
Hits 47220 47220
Misses 3977 3977
Continue to review full report at Codecov.
|
@timdef : First off, congrats on your first PR! I think it would be a good idea to attempt to patch these. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good, added some comments, mainly about our conventions.
I think a See Also
section, pointing to to_excel
, read_csv
... would be useful too.
data will be read in as floats: Excel stores all numbers as floats | ||
internally | ||
internally. | ||
|
||
Returns | ||
------- |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the next line (github doesn't let me add a comment there, as it's not modified): DataFrame or dict of DataFrame
.
Then, in the examples, I'd remove the part that creates the Excel files. I'm sure the users will understand that they need to have as parameter a file that exists.
Just add after the lines with read_excel
: # doctest: +SKIP
so it doesn't run in the doctests (and it doesn't fail).
@@ -40,17 +40,16 @@ | |||
_writers = {} | |||
|
|||
_read_excel_doc = """ | |||
Read an Excel table into a pandas DataFrame | |||
Read an Excel table into a pandas DataFrame. | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a extended summary giving a bit more information? Questions that came to my mind users could have and would be nice to answer there are:
-Which sheet/s in the Excel file will be loaded
- Which formats/versions of Excel do we support
xls
,xlsx
? - You can mention that the file to be opened can be in the local filesystem or a url.
Anything else you thing it can be helpful for users to know about this function before reading the parameters (where these questions will be answered in many cases)
pandas/io/excel.py
Outdated
io : string, path object (pathlib.Path or py._path.local.LocalPath), | ||
file-like object, pandas ExcelFile, or xlrd workbook. | ||
io : str, path object (pathlib.Path or py._path.local.LocalPath), \ | ||
file-like object, pandas ExcelFile, or xlrd workbook |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We try to use the Python file types comma separated (I'd like to be able to parse these types). This would make more sense to me:
io : str, file descriptor, pathlib.Path, ExcelFile or xlrd.Book
In the description you can add further information if needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since parameter can be a "pathlib.Path or py._path.local.LocalPath" would it make sense to make the descriptor more general ("path")? I see that the Py library is in maintenance mode, just seeking to understand going forward!
pandas/io/excel.py
Outdated
@@ -194,7 +198,7 @@ | |||
1 string2 2 | |||
2 string3 3 | |||
|
|||
>>> pd.read_excel(open('tmp.xlsx','rb')) | |||
>>> pd.read_excel(open('tmp.xlsx', 'rb')) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd remove this example, may be you can show one with sheet_name='Sheet3'
instead?
Co-Authored-By: timdef <[email protected]>
Co-Authored-By: timdef <[email protected]>
Co-Authored-By: timdef <[email protected]>
Co-Authored-By: timdef <[email protected]>
Co-Authored-By: timdef <[email protected]>
Co-Authored-By: timdef <[email protected]>
Co-Authored-By: timdef <[email protected]>
Co-Authored-By: timdef <[email protected]>
…into excel-related-docstrings
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@timdef can you update based on the new and previous comments please?
a b | ||
0 1 2 | ||
1 #2 3 | ||
|
||
>>> pd.read_excel('tmp.xlsx', comment='#') | ||
>>> pd.read_excel('tmp.xlsx', comment='#') # doctest: +SKIP |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd remove these two last examples. It assumes that the content of tmp.xlsx
is different than before, which makes things trickier and not very easy to follow. And I don't think it's a very important use case.
|
||
Returns | ||
------- | ||
parsed : DataFrame or Dict of DataFrames | ||
parsed : DataFrame or dict of DataFrame |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
parsed : DataFrame or dict of DataFrame | |
DataFrame or dict of DataFrame |
|
||
Returns | ||
------- | ||
parsed : DataFrame or Dict of DataFrames | ||
parsed : DataFrame or dict of DataFrame | ||
DataFrame from the passed in Excel file. See notes in sheet_name |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
DataFrame from the passed in Excel file. See notes in sheet_name | |
DataFrame from the passed in Excel file. See notes in sheet_name |
can you merge master, |
@timdef do you have time to address the comments and fix the conflicts? |
closing as stale. if you'd like to continue working, pls ping. |
git diff upstream/master -u -- "*.py" | flake8 --diff
Addresses failed validations from validate_docstrings.py in pandas.read_excel
First open source pull request! Looking to improve.