DOC: Correct docstring formatting for excel related functions GH23494 #23505

timdef · 2018-11-05T03:38:05Z

closes DOC: Fix docstrings of Excel related functions #23494
tests added / passed
passes git diff upstream/master -u -- "*.py" | flake8 --diff
whatsnew entry

Addresses failed validations from validate_docstrings.py in pandas.read_excel

First open source pull request! Looking to improve.

pep8speaks · 2018-11-05T03:38:07Z

Hello @timdef! Thanks for updating the PR.

In the file pandas/io/excel.py, following are the PEP8 issues :

Line 200:80: E501 line too long (83 > 79 characters)

Comment last updated on November 06, 2018 at 14:58 Hours UTC

timdef · 2018-11-05T03:46:18Z

Two things that the validation called out that I wasn't certain about:

Sometimes the parameter description ends with a bulleted list. Validation wants a '.' on the last item of that list, even if the other items don't end in periods.
Deprecated parameters often don't currently have descriptions, only deprecation warnings. I had to put a description above the warning to get validation to pass. Is this desired?
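
To make the two cases above concrete, here is a minimal sketch. `load_sheet` is a hypothetical function, not pandas code, and the version number in the deprecation directive is only illustrative. `ends_with_period` is a simplified stand-in for the period rule described above; the real check in validate_docstrings.py may differ.

```python
def load_sheet(path, sheet_name=0, sheetname=None):
    """
    Load one sheet from a spreadsheet (hypothetical function).

    Parameters
    ----------
    path : str
        Location of the file.
    sheet_name : str or int, default 0
        Sheet to load. Accepted forms:

        * an ``int`` is a zero-based sheet position
        * a ``str`` is a sheet name.
    sheetname : str or int, optional
        Old spelling of ``sheet_name``.

        .. deprecated:: 0.21.0
           Use ``sheet_name`` instead.
    """
    # The deprecated spelling wins only when it is passed explicitly.
    return path, (sheet_name if sheetname is None else sheetname)


def ends_with_period(description):
    """Simplified rule: the last non-empty line must end with '.'."""
    lines = [line.strip() for line in description.splitlines() if line.strip()]
    return bool(lines) and lines[-1].endswith(".")
```

Only the last bullet carries a period here, which is the layout the validation asked for, and the deprecated parameter has a one-line description above its directive.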

timdef · 2018-11-05T03:47:46Z

Also, still currently have these errors:

3 Errors found:
	Errors in parameters section
		Parameters {**kwds, date_parser, parse_dates} not documented
		Unknown parameters {parse_cols, sheetname, skip_footer, keep_default_na, verbose}
2 Warnings found:
	No extended summary found
	See Also section not found

Wanted to see if I should take a stab at correcting these or if this work was more about bringing things in line format-wise with the validator.
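
The "not documented" and "Unknown parameters" errors above come from comparing a function's signature with its docstring. The following is a rough sketch of that comparison, assuming numpydoc-style sections. `documented_params`, `compare_params`, and `read_table` are hypothetical names for illustration; this is not how validate_docstrings.py is implemented.

```python
import inspect
import re


def documented_params(docstring):
    """Return the names listed in a numpydoc ``Parameters`` section."""
    lines = inspect.cleandoc(docstring).splitlines()
    try:
        start = lines.index("Parameters") + 2  # skip header and underline
    except ValueError:
        return set()
    names = set()
    for line in lines[start:]:
        if line and set(line) == {"-"}:  # underline of the next section
            break
        # Parameter lines are unindented and look like "name : type".
        match = re.match(r"(\*{0,2}\w+)\s*:", line)
        if match:
            names.add(match.group(1))
    return names


def compare_params(func):
    """Return (missing from docstring, unknown to signature)."""
    actual = set()
    for name, param in inspect.signature(func).parameters.items():
        if param.kind is param.VAR_POSITIONAL:
            name = "*" + name
        elif param.kind is param.VAR_KEYWORD:
            name = "**" + name
        actual.add(name)
    documented = documented_params(func.__doc__ or "")
    return actual - documented, documented - actual


def read_table(io, sheet_name=0, parse_dates=False, **kwds):
    """
    Read a table (hypothetical function).

    Parameters
    ----------
    io : str
        Path to the file.
    sheetname : str
        Old name of ``sheet_name``.
    sheet_name : str or int, default 0
        Sheet to read.
    """


missing, unknown = compare_params(read_table)
```

In this sketch `missing` holds the signature parameters with no docstring entry and `unknown` holds docstring entries that match no signature parameter, mirroring the two error lines in the report.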

codecov · 2018-11-05T04:10:07Z

Codecov Report

Merging #23505 into master will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##           master   #23505   +/-   ##
=======================================
  Coverage   92.23%   92.23%           
=======================================
  Files         161      161           
  Lines       51197    51197           
=======================================
  Hits        47220    47220           
  Misses       3977     3977

Flag	Coverage Δ
#multiple	`90.61% <ø> (ø)`	⬆️
#single	`42.27% <ø> (ø)`	⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 0bb24b7...a0a3f99.

gfyoung · 2018-11-05T04:26:15Z

> Wanted to see if I should take a stab at correcting these or if this work was more about bringing things in line format-wise with the validator.

@timdef : First off, congrats on your first PR! I think it would be a good idea to attempt to patch these.

datapythonista

Looking good, added some comments, mainly about our conventions.

I think a See Also section, pointing to to_excel, read_csv... would be useful too.

datapythonista · 2018-11-05T06:10:13Z

pandas/io/excel.py

    data will be read in as floats: Excel stores all numbers as floats
-    internally
+    internally.

 Returns
 -------


In the next line (GitHub doesn't let me add a comment there, as it's not modified): DataFrame or dict of DataFrame.

Then, in the examples, I'd remove the part that creates the Excel files. I'm sure users will understand that they need to pass a file that exists.

Just add # doctest: +SKIP after the lines with read_excel, so they don't run in the doctests (and don't fail).
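
The effect of the # doctest: +SKIP directive can be tried in isolation with the standard library's doctest module. This is a small sketch: the docstring text and the `run_examples` helper are made up for illustration, and `pd` is deliberately never imported, to show that a skipped example is not executed.

```python
import doctest

DOCSTRING = """
Examples
--------
>>> 1 + 1
2
>>> pd.read_excel('tmp.xlsx')  # doctest: +SKIP
      Name  Value
0  string1      1
"""


def run_examples(docstring):
    """Run the examples found in a docstring and return the doctest results."""
    parser = doctest.DocTestParser()
    # Empty globals: `pd` is undefined, so an unskipped call would fail.
    test = parser.get_doctest(docstring, {}, "example", None, 0)
    runner = doctest.DocTestRunner(verbose=False)
    return runner.run(test)


results = run_examples(DOCSTRING)
```

`results.failed` is 0 here because the read_excel line is never run, while the `1 + 1` example is still checked as usual.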

datapythonista · 2018-11-05T06:16:38Z

pandas/io/excel.py

@@ -40,17 +40,16 @@
 _writers = {}

 _read_excel_doc = """
-Read an Excel table into a pandas DataFrame
+Read an Excel table into a pandas DataFrame.



Can you add an extended summary giving a bit more information? Questions that came to mind, which users might have and which would be nice to answer there:

- Which sheet/s in the Excel file will be loaded?
- Which formats/versions of Excel do we support (xls, xlsx)?
- You can mention that the file to be opened can be in the local filesystem or at a URL.
- Anything else you think would be helpful for users to know about this function before reading the parameters (where these questions will be answered in many cases).

datapythonista · 2018-11-05T06:23:01Z

pandas/io/excel.py

-io : string, path object (pathlib.Path or py._path.local.LocalPath),
-    file-like object, pandas ExcelFile, or xlrd workbook.
+io : str, path object (pathlib.Path or py._path.local.LocalPath), \
+     file-like object, pandas ExcelFile, or xlrd workbook


We try to list the Python types comma separated (I'd like to be able to parse these types). This would make more sense to me:

io : str, file descriptor, pathlib.Path, ExcelFile or xlrd.Book

In the description you can add further information if needed.
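
As a rough illustration of why a flat, comma-separated type line is easier to parse, the sketch below splits such a line into its individual types. `split_param_types` is a hypothetical helper, not part of pandas or its validation script, and it does not handle extras such as "default 0" or "optional".

```python
import re


def split_param_types(type_spec):
    """Split a docstring type line on commas and on the word 'or'."""
    # Handles "a, b or c" as well as "a, b, or c".
    parts = re.split(r"\s*,\s*(?:or\s+)?|\s+or\s+", type_spec.strip())
    return [part for part in parts if part]


io_types = split_param_types(
    "str, file descriptor, pathlib.Path, ExcelFile or xlrd.Book"
)
```

With the suggested `io` line this yields five separate entries, whereas the parenthesised original form ("path object (pathlib.Path or py._path.local.LocalPath)") would not split cleanly.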

Since the parameter can be a "pathlib.Path or py._path.local.LocalPath", would it make sense to make the descriptor more general ("path")? I see that the Py library is in maintenance mode, just seeking to understand going forward!

datapythonista · 2018-11-05T06:37:44Z

pandas/io/excel.py

@@ -194,7 +198,7 @@
 1  string2      2
 2  string3      3

->>> pd.read_excel(open('tmp.xlsx','rb'))
+>>> pd.read_excel(open('tmp.xlsx', 'rb'))


I'd remove this example. Maybe you could show one with sheet_name='Sheet3' instead?

datapythonista

@timdef can you update based on the new and previous comments please?

datapythonista · 2018-11-09T15:13:45Z

pandas/io/excel.py

    a  b
 0   1  2
 1  #2  3

->>> pd.read_excel('tmp.xlsx', comment='#')
+>>> pd.read_excel('tmp.xlsx', comment='#') # doctest: +SKIP


I'd remove these two last examples. It assumes that the content of tmp.xlsx is different than before, which makes things trickier and not very easy to follow. And I don't think it's a very important use case.

datapythonista · 2018-11-09T15:14:22Z

pandas/io/excel.py


 Returns
 -------
-parsed : DataFrame or Dict of DataFrames
+parsed : DataFrame or dict of DataFrame


Suggested change

-parsed : DataFrame or dict of DataFrame
+DataFrame or dict of DataFrame

datapythonista · 2018-11-09T15:14:40Z

pandas/io/excel.py


 Returns
 -------
-parsed : DataFrame or Dict of DataFrames
+parsed : DataFrame or dict of DataFrame
    DataFrame from the passed in Excel file.  See notes in sheet_name


Suggested change

-    DataFrame from the passed in Excel file.  See notes in sheet_name
+    DataFrame from the passed in Excel file. See notes in sheet_name

jreback · 2018-11-11T16:30:30Z

can you merge master, pandas.io.excel was refactored a bit

datapythonista · 2018-11-21T01:29:36Z

@timdef do you have time to address the comments and fix the conflicts?

jreback · 2018-12-09T20:14:00Z

closing as stale. if you'd like to continue working, pls ping.

timdef added 8 commits November 4, 2018 17:15

Correct io parameter.

e084881

Fix sheetname and sheet_name parameter.

d885434

Correct squeeze, and engine.

a666d61

Add missing periods to parameter descriptions.

bca5441

Correct type from boolean to bool.

71e2e44

Add description to skip_footer, correct convert_float.

cd9172c

Correct linting errors in doctests.

901b06c

Fix short summary, equal spacing around warnings.

c8134e2

gfyoung added Docs IO Excel read_excel, to_excel labels Nov 5, 2018

gfyoung requested a review from datapythonista November 5, 2018 04:26

datapythonista reviewed Nov 5, 2018

timdef and others added 13 commits November 5, 2018 05:18

Correct return type.

da632e4

Update pandas/io/excel.py

b8f38f1

Co-Authored-By: timdef <[email protected]>

Update pandas/io/excel.py

430eabe

Co-Authored-By: timdef <[email protected]>

Update pandas/io/excel.py

2bab09b

Co-Authored-By: timdef <[email protected]>

Update pandas/io/excel.py

1476854

Co-Authored-By: timdef <[email protected]>

Update pandas/io/excel.py

eef668c

Co-Authored-By: timdef <[email protected]>

Update pandas/io/excel.py

8b4c26d

Co-Authored-By: timdef <[email protected]>

Update pandas/io/excel.py

5b3ef8e

Co-Authored-By: timdef <[email protected]>

Update pandas/io/excel.py

bc9334d

Co-Authored-By: timdef <[email protected]>

Merge remote-tracking branch 'upstream/master' into excel-related-doc…

b288ff8

…strings

Merge branch 'excel-related-docstrings' of github.com:/timdef/pandas …

b0922c3

…into excel-related-docstrings

Remove excel file creation from examples.

9f0d2a7

Add extended summary.

fdd1494

timdef added 2 commits November 6, 2018 06:44

Clean up io parameter

473d077

Cleaned up sheetname parameter.

a0a3f99

datapythonista reviewed Nov 9, 2018

jreback closed this Dec 9, 2018
