-
-
Notifications
You must be signed in to change notification settings - Fork 170
Disable GL01 check in numpydoc by default. #241
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
Thanks for investigating this @r-build ! |
@datapythonista can you weigh in? FWIW I also disabled this when I migrated the MNE-Python checks. |
Can't look at the links now, but if the standard is having the text in the same line of the quotes, my preference is to make the rule validate that this is the case. We'll have to change hundreds of them in pandas, but better in the long term than doing tricky things. For disabling things, I wouldn't have that in |
To be correct, IMO we didn't make a "wrong" choice in pandas, we just made "a" choice, since the standards are (explicitly) ambiguous. PEP257 says: "The summary line may be on the same line as the opening quotes or on the next line", and the numpydoc standard does not say it explicitly, but has conflicting examples (the main example explaining that section has it on the same line, the example for a class docstring on the next line). My preference would be that we don't need to change it all again in pandas. But that probably means leaving this as an option for the users (using projects), which will complicate the check in numpydoc? |
Indeed, looks like both choices are possible according to the PEP257. This was initially raised because this rule was producing a large number of cosmetic changes in scikit-learn (scikit-learn/scikit-learn#15444 (comment)), but I understand that changing it the other way around in pandas would be annoying as well. Maybe we could for instance,
cc @jnothman |
I agree the user should opt.
There are other stylistic variations within the numpydoc community which I
don't think are checked by this validator, like the use of a blank line
between parameter definitions.
|
For unambiguous checks, I agree with @datapythonista that "I want to ignore this" can and should be implemented by the end user:
But for ambiguous checks where there are variants, it might make sense to add complications to the validator function to say which standard you want to follow. So for example we might want Currently in MNE we are just ignoring it, but generally adhere to the "same" policy, so it would be great if we could remove it from our ignore list and pass The only other option I can think of that would work would be to allow conflicting standards, e.g. a GL00 that says "Title must start on the same line as triple quotes" along side the GL01 saying it must start on the following line -- then sklearn/MNE can ignore GL01 and pandas ignores GL00. But I don't like that quite as much as it forces a user to always have to ignore something (no docstring can ever pass without some explicit ignore). |
... actually thinking about it more and looking around a bit, I'm thinking the conflicting GL00/GL01 (or whatever numbers) isn't such a bad option. For example it exists in
In In the command-line case we can have a default |
I think we can probably agree in one of the conventions for all projects. I guess there are just historical reasons to have two conventions, and I think in the long run we'll benefit from just having one. The convention docs will be simpler, and people contributing in several projects will save some neurons on not having to remember which format is used in each project. In pandas we use: def foo():
"""
Not in the line of the quotes.
""" That was after some discussion on what people find more readable I think. And I'd say we also wanted to have 3 extra characters for the one liner summary. Is there any reason for starting with the quotes, besides being the convention currently followed? My preference is the pandas convention, since it was our preference and we spent significant resources changing those. But if everybody else prefer the opposite, I prefer to have one convention and change again the pandas docstrings. |
Why not allow both options, as specified deliberately by PEP-257? (and the google python style guide) To me, restricting to one or the other seems like a choice that doesn't need to be made. Regarding triple quotes for one-liners, PEP 257 says "Triple quotes are used even though the string fits on one line. This makes it easy to later expand it." |
Depends probably on who you mean by "we" here. I'm happy to speak for one or two projects, but even choosing one for e.g. SciPy would be a pain, let alone trying to dictate it for multiple projects. For SciPy, being able to check "it fits GL00 or GL01" is better than only "it fits GL01" or "it fits GL02" because it would be a pain to transition everything, and might not ever actually get done. I agree one standard would be nice, but I think it's going to be very difficult to unify, and get people to transform their code. Different repos already implement different flavors and strictness via To me the simplest thing is just to implement both, push one (by enabling it by default and disabling the other by default in the command line thing), but allow people to do either. If others disagree and think we should force one standard (or even if we go with the "allow both" option, we need to choose the default), one reason for "start on the line with the quotes" would be for consistency with pydocstyle |
I still feel that having two conventions is like having some countries driving on the right, and other driving on the left. It's an arbitrary decision, and while for few weeks we'll have a bit more work standardizing, deciding a convention now, will make our lives easier in the future. If projects disagree one option is to have both checks as you say. But having to always ignore a rule is not great. Or having to specify a Not sure if I prefer to simply get rid of that check, and I'll move it to pandas to the custom checks. |
I think the burden for making this shift community-wide is pretty big compared to our cost of maintaining two GL0X's in numpydoc (and people needing to know, for a given project, which standard to follow). Under the analogy, there are already countries where people drive on the left, some on the right, and some on both sides. Forcing a particular standard is like telling England they can either start driving on the right, or we will not supply them with standardized traffic direction designations / checks because we have decided drive-on-the-right is the way to go.
I agree, I've come around to thinking that having two conflicting checks that are easily filtered (since most repos will do filtering via a
This to me seems worse than enabling both checks. Instead of having each repo (or really the half that doesn't ascribe to what we decide should be the standard There is precedence for having conflicting checks (e.g., W503, W504) and giving people a way around it (choosing a default Also @jnothman @rgommers if you have opinions here it might help us figure out the way to go. |
Could the GL01 check in numpydoc be dsiabled by default, please?
It contradicts the examples at the numpydoc docstring guide, and also PEP-257
where it says "The summary line may be on the same line as the opening quotes or on the next line.".
The code seems to be in numpydoc/validate.py :
"GL01": "Docstring text (summary) should start in the line immediately " "after the opening quotes (not in the same line, or leaving a " "blank line in between)",
and
if doc.start_blank_lines != 1 and "\n" in doc.raw_doc: errs.append(error("GL01"))
The text was updated successfully, but these errors were encountered: