Skip to content

Split autoformatters and linters into different workflows and CI jobs #5167

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
pmeier opened this issue Jan 6, 2022 · 5 comments
Open

Comments

@pmeier
Copy link
Collaborator

pmeier commented Jan 6, 2022

Status quo

Currently we mix formatters and linters with pre-commit:

formatters

- id: mixed-line-ending
args: [--fix=lf]
- id: end-of-file-fixer

- id: ufmt
additional_dependencies:
- black == 21.9b0
- usort == 0.6.4

linters

- id: check-docstring-first
- id: check-toml
- id: check-yaml
exclude: packaging/.*

- id: flake8
args: [--config=setup.cfg]

- id: pydocstyle

In addition we have a separate CI job that lints with mypy

type_check_python:

Proposal

I propose we change the rationale from the two CI jobs from pre-commit and mypy to code format and lint. That means, we would only keep autoformatters in our pre-commit configuration and move all linters to what is currently the mypy job.

Pros

  • We could use pre-commit as "single source of truth" for formatting code. Currently they mentioned as "purely optional" in our contribution guide. Since pre-commit supports running the hooks manually, we can simply treat it as our way to bundle all autoformatters while the user does not need to know or care what exactly is run.
  • We could simply add new autoformatters if they are available through pre-commit, e.g. add prettier as non-code auto formatter #5158. If anyone applies code formatting through pre-commit anyway, that won't break any workflow.

Cons

  • We would loose the ability to run linters in a bundled manner. I don't think this is a strong con, since linters such as flake8 or pydocstyle trigger very seldom anyway due to the auto-formatting. Plus, we already need to run mypy separately.

cc @seemethere

@datumbox
Copy link
Contributor

datumbox commented Jan 7, 2022

Does it mean that the linters will run only via pre-commit instead of manually?

What I liked in the original proposal when you introduced formatting was the fact that you didn't force upon users a specific workflow aka using pre-commit. I personally prefer submitting these commands manually, after I'm done and committed my code changes and then review independently any automatic changes done by flake8, black etc.

@pmeier
Copy link
Collaborator Author

pmeier commented Jan 7, 2022

Does it mean that the linters will run only via pre-commit instead of manually?

Partially yes, but this is already true today. The following hooks cannot be run without pre-commit:

- repo: https://github.com/pre-commit/pre-commit-hooks
rev: v4.0.1
hooks:
- id: check-docstring-first
- id: check-toml
- id: check-yaml
exclude: packaging/.*
- id: mixed-line-ending
args: [--fix=lf]
- id: end-of-file-fixer

For everything else, you can of course also run manually. It gets a little trickier if you have formatters that cannot be installed by pip. For example prettier in #5158 needs a JS environment. So you would need to set it up manually. pre-commit handles this automatically.

I personally prefer submitting these commands manually, after I'm done and committed my code changes and then review independently any automatic changes done by flake8, black etc.

flake8 is a linter and as proposed above it will be removed from our pre-commit config.

Regarding black I'm not sure what there is to review. It guarantees AST equality, so there should never be a situation where a manual review should be needed.

@datumbox
Copy link
Contributor

datumbox commented Jan 7, 2022

flake8 often picks up import problems that black doesnt. I'm pretty use there are other things that are not covered by black, but don't remember specifics.

I'm skeptical about forcing upon developers pre-commit. I personally don't use it and I don't like the idea of having changes automatically done on my code silently without the ability to review them. We had issues in the past with tools that rewrote the python code that their outcome conflicted with JIT. If those rewrites are done on-commit, it's going to be a hell to figure out what happened and where.

@pmeier
Copy link
Collaborator Author

pmeier commented Jan 7, 2022

flake8 often picks up import problems that black doesnt.

That is not true (EDIT: I missed the word "import" in @datumbox's statement. I agree, unused imports is one of the few things that ufmt will not handle). Especially in combination with usort, there are very few things that are left for it to pick up. Quoting from our contribution guidelines:

Similarly, you can check for flake8 errors with flake8 torchvision, although they should be fairly rare considering that most of the errors are automatically taken care of by ufmt already.

It basically comes down to stuff that cannot be fixed automatically, e.g. from foo import *.

I personally don't use it and I don't like the idea of having changes automatically done on my code silently without the ability to review them.

That can't happen. If you have the hooks installed, git commit something, and hook fails, it is as if the git commit has never happened. You will see all changes made by the hooks as unstaged changes that you need to manually stage again. Only if no hook fails, git commit is actually executed.

Besides, just using the pre-commit framework does not mean you have to use the hooks. You can simply do pre-commit run, which will execute all formatters on the staged files. If you don't pre-commit install, you will have no changes whatsoever to the git behavior. Thus, without installing the hooks, pre-commit basically acts as grouper for all formatters while also handling setting them up in their own environments.

@NicolasHug
Copy link
Member

flake8 often picks up import problems that black doesnt.

That is not true.

From personal experience the one thing that black doesn't pick up is xyz imported but not used. This happens often when refactoring (or just writing) code.

I see value in allowing pre-commits, but I wouldn't make this mandatory either. It's just sugar on top of the more low-level tools, and I think it's important to keep them. To answer a question from @pmeier from another thread #5168 (comment)

Does a contributor need to know for example how to run all linters manually?

IMHO yes, if only for educational purpose. It's important that they know what's happening under the hood, for example to resolve errors more efficiently.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants