
Esiegerman/summary colors #776


Merged: 9 commits merged into pytest-dev:master on Jul 3, 2015

Conversation

esiegerman

Recreating this PR on Github. No change from the Bitbucket version, except that it's been rebased against master -- so there's nothing new to review until I update it.

@nicoddemus

Thanks, taking the liberty to add a link to the original PR: https://bitbucket.org/pytest-dev/pytest/pull-request/304

@nicoddemus nicoddemus mentioned this pull request Jun 16, 2015
# suffice

# Important statuses -- the highest priority of these always wins
("red", "1 failed", {"failed": (1,)}),
Member:

pls use PEP 8

Author:

Re. the spacing, you mean? Will do. (A previous change of mine was accepted looking like this, so I did the same this time.)

Member:

Yes, spacing; it's not per PEP 8.
Not sure why new code should break PEP 8.

Author:

Well, because, as PEP8 itself says:

Some other good reasons to ignore a particular guideline:

  1. When applying the guideline would make the code less readable, even for someone who is used to reading code that follows this PEP.

IMO, code that's essentially a big table is more readable when it's formatted as such.

I wasn't sure what pytest's conventions are, so I submitted the code in the format I prefer. If that isn't what pytest prefers, I'm happy to change it -- but I'd rather not do that until I'm finished with it, i.e. until everything else about the PR has been approved.
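For illustration, the columnar "big table" style under discussion looks roughly like this (rows adapted from this PR's test cases; the constant name is illustrative, not the PR's actual code):

```python
# Illustrative sketch only: the aligned "big table" test-case style being
# debated. Rows mirror this PR's test cases; CASES is not pytest's code.
CASES = [
    # color,   summary text,          status counts
    ("red",    "1 failed",            {"failed": (1,)}),
    ("red",    "1 error",             {"error":  (1,)}),
    ("red",    "1 passed, 1 error",   {"error":  (1,), "passed": (1,)}),
]
```

The extra spaces that align the columns are exactly what strict PEP 8 spacing would collapse, which is the trade-off being argued here.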

@nicoddemus

I'm 👍 on the idea behind this PR, btw. 😄

("red", "1 error", {"error": (1,)}),
("red", "1 passed, 1 error", {"error": (1,), "passed": (1,)}),

("red", "1 xpassed", {"xpassed": (1,)}),
Member:

Shouldn't this be "yellow"?

@esiegerman

Propagating comments from Bitbucket...

@hpk42 wrote:

the practical reason for xpass not causing red is that xfail is somewhat often used for flaky tests.

To which I responded:

Fair enough. I haven't used it that way -- only for hard failures -- but I see your point. (Indeed, I have a couple of tests that this would make sense for.)

How would you feel about a way to distinguish intermittent/flaky tests from those that are expected to fail every time (e.g. tests for bugs that haven't been fixed yet)? Could do this either as an argument to xfail, or as a new marker entirely. My preference would be the latter; known-broken test and flaky test feel to me like different things, rather than variations of the same thing.

If the idea works for you, I'll dive deeper into pytest to implement it.

In the meantime, I agree; if people are using xfail for flaky tests, red is too serious for xpassed. I can justify arguments for any of yellow, green, or "boring" (i.e. has no effect on the colour). What's your preference?

If folks like the idea of separate markers for "fails intermittently" vs. "fails always":

  • Which of those should xfail (continue to) be used for? The documentation says it's for tests that are "expected to fail", which I take to mean "fails always". But if in practice it's used more often for tests that fail intermittently, that could be an argument for redesignating xfail as such and updating the docs accordingly
  • I'd appreciate suggestions for naming the other one

@The-Compiler

I agree there should be some difference between "is expected to fail" and "is flaky".

I personally use xfail for "I found a bug while writing tests, or found a bug and wrote a test for it - but I don't want to fix it immediately, and I don't want my CI failing because of it". (I don't have any flaky tests, fortunately)

IMHO, xpassed should be red, and a new pytest.mark.flaky marker should be added - then it's also immediately clear for someone reading the tests that this test is flaky, not expected to fail.
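For context, the proposed marker does not exist in pytest; a project could sketch it today by registering a custom marker in conftest.py (the marker name and description below are assumptions, mirroring this suggestion):

```python
# conftest.py -- hypothetical sketch only: "flaky" is NOT a built-in
# pytest marker. Registering it here lets tests use pytest.mark.flaky
# without an "unknown marker" warning, making the "flaky" vs.
# "expected to fail" distinction visible at the test-definition level.

def pytest_configure(config):
    config.addinivalue_line(
        "markers",
        "flaky: test fails intermittently (distinct from xfail's 'fails always')",
    )
```

Tests could then be annotated with `@pytest.mark.flaky`, while xfail keeps its "expected to fail" meaning.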

@nicoddemus

IMHO, xpassed should be red, and a new pytest.mark.flaky marker should be added - then it's also immediately clear for someone reading the tests that this test is flaky, not expected to fail.

I like this idea... I have used xfail exclusively for flaky tests currently, but a flaky mark would be more appropriate.

Backward compatibility aside: Supposing then that we have two different marks, xfail and flaky, what would happen if an xfail test started to pass? That should count as a failure, I think, because that might be a red flag (no pun intended) that something is amiss, or the expected failure was actually fixed so the test can have the xfail mark removed.

Backward compatibility back on: unfortunately we can't change the current meaning of xfail regarding test suite failure/success, so my proposal in the previous paragraph wouldn't be possible.

Perhaps a new mark is out of the scope of this PR, and we should just agree that xfail as it stands right now actually means both "will always fail because of reasons" and "flaky", and decide which color to assign to it? I vote for yellow. 😁

@esiegerman

Perhaps a new mark is out of the scope of this PR

Agreed.

I think changing xpass/xfail behaviour is also out of scope. I thought that making xpass red was a trivial change -- didn't realize the larger issues involved. So I'll factor that out of this PR.

@esiegerman

[ OK, I've pretty much started from scratch. I wasn't sure how this project feels about git push -f on a still-being-reviewed PR branch, so I git reverted my old commits instead. That can be cleaned up prior to merging. ]

In this version, xpass is treated the same as xfailed, deselected and skip: it's ignored when deciding what color the summary bar should be. Thus, if you have "5 passed, 3 xfailed, 1 xpassed", you get a green bar, but if you only have "3 xfailed, 1 xpassed", you get yellow. Rationale: since the meaning of xpass is indeterminate (see earlier discussion in this PR), there's no obvious correct color to give it -- green is no better than red, since either one might be a lie.

I've also backed the implementation off to the old one, much more lightly modified, at @hpk42's request.

I have not yet PEP8'ified the test cases. I'll do that just before merge; as long as we're still hashing things out, I find the nicely columnar format a lot easier to work with.
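The color rule described in this comment can be sketched roughly as follows (the function name and status keys are illustrative, not pytest's actual implementation, and the unknown-status handling mentioned in the commits is omitted):

```python
# Rough sketch of the summary-bar color rule described above; not
# pytest's actual code.

def summary_color(counts):
    """Pick the summary bar color from a {status: count} mapping."""
    # Highest-priority statuses always win: any failure or error is red.
    if counts.get("failed") or counts.get("error"):
        return "red"
    # Passing tests make the bar green...
    if counts.get("passed"):
        return "green"
    # ...but "boring" statuses (xfailed, xpassed, deselected, skipped)
    # are ignored, so a run with only those, or no tests at all,
    # leaves the bar yellow.
    return "yellow"
```

Under these rules, "5 passed, 3 xfailed, 1 xpassed" yields green, while "3 xfailed, 1 xpassed" alone yields yellow, matching the examples above.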

@nicoddemus

OK, I've pretty much started from scratch. I wasn't sure how this project feels about git push -f on a still-being-reviewed PR branch, so I git reverted my old commits instead. That can be cleaned up prior to merging.

We have not discussed this in detail, but I think it is OK if you're the only person working on the branch. But please do rebase it/clean up the commits once we are ready to merge. 😄

I have not yet PEP8'ified the test cases. I'll do that just before merge; as long as we're still hashing things out, I find the nicely columnar format a lot easier to work with.

Seems fair enough to me! 😄

@nicoddemus

Looks 👍 to me. 😄

@The-Compiler

Looks good to me as well!

@esiegerman esiegerman force-pushed the esiegerman/summary_colors branch from 5493562 to 7b989e6 Compare July 2, 2015 17:33
Eric Siegerman added 8 commits July 2, 2015 13:39:

  • This makes it easier to identify failing tests.
  • Check for the empty-key special case in the first loop, not the second.
  • Also if we see any statuses the code doesn't know about.
  • Passing tests override that default, making the color green; but several other "boring" statuses (xfailed, xpassed, deselected, skipped) have no effect. Net effect: if only "boring" tests are seen, or no tests at all, the summary bar is yellow.
@esiegerman esiegerman force-pushed the esiegerman/summary_colors branch from 7b989e6 to afcad74 Compare July 2, 2015 17:41
@esiegerman

Junk commits removed; remaining commits slightly reorganized; PEP8ified; rebased. No functional change.

@nicoddemus

Many thanks @esiegerman! 😄

One last request (sorry for not mentioning it earlier): Please add yourself to AUTHORS and add a note to the CHANGELOG describing this change.

I will merge this shortly if no one opposes this.

@esiegerman

Done. I've called it only a partial fix for #500, as that issue's OP asked in a comment for an explicit warning message (as well as the color change).

@The-Compiler The-Compiler mentioned this pull request Jul 3, 2015
@nicoddemus nicoddemus merged commit 2c419c4 into pytest-dev:master Jul 3, 2015
@nicoddemus

Merged, thanks again @esiegerman! 😄
