Refactor tests for ops #6027
Thanks @oke-aditya. Before moving large sections of tests into separate files, I think it might be worth reaching consensus in #6020 first. I'm a bit wary about such changes, as they're really not as free as they might seem.
def test_ciou_jit(self) -> None:
    self._run_jit_test([[0, 0, 100, 100], [0, 0, 50, 50], [200, 200, 300, 300]])

def test_float16_box(self):
We could probably make this a bit more compact by parametrizing with pytest.param. Hmm, but I figured it's OK as it is; this isn't so bad?
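A minimal sketch of what that parametrization could look like. The test name, dtype strings, tolerance values, and the pure-Python area computation below are all illustrative, not the actual torchvision tests:

```python
import pytest

# Hypothetical sketch: instead of separate per-dtype test methods, fold the
# dtype-specific tolerances into one parametrized test via pytest.param.
@pytest.mark.parametrize(
    ("dtype", "atol"),
    [
        pytest.param("float32", 1e-5, id="fp32"),
        pytest.param("float16", 1e-3, id="fp16"),
    ],
)
def test_box_area(dtype, atol):
    # Box in (x1, y1, x2, y2) format; plain Python arithmetic stands in for
    # the real op so the sketch is self-contained.
    x1, y1, x2, y2 = 0.0, 0.0, 100.0, 50.0
    area = (x2 - x1) * (y2 - y1)
    assert abs(area - 5000.0) <= atol
```

Each `pytest.param` entry becomes its own test case with a readable id, which keeps the per-dtype tolerances visible in one place.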
Do you know why this check is special-cased and not grouped with the other float dtypes?
Not sure; probably to check for overflow from float16 to float32?
Could you please check if the test above works with float16 and potentially remove this one?
Yes @NicolasHug, I get your point. The idea is: let's finish reviewing and adjusting the code, and I will eventually move the code back into the same file. I took it out so that I could analyse it independently, which made it slightly easier to run the tests. We won't keep this file structure at the end of the PR.
test/test_losses.py
Outdated
loss = iou_fn(box1, box2, reduction="mean")
loss.backward()
tol = 1e-3 if dtype is torch.half else 1e-5
torch.testing.assert_close(loss, torch.tensor(0.0, device=device), rtol=tol, atol=tol)
Wait, we want the loss to be 0 if no boxes (i.e. boxes with a degenerate shape) were passed and we use reduction="mean"? But if we set reduction="none", shouldn't the loss be empty?
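For reference, the reduction semantics being questioned can be sketched in NumPy (standing in for torch; the variable names are illustrative): with an empty per-element loss, "none" stays empty, "sum" is 0.0, and a naive "mean" is 0/0, i.e. NaN — which is why empty inputs may need special-casing.

```python
import numpy as np

# With no boxes, the per-element loss has zero entries.
per_element_loss = np.zeros((0,), dtype=np.float32)

none_result = per_element_loss        # reduction="none": empty array, shape (0,)
sum_result = per_element_loss.sum()   # reduction="sum": 0.0
mean_result = per_element_loss.mean() # naive reduction="mean": 0/0 -> nan
                                      # (NumPy emits a RuntimeWarning here)
```

So asserting that the mean-reduced loss equals 0.0 for empty inputs only makes sense if the op special-cases the empty batch rather than dividing by zero.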
I will get back to this on Tuesday. I'm shifting homes this week 🏡.
I think this weekend might give me some time. :(
@oke-aditya No worries. We thankfully managed to merge the PR that cleaned up the code prior to the release. This is less urgent.
@oke-aditya Do you need help finishing this? I have some dedicated time this afternoon (UTC+2), so I can review the code and/or help finish it. Let me know. 👌
Hey, I'm back on this. I was about to give this some time today. I have finally settled into my new (or perhaps old?) home town 😄 Feel free to help with reviewing, as Philip is on PTO.
I will. Good luck with what remains. 👍
Hi @NicolasHug, @datumbox, @YosuaMichael, could someone please review the PR, as Philip is having a well-deserved PTO :)
Thanks a lot @oke-aditya! I had a quick look and it looks like an improvement. I have only a few nits.
Also, splitting the tests into separate files is definitely worth it IMO, but I would expect the box-op tests to all be together, the losses all together, etc. Definitely not a blocking issue for me, and we can review the structure in a future PR.
test/test_ops.py
Outdated
    dtype=torch.float16,
)
expected = torch.tensor([605113.875, 600495.1875, 592247.25], dtype=torch.float32)
self.area_check(box_tensor, expected)
As @NicolasHug highlighted above, we are trying to check float16 values, which are capped at about 65k, against expected values of around 600k. This shouldn't work. We should revise the test values to avoid overflows.
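The overflow is easy to demonstrate in NumPy (a standalone sketch; torchvision's own code is not involved):

```python
import numpy as np

# float16 tops out at 65504, so expected areas around 600k cannot be
# represented and collapse to inf.
print(np.finfo(np.float16).max)               # 65504.0
print(np.float16(605113.875))                 # inf

# The overflow can also happen mid-computation: the product of two
# in-range float16 values can itself exceed the float16 range.
print(np.float16(800.0) * np.float16(800.0))  # inf (640000 overflows)
```

Keeping both the box coordinates and their products comfortably below 65504 is what the revised test values need to guarantee.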
Yep, I understand now. Let me revise the tests.
test/test_ops.py
Outdated
    dtype=torch.float16,
)
expected = torch.tensor([321.1562, 368.4375, 1852.2500], dtype=torch.float16)
self.area_check(box_tensor, expected, atol=0.3)
Well, sadly, for small boxes with both tensors kept as float16, we need to loosen atol. I can't find another workaround.
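One way to see why float16 forces a looser atol: the spacing between adjacent representable float16 values (the ulp) grows with magnitude, so near the expected areas above the resolution is already coarse. A NumPy sketch:

```python
import numpy as np

# np.spacing gives the gap to the next representable value at that magnitude.
print(np.spacing(np.float16(321.0)))   # 0.25
print(np.spacing(np.float16(1852.0)))  # 1.0
```

Near 1852 a single rounding step is already 1.0 wide, so a float16 result can legitimately land up to half a ulp away from the true value; a tolerance much tighter than that spacing would make the test flaky.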
It's hard to decipher the diff. What test does this correspond to in main? It looks like both the input and the expected values were changed. Is this on purpose?
It is on purpose, so that the box_area result also stays within the float16 input format after multiplication.
Important: atol=0.3 is extremely large and might prevent us from detecting problems. Is that the minimum you can do? If so, it's worth reviewing the box values so you can use a smaller tolerance like 0.001, which I believe was used previously.
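For context, assert_close follows the standard allclose rule: a result passes when |actual − expected| ≤ atol + rtol·|expected|. A small pure-Python sketch of what atol=0.3 lets through (the concrete numbers are illustrative):

```python
# The standard closeness rule used by torch.testing.assert_close and
# numpy.allclose alike.
def within_tol(actual, expected, rtol=0.0, atol=0.0):
    return abs(actual - expected) <= atol + rtol * abs(expected)

# With atol=0.3, an absolute error of ~0.26 goes undetected...
print(within_tol(368.70, 368.4375, atol=0.3))    # True
# ...whereas the previous atol=0.001 would have caught it.
print(within_tol(368.70, 368.4375, atol=0.001))  # False
```

This is why a large atol is risky: it hides genuine regressions whose magnitude is below the tolerance, independently of rtol.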
@oke-aditya Any more help/review needed here? I have some time this Friday for OSS, so let me know. Thanks. 👌
Seems we are very close to finishing this. Sadly I'm busy at the office and sunsetting my old laptop to move onto a new one (it's going to take some time). You can help me wrap up this PR by simply sending a PR to my fork of torchvision, on this branch; I can merge that. Then just ping Nicolas / Vasilis if anything else is needed.
Alright @oke-aditya, I will do this on Friday and let you know. Good luck resetting your computer (I hope you have some of your dotfiles saved somewhere. 😄)
Yeah, the challenge is that the new laptop has a GPU, and I never had a GPU locally before. So setting that up with NVIDIA drivers, building torchvision from source, etc. is going to be a cool challenge. (Yes, until now I did all my contributions and learned deep learning without a GPU :) Colab was my messiah.)
@oke-aditya compiling from source with a GPU should be seamless. If it's not, we will take your PR, thank you very much. 😄
Yeah, hopefully :) Doing it for the first time. All should go well 😊
@oke-aditya After having a quick look, I'm not sure what I can do here. If you still need help, please point to a few things and I will be happy to help, of course (this Saturday I will have some time). Have a good weekend in advance. 👋
Major blockers are cleared! I could successfully build PyTorch from source, which is sadly needed, as torch binaries with CUDA 11.7 aren't available in the nightlies yet :( I should wrap up the PR by Wednesday 🤠
@NicolasHug how do we go about this? Is it good to go?
@oke-aditya seems there is a related issue with the linter:
I can fix that 😃. Anything else apart from that?
@oke-aditya I've left 4 comments below, but you need to take action only for the tolerance value (+ the linter).
Overall I'm tempted to merge this because:
- It's an improvement compared to the previous tests, which had various issues.
- It removes 200 lines of code from the tests.
Your PR remained open for so long because it appears very lengthy and touches multiple lines. In practice you only make minor changes to the main parts of the classes, but GitHub fails to see that. There are other improvements I felt could have been made in this PR, but it's already too long and they can be handled in a follow-up. To speed up reviews, I think it's worth splitting the work into chunks and actively trying to structure the code in a way that leads to a more compact GitHub diff (though I know this isn't always possible).
After you make the changes on the linter and the tolerance value, I intend to merge unless someone else has a strong opinion against it. cc @NicolasHug @pmeier
assert_iou_loss(ops.distance_box_iou_loss, box1, box1, 0.0, device=device)
assert_iou_loss(ops.distance_box_iou_loss, box1, box2, 0.8125, device=device)
assert_iou_loss(ops.distance_box_iou_loss, box1, box3, 1.1923, device=device)
assert_iou_loss(ops.distance_box_iou_loss, box1, box4, 1.2500, device=device)
assert_iou_loss(ops.distance_box_iou_loss, box1s, box2s, 1.2250, device=device, reduction="mean")
assert_iou_loss(ops.distance_box_iou_loss, box1s, box2s, 2.4500, device=device, reduction="sum")
Future improvement: I know these values were also used originally, but I think it's problematic that both the cIoU and dIoU losses are tested against examples that yield identical outputs. I.e., if I send a PR that replaces cIoU with dIoU, how do we catch the issue?
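To illustrate the concern, here is a pure-Python distance-IoU loss (1 − IoU + d²/c², the standard dIoU definition). The boxes are hypothetical: assuming box1 = [-1, -1, 1, 1] and box2 = [0, 0, 1, 1] reproduces the 0.8125 value above. cIoU adds an aspect-ratio penalty on top of dIoU, so test pairs whose aspect ratios differ would make the two losses diverge and catch a swap:

```python
# Sketch of the distance-IoU loss for axis-aligned (x1, y1, x2, y2) boxes.
def diou_loss(b1, b2):
    (x1, y1, x2, y2), (u1, v1, u2, v2) = b1, b2
    # Intersection-over-union of the two boxes.
    inter_w = max(0.0, min(x2, u2) - max(x1, u1))
    inter_h = max(0.0, min(y2, v2) - max(y1, v1))
    inter = inter_w * inter_h
    union = (x2 - x1) * (y2 - y1) + (u2 - u1) * (v2 - v1) - inter
    iou = inter / union
    # Squared distance between the box centres.
    d2 = ((x1 + x2) / 2 - (u1 + u2) / 2) ** 2 + ((y1 + y2) / 2 - (v1 + v2) / 2) ** 2
    # Squared diagonal of the smallest enclosing box.
    c2 = (max(x2, u2) - min(x1, u1)) ** 2 + (max(y2, v2) - min(y1, v1)) ** 2
    return 1.0 - iou + d2 / c2

print(diou_loss([-1, -1, 1, 1], [0, 0, 1, 1]))  # 0.8125
```

For these two boxes the aspect ratios are equal, so cIoU's extra term is zero and cIoU == dIoU, which is exactly why such examples cannot distinguish the two losses.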
@pytest.mark.parametrize("device", cpu_and_gpu())
@pytest.mark.parametrize("dtype", [torch.float32, torch.half])
@pytest.mark.parametrize("seed", [4, 5])
def test_jit(self, alpha, gamma, reduction, device, dtype, seed):
Future improvement: It's kinda problematic that the only loss that checks for JIT scriptability is the focal loss. Why this discrepancy in the testing strategy? I would imagine that a refactoring of the op tests would resolve this.
def test_empty_distance_iou_inputs(dtype, device) -> None:
    box1 = torch.randn([0, 4], dtype=dtype, device=device).requires_grad_()
    box2 = torch.randn([0, 4], dtype=dtype, device=device).requires_grad_()

class TestFocalLoss:
<rant>
I diffed the previous code against the new one. This indicated that you made only minor modifications to the class, which GitHub fails to highlight. This is where splitting the PR into multiple ones comes in handy and can speed up reviews.</rant>
Seems I should learn to use ghstack (https://github.com/ezyang/ghstack). That would make this simple.
So I have fixed the required issues. I see that even more improvements can be made, and in particular we need to address #6317
LGTM, thanks!
Summary:
* Refactor tests
* Remove tol, fix comments
* Add tolerance only where necessary
* Refactor to adapt suggestions
* Refactor and add nits
* Refactor box area
* Refactor to one file
* Adapt almost all except area
* Final update
* Tighten for jit
* Refactor slightly
* Fix tests

Reviewed By: NicolasHug
Differential Revision: D38351754
fbshipit-source-id: 31a7cc6b21a875bab52d5377c52c22c387407206
Co-authored-by: Vasilis Vryniotis <[email protected]>
Part 2. Closes #5976
Second part of refactor. This touches only the tests.
Currently I have created 2 separate files so that the diff is slightly easier to review. I can merge those files back into test_ops.py if needed. 😄
cc @pmeier