Refactor tests for ops #6027
Thanks @oke-aditya. Before moving large sections of tests into separate files, I think it might be worth reaching consensus in #6020 first. I'm a bit wary about such changes, as they're really not as free as they might seem.
def test_ciou_jit(self) -> None:
    self._run_jit_test([[0, 0, 100, 100], [0, 0, 50, 50], [200, 200, 300, 300]])

def test_float16_box(self):
We could probably make this a bit more compact by parametrizing with pytest.param. Hmm, but I figured it's OK as it is; this isn't so bad?
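A minimal sketch of what that parametrization could look like. The test name, dtype strings, tolerance values, and the pure-Python area computation below are all illustrative, not the actual torchvision tests:

```python
import pytest

# Hypothetical sketch: instead of separate per-dtype test methods, fold the
# dtype-specific tolerances into one parametrized test via pytest.param.
@pytest.mark.parametrize(
    ("dtype", "atol"),
    [
        pytest.param("float32", 1e-5, id="fp32"),
        pytest.param("float16", 1e-3, id="fp16"),
    ],
)
def test_box_area(dtype, atol):
    # Box in (x1, y1, x2, y2) format; plain Python arithmetic stands in for
    # the real op so the sketch is self-contained.
    x1, y1, x2, y2 = 0.0, 0.0, 100.0, 50.0
    area = (x2 - x1) * (y2 - y1)
    assert abs(area - 5000.0) <= atol
```

Each `pytest.param` entry becomes its own test case with a readable id, which keeps the per-dtype tolerances visible in one place.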
Do you know why this check is special-cased and not grouped with the other float dtypes?
Not sure; probably to check for overflow from float16 to float32?
Could you please check if the test above works with float16 and potentially remove this one?
Yes @NicolasHug, I get your point. The idea is: let's finish reviewing and adjusting the code, and I will eventually move the code back into the same file. I took it out so that I could analyse it independently, which made it slightly easier to run the tests. We won't keep this file structure at the end of the PR.
test/test_losses.py
Outdated
loss = iou_fn(box1, box2, reduction="mean")
loss.backward()
tol = 1e-3 if dtype is torch.half else 1e-5
torch.testing.assert_close(loss, torch.tensor(0.0, device=device), rtol=tol, atol=tol)
Wait, we want the loss to be 0 if no boxes (i.e. boxes with a degenerate shape) were passed and we use reduction="mean"? But if we set reduction="none", shouldn't the loss be empty?
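For reference, the reduction semantics being questioned can be sketched in NumPy (standing in for torch; the variable names are illustrative): with an empty per-element loss, "none" stays empty, "sum" is 0.0, and a naive "mean" is 0/0, i.e. NaN — which is why empty inputs may need special-casing.

```python
import numpy as np

# With no boxes, the per-element loss has zero entries.
per_element_loss = np.zeros((0,), dtype=np.float32)

none_result = per_element_loss        # reduction="none": empty array, shape (0,)
sum_result = per_element_loss.sum()   # reduction="sum": 0.0
mean_result = per_element_loss.mean() # naive reduction="mean": 0/0 -> nan
                                      # (NumPy emits a RuntimeWarning here)
```

So asserting that the mean-reduced loss equals 0.0 for empty inputs only makes sense if the op special-cases the empty batch rather than dividing by zero.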
I will get back to this on Tuesday. I'm shifting homes this week 🏡.
I think this weekend might give me some time. :(
@oke-aditya No worries. We thankfully managed to merge the PR that cleaned up the code prior to the release. This is less urgent.
@oke-aditya Do you need help finishing this? I have some dedicated time this afternoon (UTC+2), so I can review the code and/or help finish it. Let me know. 👌
Hey, I'm back on this. I was about to give this some time today. I have finally settled into my new (or perhaps old?) home town 😄 Feel free to help with reviewing, as Philip is on PTO.
I will. Good luck with what remains. 👍
Hi @NicolasHug, @datumbox, @YosuaMichael, could someone please review the PR, as Philip is having a well-deserved PTO :)
Thanks a lot @oke-aditya! I had a quick look and it looks like an improvement. I have only a few nits.
Also, splitting the tests into separate files is definitely worth it IMO, but I would expect the box-op tests to all be together, the losses all together, etc. Definitely not a blocking issue for me, and we can review the structure in a future PR.
test/test_ops.py
Outdated
    dtype=torch.float16,
)
expected = torch.tensor([605113.875, 600495.1875, 592247.25], dtype=torch.float32)
self.area_check(box_tensor, expected)
As @NicolasHug highlighted above, we are trying to check float16 values, which are capped at about 65k, against expected values of around 600k. This shouldn't work. We should revise the test values to avoid overflows.
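The overflow is easy to demonstrate in NumPy (a standalone sketch; torchvision's own code is not involved):

```python
import numpy as np

# float16 tops out at 65504, so expected areas around 600k cannot be
# represented and collapse to inf.
print(np.finfo(np.float16).max)               # 65504.0
print(np.float16(605113.875))                 # inf

# The overflow can also happen mid-computation: the product of two
# in-range float16 values can itself exceed the float16 range.
print(np.float16(800.0) * np.float16(800.0))  # inf (640000 overflows)
```

Keeping both the box coordinates and their products comfortably below 65504 is what the revised test values need to guarantee.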
Yep, I understand now. Let me revise the tests.
test/test_ops.py
Outdated
    dtype=torch.float16,
)
expected = torch.tensor([321.1562, 368.4375, 1852.2500], dtype=torch.float16)
self.area_check(box_tensor, expected, atol=0.3)
Well, sadly, for small boxes with both tensors kept as float16, we need to loosen atol. I can't find another workaround.
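One way to see why float16 forces a looser atol: the spacing between adjacent representable float16 values (the ulp) grows with magnitude, so near the expected areas above the resolution is already coarse. A NumPy sketch:

```python
import numpy as np

# np.spacing gives the gap to the next representable value at that magnitude.
print(np.spacing(np.float16(321.0)))   # 0.25
print(np.spacing(np.float16(1852.0)))  # 1.0
```

Near 1852 a single rounding step is already 1.0 wide, so a float16 result can legitimately land up to half a ulp away from the true value; a tolerance much tighter than that spacing would make the test flaky.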
It's hard to decipher the diff. What test does this correspond to in main? It looks like both the input and the expected values were changed. Is this on purpose?
It is on purpose, so that the box_area result also stays within the float16 input format after multiplication.
Important: atol=0.3 is extremely large and might prevent us from detecting problems. Is that the minimum you can do? If so, it's worth reviewing the box values so you can use a smaller tolerance like 0.001, which I believe was used previously.
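For context, assert_close follows the standard allclose rule: a result passes when |actual − expected| ≤ atol + rtol·|expected|. A small pure-Python sketch of what atol=0.3 lets through (the concrete numbers are illustrative):

```python
# The standard closeness rule used by torch.testing.assert_close and
# numpy.allclose alike.
def within_tol(actual, expected, rtol=0.0, atol=0.0):
    return abs(actual - expected) <= atol + rtol * abs(expected)

# With atol=0.3, an absolute error of ~0.26 goes undetected...
print(within_tol(368.70, 368.4375, atol=0.3))    # True
# ...whereas the previous atol=0.001 would have caught it.
print(within_tol(368.70, 368.4375, atol=0.001))  # False
```

This is why a large atol is risky: it hides genuine regressions whose magnitude is below the tolerance, independently of rtol.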
@oke-aditya Any more help/review needed here? I have some time this Friday for OSS, so let me know. Thanks. 👌
Seems we are very close to finishing this. Sadly I'm busy at the office and sunsetting my old laptop to move onto a new one (it's going to take some time). You can help me wrap up this PR by simply sending a PR to my fork of torchvision, on this branch; I can merge that. Then just ping Nicolas / Vasilis if anything else is needed.
Alright @oke-aditya, I will do this on Friday and let you know. Good luck resetting your computer (I hope you have some of your dotfiles saved somewhere. 😄)
Yeah, the challenge is that the new laptop has a GPU, and I never had a GPU locally before. So setting that up with NVIDIA drivers, building torchvision from source, etc. is going to be a cool challenge. (Yes, until now I did all my contributions and learned deep learning without a GPU :) Colab was my messiah.)
@oke-aditya compiling from source with a GPU should be seamless. If it's not, we will take your PR, thank you very much. 😄
Yeah, hopefully :) Doing it for the first time. All should go well 😊
@oke-aditya After having a quick look, I'm not sure what I can do here. If you still need help, please point to a few things and I will be happy to help, of course (this Saturday I will have some time). Have a good weekend in advance. 👋
Major blockers are cleared! I could successfully build PyTorch from source, which is sadly needed, as torch binaries with CUDA 11.7 aren't available in the nightlies yet :( I should wrap up the PR by Wednesday 🤠
@NicolasHug how do we go about this? Is it good to go?
@oke-aditya seems there is a related issue with the linter:
I can fix that 😃. Anything else apart from that?
@oke-aditya I've left 4 comments below, but you need to take action only for the tolerance value (+ the linter).
Overall I'm tempted to merge this because:
- It's an improvement compared to the previous tests, which had various issues.
- It removes 200 lines of code from the tests.
Your PR remained open for so long because it appears very lengthy and touches multiple lines. In practice you only make minor changes to the main parts of the classes, but GitHub fails to see that. There are other improvements I felt could have been made in this PR, but it's already too long and they can be handled in a follow-up. To speed up reviews, I think it's worth splitting the work into chunks and actively trying to structure the code in a way that leads to a more compact GitHub diff (though I know this isn't always possible).
After you make the changes on the linter and the tolerance value, I intend to merge unless someone else has a strong opinion against it. cc @NicolasHug @pmeier
assert_iou_loss(ops.distance_box_iou_loss, box1, box1, 0.0, device=device)
assert_iou_loss(ops.distance_box_iou_loss, box1, box2, 0.8125, device=device)
assert_iou_loss(ops.distance_box_iou_loss, box1, box3, 1.1923, device=device)
assert_iou_loss(ops.distance_box_iou_loss, box1, box4, 1.2500, device=device)
assert_iou_loss(ops.distance_box_iou_loss, box1s, box2s, 1.2250, device=device, reduction="mean")
assert_iou_loss(ops.distance_box_iou_loss, box1s, box2s, 2.4500, device=device, reduction="sum")
Future improvement: I know these values were also used originally, but I think it's problematic that both the cIoU and dIoU losses are tested against examples that yield identical outputs. I.e., if I send a PR that replaces cIoU with dIoU, how do we catch the issue?
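To illustrate the concern, here is a pure-Python distance-IoU loss (1 − IoU + d²/c², the standard dIoU definition). The boxes are hypothetical: assuming box1 = [-1, -1, 1, 1] and box2 = [0, 0, 1, 1] reproduces the 0.8125 value above. cIoU adds an aspect-ratio penalty on top of dIoU, so test pairs whose aspect ratios differ would make the two losses diverge and catch a swap:

```python
# Sketch of the distance-IoU loss for axis-aligned (x1, y1, x2, y2) boxes.
def diou_loss(b1, b2):
    (x1, y1, x2, y2), (u1, v1, u2, v2) = b1, b2
    # Intersection-over-union of the two boxes.
    inter_w = max(0.0, min(x2, u2) - max(x1, u1))
    inter_h = max(0.0, min(y2, v2) - max(y1, v1))
    inter = inter_w * inter_h
    union = (x2 - x1) * (y2 - y1) + (u2 - u1) * (v2 - v1) - inter
    iou = inter / union
    # Squared distance between the box centres.
    d2 = ((x1 + x2) / 2 - (u1 + u2) / 2) ** 2 + ((y1 + y2) / 2 - (v1 + v2) / 2) ** 2
    # Squared diagonal of the smallest enclosing box.
    c2 = (max(x2, u2) - min(x1, u1)) ** 2 + (max(y2, v2) - min(y1, v1)) ** 2
    return 1.0 - iou + d2 / c2

print(diou_loss([-1, -1, 1, 1], [0, 0, 1, 1]))  # 0.8125
```

For these two boxes the aspect ratios are equal, so cIoU's extra term is zero and cIoU == dIoU, which is exactly why such examples cannot distinguish the two losses.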
@pytest.mark.parametrize("device", cpu_and_gpu())
@pytest.mark.parametrize("dtype", [torch.float32, torch.half])
@pytest.mark.parametrize("seed", [4, 5])
def test_jit(self, alpha, gamma, reduction, device, dtype, seed):
Future improvement: It's kinda problematic that the only loss that checks for JIT scriptability is the focal loss. Why this discrepancy in the testing strategy? I would imagine that a refactoring of the op tests would resolve this.
def test_empty_distance_iou_inputs(dtype, device) -> None:
    box1 = torch.randn([0, 4], dtype=dtype, device=device).requires_grad_()
    box2 = torch.randn([0, 4], dtype=dtype, device=device).requires_grad_()

class TestFocalLoss:
<rant>
I diffed the previous code against the new one. This indicated that you made only minor modifications to the class, which GitHub fails to highlight. This is where splitting the PR into multiple ones comes in handy and can speed up reviews.</rant>
Seems I should learn to use ghstack (https://github.com/ezyang/ghstack). That would make this simple.
So I have fixed the required issues. I see that even more improvements can be made, and in particular we need to address #6317
LGTM, thanks!
Summary:
* Refactor tests
* Remove tol, fix comments
* Add tolerance only where necessary
* Refactor to adapt suggestions
* Refactor and add nits
* Refactor box area
* Refactor to one file
* Adapt almost all except area
* Final update
* Tighten for jit
* Refactor slightly
* Fix tests

Reviewed By: NicolasHug
Differential Revision: D38351754
fbshipit-source-id: 31a7cc6b21a875bab52d5377c52c22c387407206
Co-authored-by: Vasilis Vryniotis <[email protected]>
Part 2. Closes #5976
Second part of refactor. This touches only the tests.
Currently I have created 2 separate files so that the diff is slightly easier to review. I can merge those files back into test_ops.py if needed. 😄
cc @pmeier