distance_box_iou() and complete_box_iou() don't work if both sets don't have the same number of boxes #6317

senarvi · 2022-07-26T12:33:07Z

🐛 Describe the bug

*box_iou() functions should return a matrix of results for every possible pair (box1, box2), where box1 is a box from boxes1 and box2 is a box from boxes2. box_iou() and generalized_box_iou() work this way, i.e. if boxes1 is an Nx4 matrix and boxes2 is an Mx4 matrix, the result is an NxM matrix. The recently added distance_box_iou() and complete_box_iou() don't work if there aren't as many boxes in boxes1 and boxes2.

import torch
from torchvision.ops import box_iou, generalized_box_iou, distance_box_iou, complete_box_iou

N = 5
M = 6
boxes1 = torch.rand((N, 4))
boxes2 = torch.rand((M, 4))
print(box_iou(boxes1, boxes2).shape)  # torch.Size([5, 6])
print(generalized_box_iou(boxes1, boxes2).shape)  # torch.Size([5, 6])
print(distance_box_iou(boxes1, boxes2).shape)  # RuntimeError
print(complete_box_iou(boxes1, boxes2).shape)  # RuntimeError

When running the above code, distance_box_iou() and complete_box_iou() will fail with a RuntimeError. The output is below:

torch.Size([5, 6])
torch.Size([5, 6])
Traceback (most recent call last):
  File ".../test.py", line 10, in <module>
    print(distance_box_iou(boxes1, boxes2).shape)  # RuntimeError
  File ".../lib/python3.9/site-packages/torchvision/ops/boxes.py", line 361, in distance_box_iou
    diou, _ = _box_diou_iou(boxes1, boxes2)
  File ".../lib/python3.9/site-packages/torchvision/ops/boxes.py", line 378, in _box_diou_iou
    centers_distance_squared = (_upcast(x_p - x_g) ** 2) + (_upcast(y_p - y_g) ** 2)
RuntimeError: The size of tensor a (5) must match the size of tensor b (6) at non-singleton dimension 0

This is not caught by the unit tests, because there's no such case where there's a different number of boxes in the two sets.

The problem is in _box_diou_iou(). It looks like iou and diagonal_distance_squared are calculated for every possible pair (by adding an empty dimension), but centers_distance_squared is not.

As a side note, I personally feel it's confusing that these functions produce the output for every possible pair. By convention, PyTorch functions produce element-wise results. For example, torch.add(boxes1, boxes2) only works if boxes1 and boxes2 contain the same number of boxes. If you want a pair-wise addition, you can easily call torch.add(boxes1[:, None, :], boxes2). The fact that *box_iou() functions produce pair-wise results makes the implementation complicated. And the only way to get element-wise results is calling box_iou(boxes1, boxes2).diagonal(), which is inefficient.

Versions

PyTorch version: 1.12.0+cu113
Is debug build: False
CUDA used to build PyTorch: 11.3
ROCM used to build PyTorch: N/A

OS: Ubuntu 20.04.4 LTS (x86_64)
GCC version: (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
Clang version: Could not collect
CMake version: version 3.16.3
Libc version: glibc-2.31

Python version: 3.9.12 (main, Apr 5 2022, 06:56:58) [GCC 7.5.0] (64-bit runtime)
Python platform: Linux-5.10.102.1-microsoft-standard-WSL2-x86_64-with-glibc2.31
Is CUDA available: True
CUDA runtime version: Could not collect
GPU models and configuration: GPU 0: NVIDIA GeForce GTX 1650 Ti with Max-Q Design
Nvidia driver version: 516.59
cuDNN version: Could not collect
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

Versions of relevant libraries:
[pip3] mypy==0.950
[pip3] mypy-extensions==0.4.3
[pip3] numpy==1.22.3
[pip3] pytorch-lightning==1.6.5
[pip3] pytorch-lightning-bolts==0.2.5
[pip3] pytorch-quantization==2.1.2
[pip3] torch==1.12.0+cu113
[pip3] torchmetrics==0.6.0
[pip3] torchtext==0.12.0
[pip3] torchvision==0.13.0+cu113
[conda] Could not collect

The text was updated successfully, but these errors were encountered:

senarvi · 2022-07-26T12:58:34Z

I would think that the fix is:

centers_distance_squared = (_upcast(x_p[:, None] - x_g) ** 2) + (_upcast(y_p[:, None] - y_g) ** 2)

senarvi · 2022-07-26T14:38:24Z

There's also another point in complete_box_iou() with a similar bug. I think it can be fixed like this:

aspect_gt = torch.atan(w_gt / h_gt)
aspect_pred = torch.atan(w_pred / h_pred)
v = (4 / (torch.pi**2)) * torch.pow((aspect_gt - aspect_pred[:, None]), 2)

oke-aditya · 2022-07-26T17:27:56Z

Hi @senarvi nice to see you here.
Hmm agreed, it might have been overlooked with implementing and testing. cc @datumbox should we fix this up? I can go ahead with it.

As a side note, I personally feel it's confusing that these functions produce the output for every possible pair. By convention, PyTorch functions produce element-wise results. For example, torch.add(boxes1, boxes2) only works if boxes1 and boxes2 contain the same number of boxes. If you want a pair-wise addition, you can easily call torch.add(boxes1[:, None, :], boxes2). The fact that *box_iou() functions produce pair-wise results makes the implementation complicated.

The reason why these are implemented to work as is because most detection operators need to calculate for all set of boxes. And this is what other repos used (E.g. DETR, etc)

And the only way to get element-wise results is calling box_iou(boxes1, boxes2).diagonal(), which is inefficient.

The case where we need only the diagonal elements is for calculation of losses. (Are you trying to do loss = 1 - box_iou.diagonal()). We do have an optimized version for calculating the loss. See
https://github.com/pytorch/vision/blob/main/torchvision/ops/_utils.py#L87

abhi-glitchhg · 2022-07-26T17:29:08Z

Editing this comment:

Thanks, @oke-aditya, for pointing out. The implementation is correct only. I used randn to generate the boxes and hence my interpretation was wrong. Sorry for the misleads.

Will have to tackle the case where M != N

oke-aditya · 2022-07-26T18:02:37Z

Edit sorry for that, I completely missed that using rand will give you degenerate boxes and calculating IoU for it will make no sense.

@abhi-glitchhg the implementation seems to be correct. We have just missed handling M != N

That can be handled.

>>> import torch, torchvision
>>> boxes1 = torch.tensor([[285.25, 185.625, 1194.0, 851.5], [285.25, 188.75, 1192.0, 851.0], [279.25, 198.0, 1189.0, 849.0]], dtype=torch.float)
>>> boxes2 = torch.tensor([[285.25, 185.625, 1194.0, 851.5], [285.25, 188.75, 1192.0, 851.0], [279.25, 198.0, 1189.0, 849.0]], dtype=torch.float)
>>> c_iou1 = 1 - torchvision.ops.complete_box_iou(boxes1, boxes2)
>>> c_iou2 = torchvision.ops.complete_box_iou_loss(boxes1, boxes2)
>>> c_iou1
tensor([[0.0000, 0.0076, 0.0340],
        [0.0076, 0.0000, 0.0266],
        [0.0340, 0.0266, 0.0000]])
>>> c_iou2
tensor([0., 0., 0.])
>>> boxes2 = torch.tensor([[285.3538, 185.5758, 1193.5110, 851.4551],[285.1472, 188.7374, 1192.4984, 851.0669], [279.2440, 197.9812, 1189.4746, 849.2019]] , dtype=torch.float)
>>> c_iou1 = 1 - torchvision.ops.complete_box_iou(boxes1, boxes2)
>>> c_iou2 = torchvision.ops.complete_box_iou_loss(boxes1, boxes2)
>>> c_iou1
tensor([[0.0008, 0.0071, 0.0331],
        [0.0072, 0.0008, 0.0257],
        [0.0336, 0.0271, 0.0009]])
>>> c_iou2
tensor([0.0008, 0.0008, 0.0009])
>>>

datumbox · 2022-07-27T09:05:03Z

@senarvi Thanks for pointing this out. This is a bug.

To ensure BC and alignment with previous *_box_iou() methods, we needed to maintain the NxM feature. Note that the equivalent loss methods should be doing element-wise calculations to be efficient, so any bug fix shouldn't be making the losses to estimate extra unnecessary info. This is something that was discussed on the original PRs.

@oke-aditya @abhi-glitchhg @yassineAlouini Anyone interested in patching this?

oke-aditya · 2022-07-27T09:12:52Z

I will patch this 😄

datumbox · 2022-07-27T09:24:58Z

@oke-aditya Thanks! Please send a PR and update the tests accordingly to capture these issues going forwards.

senarvi · 2022-07-27T11:22:45Z

The case where we need only the diagonal elements is for calculation of losses. (Are you trying to do loss = 1 - box_iou.diagonal()). We do have an optimized version for calculating the loss. See https://github.com/pytorch/vision/blob/main/torchvision/ops/_utils.py#L87

Exactly. Thanks @oke-aditya , I hadn't noticed the loss functions.

I'm curious if there are other differences, apart from the fact that _loss_inter_union() produces element-wise results. I can see two differences:

_box_inter_union() upcasts the box coordinates before calculating the box and intersection areas. _loss_inter_union() does not. Is this intentional?
_box_inter_union() uses clamp(min=0) to make sure that the intersection is zero if the boxes don't intersect. _loss_inter_union() explicitly checks that the width and the height are positive. Does this make _loss_inter_union() more stable? (I don't see how.)

oke-aditya · 2022-07-27T13:11:57Z

_box_inter_union() upcasts the box coordinates before calculating the box and intersection areas. _loss_inter_union() does not. Is this intentional?

We do upcast in losses. But we upcast non float as we don't currently support int dtype for losses. This is intentional.

See

vision/torchvision/ops/ciou_loss.py

Line 48 in ec0e9e1

boxes1 = _upcast_non_float(boxes1)

and

vision/torchvision/ops/diou_loss.py

Line 48 in ec0e9e1

boxes1 = _upcast_non_float(boxes1)

_box_inter_union() uses clamp(min=0) to make sure that the intersection is zero if the boxes don't intersect. _loss_inter_union() explicitly checks that the width and the height are positive. Does this make _loss_inter_union() more stable? (I don't see how.)

Both are _ functions and aren't exposed as such for use, so we don't guarantee BC in either.
All the added losses and ops are stable. (I hope I'm right @datumbox ?)

I also think that the loss shouldn't be zero if the boxes don't intersect. I think that's correct as a loss function should help to find the intersection. While Plain IoU (box_iou) should of course be 0 as boxes don't intersect.

senarvi · 2022-07-27T14:14:37Z

Got it. You upcast them already earlier.

I was just curious if there were any functional differences, because there are many differences in how the code is written, but apparently not. (I did spot one additional eps in _diou_iou_loss()).

Thanks for the clarification!

oke-aditya · 2022-07-27T14:53:51Z

Yeah for all the losses we support the eps parameter of course to avoid zero division error.

yassineAlouini · 2022-07-27T15:25:47Z

@oke-aditya happy to review once the code is ready. 👌
Thanks in adavance for the fix.

vadimkantorov · 2022-07-31T11:59:08Z

Yeah for all the losses we support the eps parameter of course to avoid zero division error.

Or could alternatively set that in this case iou is assumed 0 (which in practice makes sense for 0-area boxes)

alexandre-dang · 2022-08-03T15:42:17Z

Also I don't know if it's intended but distance_box_iou can return negative numbers, in the case where 2 boxes doesn't touch at all. I know it's in the formula but at the same time it breaks a lot of things and I wonder if it should be caped at 0

datumbox · 2022-08-03T15:50:34Z

@alexandre-dang Given it's on the formula our implementation needs to produce the right values. For specific use-cases where the user needs it strictly positive, they can do it outside of the method.

oke-aditya · 2022-08-03T16:12:05Z

AFAIK it's intended. That's the rationale behind generalized iou and other other iou methods such as distance and complete.

Giving negative values gives a better feedback to neural network (when used as loss) and as a metric is more relevant.

That's the advantage over vanilla box iou.

oke-aditya mentioned this issue Jul 27, 2022

Refactor tests for ops #6027

Merged

datumbox added the bug label Jul 27, 2022

oke-aditya mentioned this issue Jul 31, 2022

Fix d/c IoU for different batch sizes #6338

Merged

datumbox closed this as completed in #6338 Aug 2, 2022

senarvi mentioned this issue Aug 19, 2022

Add features to the YOLO model from the latest YOLO variants Lightning-Universe/lightning-bolts#817

Merged

8 tasks

distance_box_iou() and complete_box_iou() don't work if both sets don't have the same number of boxes #6317

distance_box_iou() and complete_box_iou() don't work if both sets don't have the same number of boxes #6317

Comments

senarvi commented Jul 26, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

🐛 Describe the bug

Versions

senarvi commented Jul 26, 2022

Uh oh!

senarvi commented Jul 26, 2022

Uh oh!

oke-aditya commented Jul 26, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

abhi-glitchhg commented Jul 26, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

oke-aditya commented Jul 26, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

datumbox commented Jul 27, 2022

Uh oh!

oke-aditya commented Jul 27, 2022

Uh oh!

datumbox commented Jul 27, 2022

Uh oh!

senarvi commented Jul 27, 2022

Uh oh!

oke-aditya commented Jul 27, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

senarvi commented Jul 27, 2022

Uh oh!

oke-aditya commented Jul 27, 2022

Uh oh!

yassineAlouini commented Jul 27, 2022

Uh oh!

vadimkantorov commented Jul 31, 2022

Uh oh!

alexandre-dang commented Aug 3, 2022

Uh oh!

datumbox commented Aug 3, 2022

Uh oh!

oke-aditya commented Aug 3, 2022

Uh oh!

senarvi commented Jul 26, 2022 •

edited

Loading

oke-aditya commented Jul 26, 2022 •

edited

Loading

abhi-glitchhg commented Jul 26, 2022 •

edited

Loading

oke-aditya commented Jul 26, 2022 •

edited

Loading

oke-aditya commented Jul 27, 2022 •

edited

Loading