distance_box_iou() and complete_box_iou() don't work if both sets don't have the same number of boxes #6317
Comments
I would think that the fix is:
centers_distance_squared = (_upcast(x_p[:, None] - x_g) ** 2) + (_upcast(y_p[:, None] - y_g) ** 2) |
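To illustrate why the extra axis matters, here is a minimal sketch of the broadcasting involved. The centre coordinates are made-up values (3 predicted boxes, 2 ground-truth boxes), and the `_upcast` helper from the suggestion above is left out because it only changes the dtype, not the shape.

```python
import torch

# Hypothetical centre coordinates: 3 predicted boxes, 2 ground-truth boxes.
x_p = torch.tensor([0.5, 2.0, 4.0])
y_p = torch.tensor([0.5, 2.0, 1.0])
x_g = torch.tensor([1.0, 3.0])
y_g = torch.tensor([1.0, 3.0])

# Without the extra axis, shapes (3,) and (2,) cannot be broadcast together.
try:
    (x_p - x_g) ** 2 + (y_p - y_g) ** 2
    elementwise_failed = False
except RuntimeError:
    elementwise_failed = True

# With the extra axis, (3, 1) - (2,) broadcasts to (3, 2):
# one squared distance for every (predicted, ground-truth) pair.
centers_distance_squared = (x_p[:, None] - x_g) ** 2 + (y_p[:, None] - y_g) ** 2

print(elementwise_failed)
print(centers_distance_squared.shape)
```

Entry `[i, j]` of the result is the squared centre distance between predicted box `i` and ground-truth box `j`, which matches the layout of the pair-wise `iou` matrix.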
There's also another point in aspect_gt = torch.atan(w_gt / h_gt)
aspect_pred = torch.atan(w_pred / h_pred)
v = (4 / (torch.pi**2)) * torch.pow((aspect_gt - aspect_pred[:, None]), 2) |
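The same broadcasting idea applied to the aspect-ratio term can be sketched as follows. The widths and heights are illustrative values only, and `math.pi` is used here in place of `torch.pi` simply to keep the sketch independent of the PyTorch version.

```python
import math
import torch

# Hypothetical widths and heights: 3 predicted boxes, 2 ground-truth boxes.
w_pred = torch.tensor([1.0, 2.0, 4.0])
h_pred = torch.tensor([1.0, 1.0, 2.0])
w_gt = torch.tensor([1.0, 3.0])
h_gt = torch.tensor([1.0, 1.0])

aspect_pred = torch.atan(w_pred / h_pred)  # shape (3,)
aspect_gt = torch.atan(w_gt / h_gt)        # shape (2,)

# (2,) - (3, 1) broadcasts to (3, 2): one value per (predicted, ground-truth) pair.
v = (4 / math.pi**2) * torch.pow(aspect_gt - aspect_pred[:, None], 2)

print(v.shape)
```

Pairs with identical aspect ratios, such as the first predicted box and the first ground-truth box here, get `v == 0`.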
Hi @senarvi nice to see you here.
These functions are implemented pair-wise because most detection operators need the result for every pair of boxes. This is also what other repos do (e.g. DETR).
Only the diagonal elements are needed when calculating losses. (Are you trying to do |
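The relationship between the pair-wise matrix and the element-wise values used by losses can be sketched as below. `pairwise_iou` and `elementwise_iou` are hypothetical helpers written for this illustration, not torchvision functions, and the boxes are made-up values in (x1, y1, x2, y2) format.

```python
import torch


def box_area(boxes):
    """Area of each (x1, y1, x2, y2) box."""
    return (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])


def pairwise_iou(boxes1, boxes2):
    """IoU for every (box1, box2) pair: (N, 4) and (M, 4) -> (N, M)."""
    lt = torch.max(boxes1[:, None, :2], boxes2[:, :2])  # (N, M, 2)
    rb = torch.min(boxes1[:, None, 2:], boxes2[:, 2:])  # (N, M, 2)
    wh = (rb - lt).clamp(min=0)
    inter = wh[..., 0] * wh[..., 1]
    return inter / (box_area(boxes1)[:, None] + box_area(boxes2) - inter)


def elementwise_iou(boxes1, boxes2):
    """IoU of boxes1[i] with boxes2[i] only: (N, 4) and (N, 4) -> (N,)."""
    lt = torch.max(boxes1[:, :2], boxes2[:, :2])  # (N, 2)
    rb = torch.min(boxes1[:, 2:], boxes2[:, 2:])  # (N, 2)
    wh = (rb - lt).clamp(min=0)
    inter = wh[:, 0] * wh[:, 1]
    return inter / (box_area(boxes1) + box_area(boxes2) - inter)


boxes1 = torch.tensor([[0.0, 0.0, 2.0, 2.0], [1.0, 1.0, 3.0, 3.0]])
boxes2 = torch.tensor([[1.0, 1.0, 3.0, 3.0], [1.0, 1.0, 3.0, 3.0]])

matrix = pairwise_iou(boxes1, boxes2)     # computes N * M values
diag = matrix.diagonal()                  # keeps only N of them
direct = elementwise_iou(boxes1, boxes2)  # computes N values directly
print(torch.allclose(diag, direct))
```

Both routes give the same N values; the pair-wise route simply does N * M work to get them.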
Editing this comment: Thanks, @oke-aditya, for pointing this out. The implementation is correct. I used randn to generate the boxes, so my interpretation was wrong. Sorry for the misleading comment. We still have to handle the case where M != N |
Edit: sorry for that, I completely missed that using rand will give you degenerate boxes, and calculating IoU for them makes no sense. @abhi-glitchhg the implementation seems to be correct. We have just missed handling M != N. That can be handled.
|
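For context on the degenerate-box point: if all four coordinates are drawn independently, nothing guarantees x2 > x1 or y2 > y1. One way to generate valid test boxes is sketched below; `random_boxes` is a hypothetical helper for illustration, and the `size` parameter and minimum side length of 1 are arbitrary choices.

```python
import torch

torch.manual_seed(0)


def random_boxes(n, size=100.0):
    """Return n boxes as (x1, y1, x2, y2) with x2 > x1 and y2 > y1."""
    xy1 = torch.rand(n, 2) * size       # top-left corners
    wh = torch.rand(n, 2) * size + 1.0  # widths and heights, at least 1
    return torch.cat([xy1, xy1 + wh], dim=1)


# Deliberately use a different number of boxes in each set (N != M).
boxes1 = random_boxes(5)
boxes2 = random_boxes(3)
print(boxes1.shape, boxes2.shape)
```

Drawing the corner and the size separately keeps every box well-formed, so IoU values computed on them are meaningful.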
@senarvi Thanks for pointing this out. This is a bug. To ensure BC and alignment with previous @oke-aditya @abhi-glitchhg @yassineAlouini Anyone interested in patching this? |
I will patch this 😄 |
@oke-aditya Thanks! Please send a PR and update the tests accordingly to capture these issues going forwards. |
Exactly. Thanks @oke-aditya , I hadn't noticed the loss functions. I'm curious if there are other differences, apart from the fact that
|
We do upcast in the losses, but only non-float inputs, as we don't currently support int dtypes for losses. This is intentional. See vision/torchvision/ops/ciou_loss.py Line 48 in ec0e9e1
and vision/torchvision/ops/diou_loss.py Line 48 in ec0e9e1
Both are I also think that the loss shouldn't be zero if the boxes don't intersect. I think that's correct, as a loss function should help to find the intersection, while plain IoU (box_iou) should of course be 0 when the boxes don't intersect. |
Got it. You upcast them already earlier. I was just curious if there were any functional differences, because there are many differences in how the code is written, but apparently not. (I did spot one additional Thanks for the clarification! |
Yeah, all the losses support the eps parameter, of course, to avoid division by zero. |
@oke-aditya happy to review once the code is ready. 👌 |
Or could alternatively set that in this case iou is assumed 0 (which in practice makes sense for 0-area boxes) |
Also I don't know if it's intended, but distance_box_iou can return negative numbers in the case where 2 boxes don't touch at all. I know it's in the formula, but at the same time it breaks a lot of things, and I wonder if it should be capped at 0 |
@alexandre-dang Given it's in the formula, our implementation needs to produce the right values. For specific use cases where the user needs it strictly positive, they can do it outside of the method. |
AFAIK it's intended. That's the rationale behind generalized IoU and other IoU methods such as distance and complete. Giving negative values gives better feedback to the neural network (when used as a loss) and is more relevant as a metric. That's the advantage over vanilla box IoU. |
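A small worked example of the negative values discussed above, computed by hand for one pair of disjoint boxes. It uses the DIoU definition (IoU minus the squared centre distance divided by the squared diagonal of the smallest enclosing box) and omits the eps term; the box coordinates are illustrative, in (x1, y1, x2, y2) format.

```python
import torch

# Two boxes that do not touch at all (illustrative values).
a = torch.tensor([0.0, 0.0, 1.0, 1.0])
b = torch.tensor([2.0, 2.0, 3.0, 3.0])

# Plain IoU: the intersection is empty, so IoU is 0.
inter_wh = (torch.min(a[2:], b[2:]) - torch.max(a[:2], b[:2])).clamp(min=0)
inter = inter_wh.prod()
union = (a[2:] - a[:2]).prod() + (b[2:] - b[:2]).prod() - inter
iou = inter / union

# Squared distance between the box centres.
centers_sq = (((a[:2] + a[2:]) / 2 - (b[:2] + b[2:]) / 2) ** 2).sum()
# Squared diagonal of the smallest box enclosing both.
diag_sq = ((torch.max(a[2:], b[2:]) - torch.min(a[:2], b[:2])) ** 2).sum()

diou = iou - centers_sq / diag_sq

# A caller who needs a non-negative value can clamp outside the function.
clamped = diou.clamp(min=0)
print(iou.item(), diou.item(), clamped.item())
```

Here the squared centre distance is 8 and the squared enclosing diagonal is 18, so the distance IoU is -8/18 even though plain IoU is 0. The negative value still tells the caller how far apart the boxes are, which the clamped value does not.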
🐛 Describe the bug
*box_iou() functions should return a matrix of results for every possible pair (box1, box2), where box1 is a box from boxes1 and box2 is a box from boxes2. box_iou() and generalized_box_iou() work this way, i.e. if boxes1 is an Nx4 matrix and boxes2 is an Mx4 matrix, the result is an NxM matrix. The recently added distance_box_iou() and complete_box_iou() don't work if there aren't as many boxes in boxes1 and boxes2.
When running the above code, distance_box_iou() and complete_box_iou() will fail with a RuntimeError. The output is below:
This is not caught by the unit tests, because there's no such case where there's a different number of boxes in the two sets.
The problem is in _box_diou_iou(). It looks like iou and diagonal_distance_squared are calculated for every possible pair (by adding an empty dimension), but centers_distance_squared is not.
As a side note, I personally feel it's confusing that these functions produce the output for every possible pair. By convention, PyTorch functions produce element-wise results. For example, torch.add(boxes1, boxes2) only works if boxes1 and boxes2 contain the same number of boxes. If you want a pair-wise addition, you can easily call torch.add(boxes1[:, None, :], boxes2). The fact that *box_iou() functions produce pair-wise results makes the implementation complicated. And the only way to get element-wise results is calling box_iou(boxes1, boxes2).diagonal(), which is inefficient.
Versions
PyTorch version: 1.12.0+cu113
Is debug build: False
CUDA used to build PyTorch: 11.3
ROCM used to build PyTorch: N/A
OS: Ubuntu 20.04.4 LTS (x86_64)
GCC version: (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
Clang version: Could not collect
CMake version: version 3.16.3
Libc version: glibc-2.31
Python version: 3.9.12 (main, Apr 5 2022, 06:56:58) [GCC 7.5.0] (64-bit runtime)
Python platform: Linux-5.10.102.1-microsoft-standard-WSL2-x86_64-with-glibc2.31
Is CUDA available: True
CUDA runtime version: Could not collect
GPU models and configuration: GPU 0: NVIDIA GeForce GTX 1650 Ti with Max-Q Design
Nvidia driver version: 516.59
cuDNN version: Could not collect
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True
Versions of relevant libraries:
[pip3] mypy==0.950
[pip3] mypy-extensions==0.4.3
[pip3] numpy==1.22.3
[pip3] pytorch-lightning==1.6.5
[pip3] pytorch-lightning-bolts==0.2.5
[pip3] pytorch-quantization==2.1.2
[pip3] torch==1.12.0+cu113
[pip3] torchmetrics==0.6.0
[pip3] torchtext==0.12.0
[pip3] torchvision==0.13.0+cu113
[conda] Could not collect