
Inconsistent API for tensor and PIL image for CenterCrop #3297


Closed
saurabheights opened this issue Jan 26, 2021 · 15 comments · Fixed by #3333


@saurabheights
Contributor

saurabheights commented Jan 26, 2021

🐛 Bug

The CenterCrop transform crops the center of an image. However, when the crop size is larger than the image size (in any dimension), the results differ between PIL.Image input and tensor input.

For a PIL.Image, the output is padded to the crop size, while for a tensor the result is completely wrong because the crop location computation produces negative position values.

To Reproduce

Steps to reproduce the behavior:

import torch
import torchvision.transforms as transforms
import torchvision.transforms.functional as TF

pil_x = TF.to_pil_image(torch.randn(3, 24, 24))  
transforms.CenterCrop(50)(pil_x).size  # PIL image size is 50x50

tensor_x = torch.randn(3, 24, 24)
transforms.CenterCrop(50)(tensor_x).shape  # Tensor shape is torch.Size([3, 12, 12])
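The bogus 12x12 shape can be explained by the crop offset going negative. A minimal 1-D sketch below assumes the offset formula `int((image_size - crop_size + 1) * 0.5)` (a plausible reconstruction of the tensor code path, not a verbatim copy of torchvision source):

```python
# How a negative crop offset produces the bogus 12x12 output.
image_size, crop_size = 24, 50

# Assumed offset formula (hypothetical reconstruction of the tensor path).
top = int((image_size - crop_size + 1) * 0.5)
print(top)  # -12

# A negative start index wraps around in Python/torch slicing, so the
# intended crop row[top : top + crop_size] becomes row[-12:38] == row[12:24].
row = list(range(image_size))
cropped = row[top : top + crop_size]
print(len(cropped))  # 12 -- matching torch.Size([3, 12, 12]) above
```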

Expected behavior

Mainly, the API should be consistent for both input types, or there should be two separate methods for PIL and tensor inputs (which is IMO less appealing).
Side note: the docs should describe what happens when crop_size is greater than the image size. I believe this is also missing from the documentation of the other crop methods.
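A pad-first behaviour (matching what PIL already does) could look like this 1-D sketch; `consistent_center_crop` is a hypothetical helper for illustration, not torchvision API:

```python
def consistent_center_crop(row, crop_size, fill=0):
    """Center-crop a 1-D sequence, padding with `fill` if it is too short.

    Hypothetical helper illustrating one consistent behaviour; not
    torchvision code.
    """
    n = len(row)
    if crop_size > n:
        # Pad symmetrically up to crop_size, the way PIL's output ends up padded.
        pad_total = crop_size - n
        pad_left = pad_total // 2
        row = [fill] * pad_left + list(row) + [fill] * (pad_total - pad_left)
        n = len(row)
    top = (n - crop_size) // 2
    return row[top : top + crop_size]

print(len(consistent_center_crop(list(range(24)), 50)))  # 50, padded like PIL
print(consistent_center_crop(list(range(10)), 4))        # [3, 4, 5, 6]
```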

Environment

Collecting environment information...
PyTorch version: 1.7.0
Is debug build: True
CUDA used to build PyTorch: 10.2
ROCM used to build PyTorch: N/A

OS: Ubuntu 18.04.5 LTS (x86_64)
GCC version: (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
Clang version: Could not collect
CMake version: version 3.10.2

Python version: 3.8 (64-bit runtime)
Is CUDA available: True
CUDA runtime version: Could not collect
GPU models and configuration: GPU 0: Quadro T2000
Nvidia driver version: 450.102.04
cuDNN version: Could not collect
HIP runtime version: N/A
MIOpen runtime version: N/A

Versions of relevant libraries:
[pip3] numpy==1.19.2
[pip3] torch==1.7.0
[pip3] torchaudio==0.7.0a0+ac17b64
[pip3] torchvision==0.8.1
[conda] blas                      1.0                         mkl  
[conda] cudatoolkit               10.2.89              hfd86e86_1  
[conda] mkl                       2020.2                      256  
[conda] mkl-service               2.3.0            py38he904b0f_0  
[conda] mkl_fft                   1.2.0            py38h23d657b_0  
[conda] mkl_random                1.1.1            py38h0573a6f_0  
[conda] numpy                     1.19.2           py38h54aff64_0  
[conda] numpy-base                1.19.2           py38hfa32c7d_0  
[conda] pytorch                   1.7.0           py3.8_cuda10.2.89_cudnn7.6.5_0    pytorch
[conda] torchaudio                0.7.0                      py38    pytorch
[conda] torchvision               0.8.1                py38_cu102    pytorch

cc @vfdev-5

@voldemortX
Contributor

voldemortX commented Jan 26, 2021

Just when I thought I missed one spot... #3295
I added this in #3224 for now, what do you think?
cc @datumbox

@saurabheights
Contributor Author

saurabheights commented Jan 26, 2021

P.S. Much of the functionality needed here is similar to RandomCrop, which offers flexible padding, so it would be a good idea to consider integrating those options into CenterCrop: apart from the offset computation, the rest of the cropping logic would be the same.

Applying RandomCrop to both a PIL image and a tensor image shows the same content (but with different offsets).

transforms.RandomCrop(50, pad_if_needed=True)(pil_x).show()
transforms.ToPILImage()((transforms.RandomCrop(50, pad_if_needed=True)(tensor_x))).show()

Big HP Fan :D, He-Who-Must-Not-Be-Named.

@saurabheights
Contributor Author

One last thing: I can work on its PR. Let me know if it's alright by you.

@voldemortX
Contributor

One last thing: I can work on its PR. Let me know if it's alright by you.

That sounds great! Let's wait for the green light from a member.

@datumbox
Contributor

@saurabheights Thanks for reporting and for making the issue reproducible. That's definitely a bug. We are happy to accept a PR that solves the problem.

Concerning the proposal of extending the functionality and adjusting the API, I think this should be discussed on a separate issue because we need to assess the backward compatibility.

@vfdev-5 Any thoughts on this?

@vfdev-5
Collaborator

vfdev-5 commented Jan 26, 2021

Fun bug with PIL and tensors. I think it would make sense to raise an error saying that we cannot crop to a larger size.

import torch
import torchvision.transforms as transforms
import torchvision.transforms.functional as TF
import numpy as np


pil_x = TF.to_pil_image(127 * torch.ones(3, 100, 100))  
print(transforms.CenterCrop(150)(pil_x).size)  # PIL image size is 50x50
print(np.asarray(pil_x).shape)

tensor_x = torch.randn(3, 100, 100)
print(transforms.CenterCrop(150)(tensor_x).shape)
> (150, 150)
> (100, 100, 3)
> torch.Size([3, 24, 24])

EDIT: My bad about the Pillow output size being 100x100. It is 150x150 and the output image is padded.

@saurabheights
Contributor Author

saurabheights commented Jan 26, 2021

@vfdev-5 You mentioned:

print(transforms.CenterCrop(150)(pil_x).size) # PIL image size is 50x50

Did you mean "150x150" here?

PIL seems to give the correct output so far, so raising an error would break backward compatibility. Will that be fine?

@vfdev-5
Collaborator

vfdev-5 commented Jan 26, 2021

@vfdev-5 You mentioned:

print(transforms.CenterCrop(150)(pil_x).size) # PIL image size is 50x50

Did you mean "150x150" here?

I mean, something like

# x = PIL_image or tensor
with pytest.raises(ValueError, match=r"Crop size should be smaller than input size"):
    transforms.CenterCrop(150)(x)

PIL seems to give the correct output so far, so raising an error would break backward compatibility. Will that be fine?

No, PIL is also incorrect, as the data is still 100x100. I mean it didn't pad the image.
IMO, there is no BC change in raising an exception for both cases...

@saurabheights
Contributor Author

@vfdev-5 First, I am really sorry if I am mistaken/confused, and I would really appreciate your time and explanation here.

No, PIL is also incorrect, as the data is still 100x100. I mean it didn't pad the image.

The padding is applied to the output, not to the input of the CenterCrop method, which is expected: the input should not be changed. I say "should" after glossing over various transforms, where I didn't see any in-place ops that modify the input.

The first image, I1, shows a sample image from PASCAL and the CenterCrop output with a crop size larger than the image size. The padding is applied to the output.

[Screenshot from 2021-01-26 16-59-01]

The second image, I2, shows that even when the crop size is smaller than the image size, the input image to the CenterCrop method is not modified. I ran this test only to check my assumption that transforms do not change their input.

[Screenshot from 2021-01-26 17-17-54]

@vfdev-5
Collaborator

vfdev-5 commented Jan 26, 2021

First, I am really sorry if I am mistaken/confused, and I would really appreciate your time and explanation here.

@saurabheights No worries, I also appreciate the time you are spending on this issue and checking things! Thanks a lot!

It is a very interesting example, and not what I got in my tests. Let me recheck. Meanwhile, what version of PIL do you have?

EDIT:
Sorry, my bad: I checked the size of the initial image and not the output. Yes, you are right, PIL pads the image if the crop is larger.

Well, then it is a bug to fix for tensor input. We already did something similar for padding, when the padding itself requires a crop:

def _pad_symmetric(img: Tensor, padding: List[int]) -> Tensor:
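For reference, symmetric padding mirrors the array including the edge element; a 1-D sketch of that behaviour (illustrative only, not the torchvision implementation, which builds flipped index tensors):

```python
def pad_symmetric_1d(row, pad_left, pad_right):
    # Symmetric padding mirrors values including the edge element,
    # e.g. [1, 2, 3, 4] padded by 2 on each side ->
    # [2, 1, 1, 2, 3, 4, 4, 3].  (Sketch of the behaviour only.)
    left = row[:pad_left][::-1]
    right = row[len(row) - pad_right:][::-1]
    return left + list(row) + right

print(pad_symmetric_1d([1, 2, 3, 4], 2, 2))  # [2, 1, 1, 2, 3, 4, 4, 3]
```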

@saurabheights
Contributor Author

No worries, and thank you for _pad_symmetric; it would work fine here.

Tested on PIL versions '6.2.1' (found this was too old, so retested on newer versions :D) and '7.2.0'. And yep, there was padding in both.

Regarding the ETA: I will hopefully make the changes in 3-4 days (it's only a few hours of work, but this is my first time working on the vision library, so the setup might delay me a little).

@vfdev-5
Collaborator

vfdev-5 commented Jan 26, 2021

Sounds good! Please take a look at our contributing guide, which may help you with the dev setup etc. A draft PR is also good so that we can iterate on the implementation if needed, and do not hesitate to ask questions here for details.

@saurabheights
Contributor Author

saurabheights commented Jan 31, 2021

I have finished the changes here: master...saurabheights:bugfix/center-cropsize-greater-imgsize

I have a conda env with pytorch-nightly installed, and I built torchvision from source plus my fix.
I have tested the transforms, but when running the full test suite I hit quite a few errors. Pytest summary (added a screenshot as it is much more readable):

[Screenshot from 2021-01-31 01-42-00]

AFAIK, none of the failed tests is related to my changes. Shall I make the PR or fix these issues first?

@datumbox
Contributor

datumbox commented Feb 1, 2021

@saurabheights The errors in your screenshot seem to be related to your development environment, not to your changes. You can try fixing them so that you can contribute more easily in the future, but if you don't have the time to fix this locally, just confirm that the unit tests relevant to your changes pass and send a PR. Our CI will run all the tests anyway, so we will see if anything breaks.

@saurabheights
Contributor Author

Thanks.
I had to install the CUDA libs and provide the CUDA_HOME variable to setup.py. I still had 6 errors, but they were related to the GPU running out of memory on the bigger networks. I will go ahead and submit the PR.

============================================================================================= short test summary info =============================================================================================
FAILED test_models.py::ModelTester::test_fasterrcnn_mobilenet_v3_large_320_fpn_cuda - RuntimeError: CUDA error: CUBLAS_STATUS_EXECUTION_FAILED when calling `cublasGemmEx( handle, opa, opb, m, n, k, &falpha, a...
FAILED test_models.py::ModelTester::test_fasterrcnn_mobilenet_v3_large_fpn_cuda - RuntimeError: CUDA error: CUBLAS_STATUS_EXECUTION_FAILED when calling `cublasGemmEx( handle, opa, opb, m, n, k, &falpha, a, CU...
FAILED test_models.py::ModelTester::test_fasterrcnn_resnet50_fpn_cuda - RuntimeError: CUDA out of memory. Tried to allocate 20.00 MiB (GPU 0; 3.82 GiB total capacity; 2.49 GiB already allocated; 20.19 MiB fre...
FAILED test_models.py::ModelTester::test_fasterrcnn_switch_devices - RuntimeError: CUDA out of memory. Tried to allocate 2.00 MiB (GPU 0; 3.82 GiB total capacity; 2.52 GiB already allocated; 6.12 MiB free; 2....
FAILED test_models.py::ModelTester::test_keypointrcnn_resnet50_fpn_cuda - RuntimeError: CUDA out of memory. Tried to allocate 2.00 MiB (GPU 0; 3.82 GiB total capacity; 2.52 GiB already allocated; 6.12 MiB fre...
FAILED test_models.py::ModelTester::test_maskrcnn_resnet50_fpn_cuda - RuntimeError: CUDA error: CUBLAS_STATUS_EXECUTION_FAILED when calling `cublasGemmEx( handle, opa, opb, m, n, k, &falpha, a, CUDA_R_16F, ld...
============================================================================ 6 failed, 1 passed, 559 deselected, 4 warnings in 13.26s =============================================================================
