Making transformation an optional parameter in FasterRCNN #2263

Open
Sentient07 opened this issue May 27, 2020 · 9 comments
Sentient07 commented May 27, 2020

Hello,

🚀 Feature

I think it would be more generic to expose the transform,

transform = GeneralizedRCNNTransform(min_size, max_size, image_mean, image_std)

as a parameter that users can override, rather than a hard-coded default.

Motivation

I am applying transformations separately as part of data augmentation, which includes cropping and resizing. Hence I would prefer not to do this twice while retraining FasterRCNN.

Pitch

I would like to feed a fixed-size input into the network for variable-sized images. At present, I do this by resizing the images separately in the DataLoader and adjusting the parameters of GeneralizedRCNNTransform accordingly.

Alternatives

My present way of using FasterRCNN is an alternative. Since my set of transformations is pre-defined, I have to apply hacks such as setting the mean to 0., the std to 1., and altering the min and max sizes to my fixed value (this means that scale = 1 and interpolation returns the same image).

Additional context

While the input to the network is a fixed size, I apply many other augmentations such as mirroring and random cropping, inspired by SSD-based networks. Hence I would prefer to do all augmentation once, in a separate place, instead of twice.

Thank you!

Edit : If you think this would be a meaningful change, I will be happy to send a Pull Request.

Sentient07 (author) commented
Furthermore, this transformation class does not handle cases where the targets are empty, leading to this error:

loss_dict = model(images, targets)
  File "/home/envs/tf13/lib/python3.7/site-packages/torch/nn/modules/module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/envs/tf13/lib/python3.7/site-packages/torchvision/models/detection/generalized_rcnn.py", line 47, in forward
    images, targets = self.transform(images, targets)
  File "/home/envs/tf13/lib/python3.7/site-packages/torch/nn/modules/module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/envs/tf13/lib/python3.7/site-packages/torchvision/models/detection/transform.py", line 41, in forward
    image, target = self.resize(image, target)
  File "/home/envs/tf13/lib/python3.7/site-packages/torchvision/models/detection/transform.py", line 76, in resize
    bbox = resize_boxes(bbox, (h, w), image.shape[-2:])
  File "/home/envs/tf13/lib/python3.7/site-packages/torchvision/models/detection/transform.py", line 137, in resize_boxes
    xmin, ymin, xmax, ymax = boxes.unbind(1)
IndexError: Dimension out of range (expected to be in range of [-1, 0], but got 1)
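For reference, a common workaround (a sketch, not the upstream fix) is to give images with no objects a boxes tensor of shape (0, 4) rather than an empty 1-D tensor, so that resize_boxes can still unbind along dimension 1:

```python
import torch

# An empty target with the shapes the detection pipeline expects.
empty_target = {
    "boxes": torch.zeros((0, 4), dtype=torch.float32),  # N = 0 boxes, 4 coordinates
    "labels": torch.zeros((0,), dtype=torch.int64),
}

# dim 1 exists even with zero rows, so this no longer raises IndexError.
xmin, ymin, xmax, ymax = empty_target["boxes"].unbind(1)
```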


fmassa commented May 29, 2020

Hi,

Thanks for opening the PR!

Exposing a transform argument to FasterRCNN is a tempting idea, and I think it could be done.

But it would conflict with the following arguments:

min_size=800, max_size=1333,
image_mean=None, image_std=None,

I think it could be fine if we do proper error checking. This means we would need to change the default values to None, and then do

if transform is not None:
    if min_size is not None:
        raise ValueError(...)
    if max_size is not None:
        raise ValueError(...)
else:
    transform = ...

That being said, after the model is instantiated, it is fairly easy to change the transform by doing something like

model = fasterrcnn_resnet50_fpn()
model.transform = MyTransform()

but I'm not against exposing this to the user.


About the error message: I believe this is an issue with the targets being passed to the model, and we added a more informative error message in #2207.

Let us know if this isn't the case; IIRC we are still missing error checks for masks and keypoints.


Sentient07 commented May 31, 2020

Hi Francisco,

Thank you for your response. Why would we need the min_size, max_size, image_mean, image_std parameters separately? Could they not be integrated into GeneralizedRCNNTransform? My idea is for the user to pass something like,

frcnn = FasterRCNN(backbone, transform=GeneralizedRCNNTransform(min_size, max_size, image_mean, image_std))

Edit (continuation):

I think this type of parametrisation would be compatible with torchvision's/Albumentations' transforms as well. Please let me know what you think.

Regards,


fmassa commented Jun 1, 2020

@Sentient07 we already support passing min_size etc. to the model. So what should we do if the user specifies the model as follows:

model = fasterrcnn_resnet50_fpn(transform=MyTransform(), min_size=500)

as you can see, min_size is now unused by the model, and this can lead to subtle errors on the user's side: they will change min_size but won't see any difference in the predictions (because the predictions are always rescaled back to the original image size).


Hi @fmassa ,
I don't quite understand your question:

So what should we do if the user specifies the model as follows:
model = fasterrcnn_resnet50_fpn(transform=MyTransform(), min_size=500)

If they want to specify min_size, shouldn't it go inside MyTransform(), which would take care of their expectations? Since the min_size, max_size, image_mean, image_std parameters aren't used outside of GeneralizedRCNNTransform, I think we can safely remove them. If users want a custom min_size, max_size, image_mean, image_std parametrisation, they can pass transform=GeneralizedRCNNTransform(**custom_vals). I will send a PR soon to elaborate on what I mean.

Thanks!


fmassa commented Jun 5, 2020

@Sentient07 I've answered in the PR, let me know what you think

Zhylkaaa commented Apr 25, 2021

I wonder if this is still of interest to anyone. I can give it a try; should I open a new PR or continue #2288?

qornifmlx commented

I believe yes. Faster R-CNN's job should only be learning and inference, while resizing and the rest should be another object's responsibility. I'm having a hard time training and running inference, since with the current implementation everything feels flimsy, and what if I want to run inference on differently sized images? https://discuss.pytorch.org/t/faster-rcnn-image-size-and-weird-transform-behaviour/121414

Zhylkaaa commented May 17, 2021

OK, so I will try to fix it. For now I am using a messy but working workaround (@qornifmlx, maybe this will help you):

model = FasterRCNN(backbone,
                   rpn_anchor_generator=anchor_generator,
                   box_roi_pool=roi_pooler,
                   box_head=box_head,
                   box_predictor=box_predictor,
                   image_mean=(0,),
                   image_std=(1,),
                   max_size=1000,
                   min_size=220)

# calculate mean and std from data

transform = T.Normalize(image_mean=(mean,), image_std=(std,))
model.transform = transform

Normalize was just lifted from the GeneralizedRCNNTransform class in torchvision/models/detection/transform.
I guess you are already using some kind of preprocessing anyway, so you can leave image_mean=0 and image_std=1,
but something must be assigned to model.transform, and it must have

 def postprocess(self,
                    result,               # type: List[Dict[str, Tensor]]
                    image_shapes,         # type: List[Tuple[int, int]]
                    original_image_sizes  # type: List[Tuple[int, int]]
                    ):

because it is called on line 99 of generalized_rcnn.py.
