-
Notifications
You must be signed in to change notification settings - Fork 7.1k
Remove non-functional Transforms from presets #4952
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
💊 CI failures summary and remediationsAs of commit 322cf9e (more details on the Dr. CI page): ✅ None of the CI failures appear to be your fault 💚
🚧 1 fixed upstream failure:These were probably caused by upstream breakages that were already fixed.
Please rebase on the
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Few clarifications to assist review:
@@ -62,7 +62,7 @@ def _video_resnet( | |||
class R3D_18Weights(Weights): | |||
Kinetics400_RefV1 = WeightEntry( | |||
url="https://download.pytorch.org/models/r3d_18-b3b3357e.pth", | |||
transforms=partial(Kinect400Eval, resize_size=(128, 171), crop_size=(112, 112)), | |||
transforms=partial(Kinect400Eval, crop_size=(112, 112), resize_size=(128, 171)), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changing order to match the other presets.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is just for appearance, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, just for styling. :)
if not isinstance(img, Tensor): | ||
img = F.pil_to_tensor(img) | ||
img = F.convert_image_dtype(img, torch.float) | ||
return self._normalize(img) | ||
img = F.normalize(img, mean=self._mean, std=self._std) | ||
return img |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
replacing with the functional equivalents.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's equivalent. Confirmed by checking the accuracy of models before and after.
vid = F.resize(vid, self._size, interpolation=self._interpolation) | ||
vid = F.center_crop(vid, self._crop_size) | ||
vid = F.convert_image_dtype(vid, torch.float) | ||
vid = F.normalize(vid, mean=self._mean, std=self._std) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I reordered the operations. This is more "usual/canonical" order of ops. I'm running tests to confirm the accuracy remains the same.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The accuracy before and after the change remains the same. The above change is safe.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two questions inline, otherwise LGTM. Thanks Vasilis!
@@ -62,7 +62,7 @@ def _video_resnet( | |||
class R3D_18Weights(Weights): | |||
Kinetics400_RefV1 = WeightEntry( | |||
url="https://download.pytorch.org/models/r3d_18-b3b3357e.pth", | |||
transforms=partial(Kinect400Eval, resize_size=(128, 171), crop_size=(112, 112)), | |||
transforms=partial(Kinect400Eval, crop_size=(112, 112), resize_size=(128, 171)), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is just for appearance, right?
crop_size: Tuple[int, int], | ||
resize_size: Tuple[int, int], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unrelated to this PR, but why do we use Tuple[int, int]
here for the sizes, but plain int
's for ImageNetEval
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a good point. This is how the original recipes have been implemented, so I need to do this here too. If you see the resize operates differently if you specify both dimensions vs only 1. Prior merging we should consider adding unions etc here and cleaning up further.
Reviewed By: NicolasHug Differential Revision: D32694317 fbshipit-source-id: d4a1fe20d54ef4edad46c3dbc4daff2f229bcb10
Fixes partially #4652
cc @datumbox @vfdev-5 @bjuncek