Use real weights and images for classification model tests and relax precision requirements for general model tests #7130
base: main
Conversation
Would it make sense to go with a middle ground here? I see we had
Looking at the errors, it seems like
…ion into test/relaxing-precision
…e image_size as before
@NicolasHug @pmeier This PR resolves most of the problems in the model tests. From what I see, the remaining problem is on
I think this is a core issue and I created an issue for it (I can't reproduce this error on the AWS cluster, and setting CUDA_LAUNCH_BLOCKING=1 didn't give a better error trace either). Also note that the test failure on Python 3.7 is not relevant; it seems PyTorch core plans to deprecate Python 3.7 (which should be fixed after #7110).
Thanks Yosua, I left a few comments, LMK what you think
else:
    H, W = input_shape[-2:]
    min_side = min(H, W)
    preprocess = weights.transforms(resize_size=min_side, crop_size=min_side)
We don't need to pass parameters to weights.transforms(); it will handle the size properly.
We need this if we want to control the size used when the test runs; otherwise we rely on the default size from the weight transforms. (For some big models, we want to use a smaller image size in the tests to speed up runtime.)
Note: for test purposes, I think it is okay not to use the preferred image size that yields the best accuracy for the model.
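For reference, a minimal sketch of the two options discussed in this thread, assuming the torchvision multi-weight API; the ResNet-50 weights and the sizes are illustrative, not the PR's actual choices:

```python
from torchvision.models import ResNet50_Weights

weights = ResNet50_Weights.IMAGENET1K_V1

# Default preset: uses the resize/crop sizes the weights were trained with.
preprocess_default = weights.transforms()

# Overriding the sizes, as the diff above does, to shrink the test input.
# weights.transforms is a functools.partial, so call-time keywords override
# the preset's defaults.
preprocess_small = weights.transforms(resize_size=64, crop_size=64)
```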
@@ -51,19 +51,26 @@ def _get_image(input_shape, real_image, device):

    img = Image.open(GRACE_HOPPER)

    original_width, original_height = img.size
    if weights is None:
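A hedged sketch (not the PR's exact code) of the branching this hunk hints at: no real image keeps the old random-tensor behavior, a real image without weights is resized to the requested shape, and real weights route through their size-overridden preset. The preprocessing details here are assumptions for illustration:

```python
import torch
from PIL import Image
from torchvision.transforms import functional as F

def _get_image(input_shape, real_image, device, weights=None):
    if not real_image:
        # Old behavior: a random tensor of the requested shape.
        return torch.rand(input_shape).to(device=device)

    img = Image.open(GRACE_HOPPER)  # the Grace Hopper test asset
    H, W = input_shape[-2:]

    if weights is None:
        # Real image, random weights: just match the requested H and W.
        img = img.resize((W, H))
        return F.pil_to_tensor(img).float().div(255).unsqueeze(0).to(device)

    # Real weights: reuse their preset, overriding sizes as in the diff above.
    min_side = min(H, W)
    preprocess = weights.transforms(resize_size=min_side, crop_size=min_side)
    return preprocess(img).unsqueeze(0).to(device)
```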
should we just pass the weights all the time? What's the reason for having them in only some cases but not all?
In some cases the weights are really restrictive. For instance, if we use vit_h_14, it will only accept an image_size equal to the min_size of the weights: https://github.com/pytorch/vision/blob/main/torchvision/models/vision_transformer.py#L321, and in that case we can't run the test at a lower resolution with the weights.
Also, as of now, we don't use real weights for the detection model tests.
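To illustrate the restriction (a hedged sketch; the SWAG weight name is one example, and the actual size check lives in the linked builder):

```python
from torchvision.models import vit_h_14, ViT_H_14_Weights

weights = ViT_H_14_Weights.IMAGENET1K_SWAG_E2E_V1

# image_size is forced to the weights' min_size metadata, so the builder
# constructs the model at that (large) resolution.
model = vit_h_14(weights=weights)

# Asking for a smaller resolution alongside the real weights fails:
# vit_h_14(weights=weights, image_size=224)  # raises ValueError
```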
> it will only accept an image_size equal to the min_size of the weights: https://github.com/pytorch/vision/blob/main/torchvision/models/vision_transformer.py#L321, and in that case we can't run the test at a lower resolution with the weights

But isn't that a good thing? I.e., if we go below the min_size limit, wouldn't we expect the model to output garbage? And if not, why is the limit not lower?
For test purposes, we might want to use a smaller image even if the output is garbage, because we can still check for consistency (what we did so far with random images and random weights). In that case, if we set weights=None, it will basically behave like before: _get_image will assume the test doesn't use real weights but rather a model initialized with random weights.
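For context, a sketch of what that consistency-style check can look like; the model builder and tolerances are assumptions, not the test suite's exact code:

```python
import torch
from torchvision.models import resnet18  # stand-in for any tested model

torch.manual_seed(0)                   # deterministic "random" weights
model = resnet18(weights=None).eval()

x = torch.rand(1, 3, 64, 64)           # small input; output quality irrelevant
with torch.no_grad():
    out = model(x)

# The test only asserts that the output matches a previously recorded value:
# torch.testing.assert_close(out, expected_out, rtol=1e-3, atol=1e-3)
```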
@@ -364,7 +376,8 @@ def _check_input_backprop(model, inputs):
    "s3d": {
        "input_shape": (1, 3, 16, 224, 224),
    },
    "googlenet": {"init_weights": True},
    "regnet_y_128gf": {"weight_name": "IMAGENET1K_SWAG_LINEAR_V1"},
Could we just get the actual weights from the model name, using the helpers from https://pytorch.org/vision/main/models.html#model-registration-mechanism?
We can; I actually use the helper to get the actual weights here. I think I prefer this design, where we don't need to specify the weight_enum for the weight_name (since it can be retrieved from the model_name). Also, it is easier to say that the default value we use is IMAGENET1K_V1 for the test.
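A sketch of that lookup using the registration helpers; get_model_weights is the documented helper, while resolve_weights is a hypothetical wrapper for illustration:

```python
from torchvision.models import get_model_weights

def resolve_weights(model_name, weight_name="IMAGENET1K_V1"):
    # e.g. "regnet_y_128gf" -> RegNet_Y_128GF_Weights
    weight_enum = get_model_weights(model_name)
    return getattr(weight_enum, weight_name)

weights = resolve_weights("regnet_y_128gf", "IMAGENET1K_SWAG_LINEAR_V1")
```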
…pected to num_classes_to_check
…ion into test/relaxing-precision
From investigation (see #7114 (comment)), it seems that our model tests are sensitive to machine type. After testing on an AWS cluster, the tests appear to have started failing since PR #6380 on the AWS cluster machines. To fix this, we should revert that PR and relax the precision criteria.
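For illustration, relaxing the precision criteria typically means widening the tolerances when comparing model outputs against stored expected values; a hedged sketch (the PR's actual tolerance values are not shown here):

```python
import torch

def assert_close_relaxed(actual, expected, prec=1e-2):
    # Looser rtol/atol than the defaults, to absorb machine-type variance.
    torch.testing.assert_close(actual, expected, rtol=prec, atol=prec)
```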
Update:
In order to make the model tests green, we do the following in this PR:
cc @pmeier