New tests for ImageNet dataset #3543
Thanks for the feedback @pmeier, I think the failures are unrelated and this is ready for review.
Three minor comments. Otherwise LGTM!
test/datasets_utils.py
Outdated
special_kwargs["download"] = False
special_kwargs["download"] = None if self.DATASET_CLASS.__name__ == 'ImageNet' else False
Why this change? With download=False the dataset will emit a warning, but should not behave any differently. Given that this warning has been emitted for a long time now, I wonder if we can remove the download flag from ImageNet altogether.
If we really want to avoid it, I suggest we change L316 to honor an explicitly passed download:
special_kwargs.setdefault("download", False)
and override create_dataset in the ImageNetTestCase to always pass download=None:
@contextlib.contextmanager
def create_dataset(self, *args, **kwargs):
    kwargs.setdefault("download", None)
    with super().create_dataset(*args, **kwargs) as (dataset, info):
        yield dataset, info
and overwrite create_dataset in the ImageNetTestCase to always pass download=None
I went for a simpler solution, which is to only override the default if download exists and its default is truthy. This way, when the default is False or None, it doesn't get overridden.
This still avoids hardcoding the 'ImageNet' name, which I believe was the main issue here.
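To make the "only override a truthy default" idea concrete, here is a minimal sketch, not the code that was merged. The helper name default_special_kwargs and the three stand-in dataset classes are hypothetical; only inspect.signature from the standard library is assumed.

```python
import inspect


def default_special_kwargs(dataset_class):
    """Hypothetical helper: force download=False only when the dataset's
    constructor has a `download` parameter whose default is truthy."""
    special_kwargs = {}
    params = inspect.signature(dataset_class.__init__).parameters
    download = params.get("download")
    # A parameter without a default reports inspect.Parameter.empty,
    # which is itself truthy, so it has to be excluded explicitly.
    if (
        download is not None
        and download.default is not inspect.Parameter.empty
        and download.default
    ):
        special_kwargs["download"] = False
    return special_kwargs


# Stand-in classes for illustration only; they are not torchvision datasets.
class DownloadsByDefault:
    def __init__(self, root, download=True):
        self.root = root


class ImageNetLike:
    def __init__(self, root, download=None):
        self.root = root


class NoDownloadFlag:
    def __init__(self, root):
        self.root = root
```

With this shape, a dataset whose download default is None or False keeps its own default, and no dataset name has to be checked.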
test/test_datasets.py
Outdated
root=tmpdir,
name=tmpdir / 'train' / wnid / wnid,
file_name_fn=lambda image_idx: f"{wnid}_{image_idx}.JPEG",
num_examples=1,
Nit: Instead of hard coding it here, maybe set num_examples within each branch and simply return this. Bonus: If you change the number of examples depending on the split, you get a little better "coverage" at little cost.
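A rough sketch of what that suggestion could look like, assuming a simplified stand-in for inject_fake_data. The folder layout, the wnid value, and the example counts are made up for illustration, and empty files stand in for real images.

```python
import pathlib
import tempfile


def inject_fake_data(root, split):
    """Simplified stand-in: create placeholder files and return their count."""
    root = pathlib.Path(root)
    wnid = "n01234567"  # made-up WordNet id, for illustration only

    # Each branch sets its own num_examples, so the splits differ in size
    # and the return value is never hard coded in one place.
    if split == "train":
        num_examples = 3
        folder = root / "train" / wnid
    else:
        num_examples = 2
        folder = root / "val"

    folder.mkdir(parents=True)
    for image_idx in range(num_examples):
        (folder / f"{wnid}_{image_idx}.JPEG").touch()
    return num_examples


with tempfile.TemporaryDirectory() as tmpdir:
    train_count = inject_fake_data(pathlib.Path(tmpdir) / "a", "train")
    val_count = inject_fake_data(pathlib.Path(tmpdir) / "b", "val")
```

Because the two splits return different counts, a test that mixes up the splits would notice a length mismatch.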
test/test_datasets.py
Outdated
@@ -490,6 +480,35 @@ def inject_fake_data(self, tmpdir, config):
        return num_images_per_category * len(categories)


class ImageNetTestCase(datasets_utils.ImageDatasetTestCase):
    DATASET_CLASS = datasets.ImageNet
    REQUIRED_PACKAGES = ['scipy']
Nit: Although this should not make a difference here, one should always avoid mutable class attributes unless this is explicitly needed.
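For context, a small illustration of why a mutable class attribute can be risky, using hypothetical class names. A tuple is one common immutable alternative; whether the PR adopted it is not stated here.

```python
class BaseCase:
    # A list defined on the class is a single object shared by all subclasses
    # that do not define their own attribute.
    REQUIRED_PACKAGES = []


class FirstCase(BaseCase):
    pass


class SecondCase(BaseCase):
    pass


# Mutating through one subclass silently changes what the other one sees.
FirstCase.REQUIRED_PACKAGES.append("scipy")
leaked = SecondCase.REQUIRED_PACKAGES


class SaferCase:
    # A tuple cannot be mutated in place, so it cannot leak between classes.
    REQUIRED_PACKAGES = ("scipy",)
```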
Thanks for the review!
Reviewed By: fmassa Differential Revision: D27127989 fbshipit-source-id: c21ba8a29c71a4bb9efa4bb1ab8713c3a9809842
This PR ports the ImageNet tests to the new test infrastructure.
Addresses part of #3531