Remove caching from MNIST and variants #3420
Codecov Report

| | master | #3420 | +/- |
|---|---|---|---|
| Coverage | 75.13% | 75.16% | +0.02% |
| Files | 105 | 105 | |
| Lines | 9722 | 9735 | +13 |
| Branches | 1563 | 1567 | +4 |
| Hits | 7305 | 7317 | +12 |
| Misses | 1930 | 1930 | |
| Partials | 487 | 488 | +1 |
Looks great, thanks!
I've a couple of comments, let me know what you think
Blocked by #3443
Looks great, thanks a ton!
Summary:
* remove caching from (Fashion|K)?MNIST
* remove unnecessary lazy import
* remove false check of binaries against the md5 of archives
* remove caching from EMNIST
* remove caching from QMNIST
* lint
* fix EMNIST
* streamline QMNIST download

Reviewed By: fmassa
Differential Revision: D27127995
fbshipit-source-id: 3f53be72b5e7c8abe191edb1e4467e3ef33741dd
Closes #3555
MNIST is one of the oldest datasets in `torchvision`. Back then we opted to cache the data in a custom binary for speed reasons. The caching happens in the `download()` method:

vision/torchvision/datasets/mnist.py
Lines 149 to 164 in 22c548b

This makes the datasets harder to test: if we provide a mock of the original binaries, we need to run this method in each test, but the new test cases from #3402 were designed to handle the `download` flag in a special way.

Since the speed difference is no longer significant, we can safely remove the caching.
This PR also removes the caching from `[EQ]MNIST`. I've tested everything locally, but please review these changes extra carefully.
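For reviewers who want a concrete picture of what "no caching" means, here is a minimal sketch, not the code in this PR. It parses the raw IDX binaries each time the dataset is constructed instead of loading a pre-processed cache file written by `download()`. The names `read_idx_ubyte` and `UncachedDigits` are hypothetical, and the sketch uses only the standard library and handles only unsigned-byte IDX files.

```python
import gzip
import struct
from functools import reduce
from operator import mul


def read_idx_ubyte(path):
    """Parse an unsigned-byte IDX file (optionally gzipped) into (shape, data)."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rb") as fh:
        raw = fh.read()

    # IDX header: two zero bytes, one type code (0x08 = unsigned byte),
    # one byte with the number of dimensions, then one big-endian uint32 per dimension.
    zeros, type_code, ndim = struct.unpack(">HBB", raw[:4])
    if zeros != 0 or type_code != 0x08:
        raise ValueError(f"{path} is not an unsigned-byte IDX file")
    offset = 4 + 4 * ndim
    shape = struct.unpack(">" + "I" * ndim, raw[4:offset])

    data = raw[offset:]
    if len(data) != reduce(mul, shape, 1):
        raise ValueError(f"{path} is truncated or has trailing bytes")
    return shape, data


class UncachedDigits:
    """Hypothetical dataset that reads the original binaries directly.

    Nothing is written to disk, so a test can point it at small mock
    files without having to run a download/caching step first.
    """

    def __init__(self, images_path, labels_path):
        (n_images, self.rows, self.cols), self._pixels = read_idx_ubyte(images_path)
        (n_labels,), self._labels = read_idx_ubyte(labels_path)
        if n_images != n_labels:
            raise ValueError("image and label counts differ")
        self._length = n_images

    def __len__(self):
        return self._length

    def __getitem__(self, index):
        if not 0 <= index < self._length:
            raise IndexError(index)
        size = self.rows * self.cols
        start = index * size
        # Returns the raw pixel bytes of one image and its integer label.
        return self._pixels[start:start + size], self._labels[index]
```

With this shape, a test only needs to write two tiny mock IDX files and construct the dataset. The trade-off is that the binaries are parsed on every construction, which the description above considers cheap enough.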