ImageNet Dataset 404 Not Found ARCHIVE_DICT URLs #1453

Closed
awwong1 opened this issue Oct 12, 2019 · 10 comments

awwong1 commented Oct 12, 2019

All three URLs listed in ARCHIVE_DICT return 404 Not Found errors.

$ wget http://www.image-net.org/challenges/LSVRC/2012/nnoupb/ILSVRC2012_img_train.tar
--2019-10-12 16:38:39--  http://www.image-net.org/challenges/LSVRC/2012/nnoupb/ILSVRC2012_img_train.tar
Resolving www.image-net.org (www.image-net.org)... 171.64.68.16
Connecting to www.image-net.org (www.image-net.org)|171.64.68.16|:80... connected.
HTTP request sent, awaiting response... 404 Not Found
2019-10-12 16:38:39 ERROR 404: Not Found.

$ wget http://www.image-net.org/challenges/LSVRC/2012/nnoupb/ILSVRC2012_img_val.tar
--2019-10-12 16:38:51--  http://www.image-net.org/challenges/LSVRC/2012/nnoupb/ILSVRC2012_img_val.tar
Resolving www.image-net.org (www.image-net.org)... 171.64.68.16
Connecting to www.image-net.org (www.image-net.org)|171.64.68.16|:80... connected.
HTTP request sent, awaiting response... 404 Not Found
2019-10-12 16:38:51 ERROR 404: Not Found.

$ wget http://www.image-net.org/challenges/LSVRC/2012/nnoupb/ILSVRC2012_devkit_t12.tar.gz
--2019-10-12 16:39:01--  http://www.image-net.org/challenges/LSVRC/2012/nnoupb/ILSVRC2012_devkit_t12.tar.gz
Resolving www.image-net.org (www.image-net.org)... 171.64.68.16
Connecting to www.image-net.org (www.image-net.org)|171.64.68.16|:80... connected.
HTTP request sent, awaiting response... 404 Not Found
2019-10-12 16:39:03 ERROR 404: Not Found.

As a result, the current ImageNet dataset class can no longer download the data.

pmeier (Collaborator) commented Oct 14, 2019

From the ImageNet blog:

While conducting our study, since January 2019 we have disabled downloads of the full ImageNet data, except for the small subset of 1,000 categories used in the ImageNet Challenge.

Thus, we should still be able to download the 1,000-category subset. Maybe they closed the links because they were accessible without authorization. I'm not sure how to proceed here, @fmassa?

fmassa (Member) commented Oct 14, 2019

Yes, they seem to have removed download access for anyone who has not filled out the form at http://image-net.org/download-images.

In this case, I think there is no better alternative than removing the download link and the download functionality, and requesting the user to download the images themselves.

cc @soumith for awareness
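
To illustrate the suggestion above (drop the automatic download and ask users to fetch the archives themselves), here is a minimal sketch of the check a dataset class might run. It is not the code from the eventual PR. The helper name `require_local_archive` and the `EXPECTED_ARCHIVES` mapping are hypothetical; only the archive file names are taken from the URLs quoted in this issue.

```python
import os

# Hypothetical mapping; the file names come from the URLs quoted in this issue.
EXPECTED_ARCHIVES = {
    "train": "ILSVRC2012_img_train.tar",
    "val": "ILSVRC2012_img_val.tar",
    "devkit": "ILSVRC2012_devkit_t12.tar.gz",
}


def require_local_archive(root, split):
    """Return the path of a manually downloaded archive, or raise with instructions.

    Hypothetical helper: it never touches the network, it only checks the disk.
    """
    if split not in EXPECTED_ARCHIVES:
        raise ValueError(
            f"Unknown split {split!r}; expected one of {sorted(EXPECTED_ARCHIVES)}"
        )

    archive = EXPECTED_ARCHIVES[split]
    path = os.path.join(root, archive)
    if not os.path.isfile(path):
        # Tell the user what to do instead of attempting a download.
        raise RuntimeError(
            f"The archive {archive} was not found in {root}. "
            "It is not downloaded automatically; request access at "
            "http://image-net.org/download-images and place the file there."
        )
    return path
```

With this shape, a missing archive fails early with an actionable message rather than a 404 from a dead link.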

pmeier (Collaborator) commented Oct 14, 2019

I'll send a PR later.

pmeier (Collaborator) commented Oct 21, 2019

This is closed by #1457

fmassa closed this as completed Oct 21, 2019

fmassa (Member) commented Oct 21, 2019

Thanks for fixing it, @pmeier!

jph00 commented Oct 30, 2019

FYI, the ImageNet competition dataset is also available through Kaggle.

fmassa (Member) commented Oct 30, 2019

@jph00, can we consider it an official version of the dataset?

jph00 commented Oct 30, 2019

I don't know what your guidelines are for that, @fmassa. It's officially shared with Kaggle by the ImageNet consortium AFAIK, and it certainly has the same contents (I've checked).
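
The comment above does not say how the contents were checked. One common way to compare two copies of an archive is to hash both files; the sketch below does that with MD5 from the standard library. The function names `file_md5` and `same_contents` are hypothetical, and no reference checksums are given here because none appear in this thread.

```python
import hashlib


def file_md5(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, reading it in chunks.

    Chunked reads keep memory use flat even for very large archives.
    """
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_contents(path_a, path_b):
    """Return True if both files have identical MD5 digests (hypothetical helper)."""
    return file_md5(path_a) == file_md5(path_b)
```

Matching digests show that two archives are byte-identical. Archives that were repacked can differ in digest while still holding the same images, so a mismatch alone does not prove the contents differ.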

RylanSchaeffer commented Nov 18, 2019

@pmeier, is there a way to download the 1000-class ImageNet subset through PyTorch?

@fmassa maybe you know?

pmeier (Collaborator) commented Nov 19, 2019

No, there isn't. As the disclaimer of torchvision states:

This is a utility library that downloads and prepares public datasets. We do not host or distribute these datasets, vouch for their quality or fairness, or claim that you have license to use the dataset. It is your responsibility to determine whether you have permission to use the dataset under the dataset's license.

IMO we were already in a grey area before, since we were pointing to download links that should not have been publicly accessible. Since the authors decided to close these links, there is nothing we can do.
