Error when running the ImageNet example #544
Comments
Hello, do you happen to know how to download the ImageNet dataset for use with this program?
Maybe the number of classes in your dataset is not 1000, in which case you should change num_class to match it, like this:

import torch
import torch.nn as nn
from torchvision.models import resnet50


class TotalModel(nn.Module):
    def __init__(self, num_class=1000):
        super(TotalModel, self).__init__()
        net = resnet50(pretrained=True)
        # Keep every layer except the final 1000-way fc layer.
        self.div_32 = nn.Sequential(*list(net.children())[:-1])
        # New classifier sized to the dataset's class count.
        self.other_layers = nn.Linear(2048, num_class)

    def forward(self, in_feat):
        in_feat = self.div_32(in_feat)
        # Flatten (N, 2048, 1, 1) to (N, 2048) before the linear layer.
        in_feat = in_feat.view(in_feat.size(0), -1)
        in_feat = self.other_layers(in_feat)
        return in_feat


if __name__ == '__main__':
    in_data = torch.randn(4, 3, 224, 224)
    net = TotalModel()
    out = net(in_data)
    print(out.size())
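One way to check whether the class count really differs from 1000 is to count the class folders on disk. This is only a rough sketch, assuming the training data is read by a folder-per-class loader (such as torchvision's ImageFolder) that assigns one label per subdirectory in sorted order. The helper names list_class_dirs and class_count_matches are made up for illustration and are not part of the example script.

```python
from pathlib import Path


def list_class_dirs(train_root):
    """Return the sorted names of the immediate subdirectories of train_root."""
    return sorted(p.name for p in Path(train_root).iterdir() if p.is_dir())


def class_count_matches(train_root, num_classes=1000):
    """True if the number of class folders equals the model's output size."""
    return len(list_class_dirs(train_root)) == num_classes
```

If class_count_matches returns False for your train folder, either the directory layout is not what the loader expects or the final layer needs a different output size.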
I have the same issue. Has anyone solved it? Please help me.
@lartpang seems to have the correct suggestion.
When I tried to run the model for the example/imagenet, I encountered the error below. Could you tell me how to solve this problem?
python /home/zrz/code/imagenet_dist/examples-master/imagenet/main.py -a resnet18 -/home/zrz/dataset/imagenet/imagenet2012/ILSVRC2012/raw-data/imagenet-data
=> creating model 'resnet18'
Epoch: [0][ 0/320292] Time 3.459 ( 3.459) Data 0.295 ( 0.295) Loss 7.2399e+00 (7.2399e+00) Acc@1 0.00 ( 0.00) Acc@5 0.00 ( 0.00)
Epoch: [0][ 10/320292] Time 0.043 ( 0.357) Data 0.000 ( 0.027) Loss 9.4861e+00 (1.3169e+01) Acc@1 0.00 ( 0.00) Acc@5 0.00 ( 0.00)
Epoch: [0][ 20/320292] Time 0.046 ( 0.209) Data 0.000 ( 0.014) Loss 7.3722e+00 (1.0817e+01) Acc@1 0.00 ( 0.00) Acc@5 0.00 ( 0.00)
Epoch: [0][ 30/320292] Time 0.032 ( 0.154) Data 0.000 ( 0.010) Loss 6.9166e+00 (9.5394e+00) Acc@1 0.00 ( 0.00) Acc@5 0.00 ( 0.00)
/opt/conda/conda-bld/pytorch_1549630534704/work/aten/src/THCUNN/ClassNLLCriterion.cu:105: void cunn_ClassNLLCriterion_updateOutput_kernel(Dtype *, Dtype *, Dtype *, long *, Dtype *, int, int, int, int, long) [with Dtype = float, Acctype = float]: block: [0,0,0], thread: [3,0,0] Assertion `t >= 0 && t < n_classes` failed.
Traceback (most recent call last):
File "/home/zrz/code/imagenet_dist/examples-master/imagenet/main.py", line 417, in <module>
File "/home/zrz/code/imagenet_dist/examples-master/imagenet/main.py", line 113, in main
File "/home/zrz/code/imagenet_dist/examples-master/imagenet/main.py", line 239, in main_worker
File "/home/zrz/code/imagenet_dist/examples-master/imagenet/main.py", line 286, in train
RuntimeError: CUDA error: device-side assert triggered
terminate called after throwing an instance of 'c10::Error'
what(): CUDA error: device-side assert triggered (insert_events at /opt/conda/conda-bld/pytorch_1549630534704/work/aten/src/THC/THCCachingAllocator.cpp:470)
frame #0: c10::Error::Error(c10::SourceLocation, std::string const&) + 0x45 (0x7f099a50acf5 in /home/zrz/miniconda3/envs/runze_env_name/lib/python3.6/site-packages/torch/lib/libc10.so)
frame #1: + 0x123b8c0 (0x7f099e7ee8c0 in /home/zrz/miniconda3/envs/runze_env_name/lib/python3.6/site-packages/torch/lib/libcaffe2_gpu.so)
frame #2: at::TensorImpl::release_resources() + 0x50 (0x7f099ac76c30 in /home/zrz/miniconda3/envs/runze_env_name/lib/python3.6/site-packages/torch/lib/libcaffe2.so)
frame #3: + 0x2a836b (0x7f099818b36b in /home/zrz/miniconda3/envs/runze_env_name/lib/python3.6/site-packages/torch/lib/libtorch.so.1)
frame #4: + 0x30eff0 (0x7f09981f1ff0 in /home/zrz/miniconda3/envs/runze_env_name/lib/python3.6/site-packages/torch/lib/libtorch.so.1)
frame #5: torch::autograd::deleteFunction(torch::autograd::Function*) + 0x2f0 (0x7f099818dd70 in /home/zrz/miniconda3/envs/runze_env_name/lib/python3.6/site-packages/torch/lib/libtorch.so.1)
frame #6: std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() + 0x45 (0x7f09c17f87f5 in /home/zrz/miniconda3/envs/runze_env_name/lib/python3.6/site-packages/torch/lib/libtorch_python.so)
frame #7: torch::autograd::Variable::Impl::release_resources() + 0x4a (0x7f09984001ba in /home/zrz/miniconda3/envs/runze_env_name/lib/python3.6/site-packages/torch/lib/libtorch.so.1)
frame #8: + 0x12148b (0x7f09c181048b in /home/zrz/miniconda3/envs/runze_env_name/lib/python3.6/site-packages/torch/lib/libtorch_python.so)
frame #9: + 0x31a49f (0x7f09c1a0949f in /home/zrz/miniconda3/envs/runze_env_name/lib/python3.6/site-packages/torch/lib/libtorch_python.so)
frame #10: + 0x31a4e1 (0x7f09c1a094e1 in /home/zrz/miniconda3/envs/runze_env_name/lib/python3.6/site-packages/torch/lib/libtorch_python.so)
frame #11: + 0x1993cf (0x5574e4c9a3cf in /home/zrz/miniconda3/envs/runze_env_name/bin/python3.6)
frame #12: + 0xf12b7 (0x5574e4bf22b7 in /home/zrz/miniconda3/envs/runze_env_name/bin/python3.6)
frame #13: + 0xf1147 (0x5574e4bf2147 in /home/zrz/miniconda3/envs/runze_env_name/bin/python3.6)
frame #14: + 0xf115d (0x5574e4bf215d in /home/zrz/miniconda3/envs/runze_env_name/bin/python3.6)
frame #15: + 0xf115d (0x5574e4bf215d in /home/zrz/miniconda3/envs/runze_env_name/bin/python3.6)
frame #16: + 0xf115d (0x5574e4bf215d in /home/zrz/miniconda3/envs/runze_env_name/bin/python3.6)
frame #17: PyDict_SetItem + 0x3da (0x5574e4c37e7a in /home/zrz/miniconda3/envs/runze_env_name/bin/python3.6)
frame #18: PyDict_SetItemString + 0x4f (0x5574e4c4078f in /home/zrz/miniconda3/envs/runze_env_name/bin/python3.6)
frame #19: PyImport_Cleanup + 0x99 (0x5574e4ca4709 in /home/zrz/miniconda3/envs/runze_env_name/bin/python3.6)
frame #20: Py_FinalizeEx + 0x61 (0x5574e4d105f1 in /home/zrz/miniconda3/envs/runze_env_name/bin/python3.6)
frame #21: Py_Main + 0x35e (0x5574e4d1b1fe in /home/zrz/miniconda3/envs/runze_env_name/bin/python3.6)
frame #22: main + 0xee (0x5574e4be402e in /home/zrz/miniconda3/envs/runze_env_name/bin/python3.6)
frame #23: __libc_start_main + 0xf5 (0x7f09d9c2e3d5 in /lib64/libc.so.6)
frame #24: + 0x1c3e0e (0x5574e4cc4e0e in /home/zrz/miniconda3/envs/runze_env_name/bin/python3.6)
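The assertion t >= 0 && t < n_classes in the log means that at least one target label fell outside the range the loss expects. Here is a minimal sketch of the same check in plain Python, which could be run on a batch of labels before it reaches the loss. The function name out_of_range_labels and the sample labels are hypothetical; with tensors you would pass something like target.tolist(). Running on CPU or setting CUDA_LAUNCH_BLOCKING=1 also tends to give a more precise traceback than the device-side assert.

```python
def out_of_range_labels(targets, n_classes):
    """Return (position, label) pairs that violate 0 <= t < n_classes."""
    return [(i, t) for i, t in enumerate(targets) if not (0 <= t < n_classes)]


# Hypothetical batch of labels checked against a 1000-way classifier.
batch_labels = [3, 999, 1000, -1]
bad = out_of_range_labels(batch_labels, n_classes=1000)
print(bad)  # [(2, 1000), (3, -1)]
```

An empty list means every label in the batch is valid for the given number of classes.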