---
layout: hub_detail
background-class: hub-background
body-class: hub
title: MNasNet
summary: Neural Architecture Search based models for Mobile devices
category: researchers
image: mnasnet1.png
author: Pytorch Team
tags: [vision, scriptable]
github-link: https://github.com/pytorch/vision/blob/main/torchvision/models/mnasnet.py
github-id: pytorch/vision
featured_image_1: mnasnet1.png
featured_image_2: no-image
accelerator: cuda-optional
order: 10
---

```python
import torch
model = torch.hub.load('pytorch/vision:v0.10.0', 'mnasnet0_5', pretrained=True)
# or
# model = torch.hub.load('pytorch/vision:v0.10.0', 'mnasnet1_0', pretrained=True)
model.eval()
```
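
Since the model is tagged `scriptable`, the loaded torchvision implementation can be compiled with TorchScript. A minimal sketch (the output filename here is just an example):

```python
# Compile the loaded model to TorchScript for deployment without Python.
scripted_model = torch.jit.script(model)
# Save the scripted model; it can later be reloaded with torch.jit.load.
scripted_model.save("mnasnet0_5_scripted.pt")
```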

All pre-trained models expect input images normalized in the same way,
i.e. mini-batches of 3-channel RGB images of shape `(3 x H x W)`, where `H` and `W` are expected to be at least `224`.
The images have to be loaded into a range of `[0, 1]` and then normalized using `mean = [0.485, 0.456, 0.406]`
and `std = [0.229, 0.224, 0.225]`.
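
This normalization is just a per-channel `(x - mean) / std`; a minimal sketch of the equivalent manual computation, assuming a tensor already scaled to `[0, 1]`:

```python
import torch

# Broadcastable per-channel statistics for RGB images.
mean = torch.tensor([0.485, 0.456, 0.406]).view(3, 1, 1)
std = torch.tensor([0.229, 0.224, 0.225]).view(3, 1, 1)

x = torch.rand(3, 224, 224)      # stand-in for an image tensor in [0, 1]
x_normalized = (x - mean) / std  # same result as transforms.Normalize(mean, std)
```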

Here's a sample execution.

```python
# Download an example image from the pytorch website
import urllib.request
url, filename = ("https://github.com/pytorch/hub/raw/master/images/dog.jpg", "dog.jpg")
urllib.request.urlretrieve(url, filename)
```

```python
# sample execution (requires torchvision)
from PIL import Image
from torchvision import transforms
input_image = Image.open(filename)
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])
input_tensor = preprocess(input_image)
input_batch = input_tensor.unsqueeze(0) # create a mini-batch as expected by the model

# move the input and model to GPU for speed if available
if torch.cuda.is_available():
    input_batch = input_batch.to('cuda')
    model.to('cuda')

with torch.no_grad():
    output = model(input_batch)
# Tensor of shape 1000, with confidence scores over ImageNet's 1000 classes
print(output[0])
# The output has unnormalized scores. To get probabilities, you can run a softmax on it.
probabilities = torch.nn.functional.softmax(output[0], dim=0)
print(probabilities)
```

```
# Download ImageNet labels
!wget https://github.com/raw/pytorch/hub/master/imagenet_classes.txt
```
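
Outside a notebook, where the `!wget` magic is unavailable, the same labels file can be fetched with `urllib`; a minimal sketch:

```python
import urllib.request

# Download the class-name list used in the next step.
urllib.request.urlretrieve(
    "https://github.com/raw/pytorch/hub/master/imagenet_classes.txt",
    "imagenet_classes.txt",
)
```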

```python
# Read the categories
with open("imagenet_classes.txt", "r") as f:
    categories = [s.strip() for s in f.readlines()]
# Show top categories per image
top5_prob, top5_catid = torch.topk(probabilities, 5)
for i in range(top5_prob.size(0)):
    print(categories[top5_catid[i]], top5_prob[i].item())
```

### Model Description

The MNASNet architectures come from an automated, platform-aware neural architecture search for mobile devices: the search optimizes accuracy and real-device latency jointly, so the resulting networks explicitly balance the two. `mnasnet0_5` and `mnasnet1_0` are the searched network with depth multipliers of 0.5 and 1.0, respectively.

| Model structure | Top-1 error | Top-5 error |
| --------------- | ----------- | ----------- |
|  mnasnet0_5     | 32.26       | 12.51       |
|  mnasnet1_0     | 26.54       | 8.49        |
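
The depth multipliers can be sanity-checked by comparing parameter counts of the two variants; a minimal sketch (exact counts depend on the torchvision version):

```python
import torch

# Load both variants and count their parameters.
for name in ('mnasnet0_5', 'mnasnet1_0'):
    m = torch.hub.load('pytorch/vision:v0.10.0', name, pretrained=True)
    n_params = sum(p.numel() for p in m.parameters())
    print(f"{name}: {n_params / 1e6:.1f}M parameters")
```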

### References

 - [MnasNet: Platform-Aware Neural Architecture Search for Mobile](https://arxiv.org/abs/1807.11626)