Thanks for the fantastic library. I would like to point out an error in the mean and std computation in the torchvision.models page. In particular, I'm referring to the code following the line
"The process for obtaining the values of mean and std is roughly equivalent to:"
```python
import torch
from torchvision import datasets, transforms as T

transform = T.Compose([T.Resize(256), T.CenterCrop(224), T.ToTensor()])
dataset = datasets.ImageNet(".", split="train", transform=transform)

means = []
stds = []
for img in subset(dataset):
    means.append(torch.mean(img))
    stds.append(torch.std(img))  # Bug here

mean = torch.mean(torch.tensor(means))  # Error here
std = torch.mean(torch.tensor(stds))  # Error here
```
There are two issues here:

1. `torch.tensor(means)` throws an error: `ValueError: only one element tensors can be converted to Python scalars`. It should be `torch.stack(means)`.
2. The mean of the standard deviations should instead be a mean of the variances, followed by a square root at the end.
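To see why the second point matters: averaging standard deviations is not the same as taking the square root of the averaged variances (the RMS of the stds); the two agree only when every image has the same std. A quick stdlib-only illustration with made-up per-image stds:

```python
import math

# Hypothetical per-image standard deviations for two images.
stds = [1.0, 3.0]

# Mean of the stds (what the documented snippet computes).
mean_of_stds = sum(stds) / len(stds)

# Square root of the mean of the variances (the statistically sounder pooling).
root_mean_var = math.sqrt(sum(s * s for s in stds) / len(stds))

print(mean_of_stds)   # 2.0
print(root_mean_var)  # sqrt(5) ≈ 2.236
```

By the RMS-mean inequality, `root_mean_var >= mean_of_stds` always, with equality only when the stds are identical.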
Here is a version which fixes these:
```python
import torch
from torchvision import datasets, transforms as T

transform = T.Compose([T.Resize(256), T.CenterCrop(224), T.ToTensor()])
dataset = datasets.ImageNet(".", split="train", transform=transform)

means = []
variances = []
for img in subset(dataset):
    means.append(torch.mean(img))
    variances.append(torch.std(img) ** 2)

mean = torch.mean(torch.stack(means), dim=0)
std = torch.sqrt(torch.mean(torch.stack(variances), dim=0))
```
I would argue further that since we are interested in the per-channel mean and std of the whole dataset, we should first compute the mean across all images and then compute the std around this "batch mean" (whereas the current version computes each image's std around that image's own mean). I have not run these on a large dataset, so I do not know how much of a difference this would make.
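A toy illustration of that difference, using plain Python with a made-up single-channel "dataset" (the pixel values are purely hypothetical):

```python
import math
import statistics

# Toy "dataset": each entry is one single-channel image, flattened to a pixel list.
images = [[0.1, 0.2, 0.3], [0.8, 0.9, 1.0]]

# Pass 1: mean over every pixel of every image (the "batch mean").
pixels = [p for img in images for p in img]
batch_mean = sum(pixels) / len(pixels)

# Pass 2: std of all pixels around the batch mean.
batch_std = math.sqrt(sum((p - batch_mean) ** 2 for p in pixels) / len(pixels))

# For comparison: average each image's std around its own mean (the per-image approach).
avg_of_stds = sum(statistics.pstdev(img) for img in images) / len(images)

print(batch_mean)   # 0.55
print(batch_std)    # ≈ 0.359
print(avg_of_stds)  # ≈ 0.082
```

The per-image approach discards the variation *between* images (here, one dark image and one bright one), so it can substantially underestimate the dataset-wide std.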
Thank you!
This is why I used the word "roughly" in the description. This is meant as pseudo-code on how this was computed. I'm okay with adding a `torch.stack` to make it executable.
> the mean of the standard deviations should rather be a mean of the variances, followed by square rooting it at the end.
True, but there is nothing we can do about it: unless you have the resources and are willing to offer them for free, retraining all the models to change this is not an option. Believe me, I tried to argue the same thing in #1439. You can find the script I used for figuring out which approach was most likely used here. I encourage you to try your (objectively better) approach and see if the numbers change all that much. If that is the case, and I don't believe it is, maybe we can reopen the discussion.
Still, the issue is not actionable. Thus, I'm closing it. Let me know if you have other questions about this.
Hey Philip, thanks for your response! Your investigations are interesting.
I came across this issue because I tried computing the mean/std for a different dataset. Would it help to include a comment in the documentation for future users?