Skip to content

RandomHorizontalFlip can handle both PIL Image and numpy ndarray #141

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 2 commits into from

Conversation

varunagrawal
Copy link
Contributor

This PR allows one to use the RandomHorizontalFlip transform on numpy arrays as well.

@fmassa
Copy link
Member

fmassa commented Apr 4, 2017

I wonder if we want to support different transforms on numpy arrays, instead of only on PIL images.
That would imply that all operations should have numpy support, and currently we only accept numpy arrays in conversion transforms.
Plus, note that converting from numpy array to PIL images (and vice-versa) is very efficient because it does not perform any memory copy. I wonder if it's not better to enforce ToPILImage() + RandomHorizontalFlip() in here. Thoughts?

@varunagrawal
Copy link
Contributor Author

So the problem I am having is that conversion of numpy arrays to PIL images requires me to truncate the data to type uint8 (otherwise PIL decides to weep) and thus is causing me to loss precision points on my model. It would be nice to be idiomatic and simply apply the transforms as a transform compose in the dataloader rather than having to manually perform the transformations.

Since the horizontal flip is the easiest way to improve performance, I went ahead and added the numpy compatibility to it. I can do it for other transforms as well but I agree that this needs to be a decision across the board.

@fmassa
Copy link
Member

fmassa commented Apr 4, 2017

Wait, do you have the latest torchvision? Conversions with int16 and int32 should be properly handled since #122 , or is it a bug with PIL that you are facing?

@varunagrawal
Copy link
Contributor Author

varunagrawal commented Apr 4, 2017

I have the latest version of torchvision built from source. This is more of a problem of converting from float64 to int32 in PIL.

Also, I didn't realize there was a ToPILImage transform. I should try it and see what results I get. Maybe this PR won't be necessary then.

@varunagrawal
Copy link
Contributor Author

Just confirmed. The ToPILImage transform does not handle float64 or float32 numpy arrays. The issue seems to be here. For a 3 channeled ndarray image, there is only a check for uint8 which goes back to my old problem of truncating from float64 to uint8 leading to lower performance.

@fmassa
Copy link
Member

fmassa commented Apr 6, 2017

Hum... can you tell me how you load your images (are they saved in image format or you load from binary blobs, like .mat or h5)? Because I'm not sure how well PIL supports float32 images (and I haven't seen a mention of float64 in there, but I didn't look long enough for it).

@varunagrawal
Copy link
Contributor Author

@fmassa terribly sorry. I completely forgot about this. I am loading the raw images and then performing operations as needed.

@alykhantejani
Copy link
Contributor

@varunagrawal The ToPILImage transform now supports conversion from np.float32 types. Additionally, the ToTensor transform eventually transforms the Image to a float tensor which is also a 32-bit type.

I'm not sure support for numpy arrays in transforms is necessary at this point as you can keep 32-bit precision throughout the transforms. @fmassa wdyt?

@fmassa
Copy link
Member

fmassa commented Oct 2, 2017

I think we should avoid using scipy.misc.imresize as it is deprecated. Furthermore, internally it converts to a PIL image anyway.

If the ToTensor/ToPILImage is not yet working as you need, I think we should instead fix ToPILImage/ToTensor to handle the missing cases, but I'd prefer to keep using PIL images for the moment, as it stores more information than numpy arrays (colorspace for e.g.).

Maybe a possibility is to get inspiration from toimage from scipy to complement ToPILImage?

@alykhantejani
Copy link
Contributor

Closing this for now, but please send a PR for ToTensor/ToPILImage if they need to change to fit your needs

@varunagrawal
Copy link
Contributor Author

@alykhantejani since ToPILImage now supports np.float32, this PR is redundant.

rajveerb pushed a commit to rajveerb/vision that referenced this pull request Nov 30, 2023
* remove stuff that isn't part of resnet

* Port changes from TensorFlow official models version of ResNet to the
MLPerf reference.

add gitignore with python suffixes

store log output from resnet run

fix typo

use different model dirs for different seeds (convenient for multiple runs on the same machine)

remove no-op .take() from input pipeline

put a copy of the compliance dir in this branch. This just makes it slightly less painful to keep these utils synced across branches

* compliance dir fork is not unnecessary

* remove pycache

* cull more pyc files

* minor tweaks

* fix comment

* another comment fix
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants