Allow torchvision.io to pass through ToTensor() #2959


Closed
jgbradley1 opened this issue Nov 4, 2020 · 9 comments

Comments

@jgbradley1
Contributor

jgbradley1 commented Nov 4, 2020

🚀 Ensure torchvision.io is a drop-in replacement with current workflows

The following snippet will fail.

img = torchvision.io.read_image("img.png")    # read_image requires a path; returns a uint8 tensor
img = torchvision.transforms.ToTensor()(img)  # fails: ToTensor does not accept tensors

Pitch

Consider making native io compatible with existing transform workflows by allowing the tensor type to pass through ToTensor(). This would still scale down tensor values to the range 0-1 and not impact downstream transformations.
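The proposed pass-through behavior could look roughly like this (a minimal sketch using only torch, not the actual torchvision implementation; `to_tensor_passthrough` is a hypothetical name):

```python
import torch

# Hypothetical sketch of the proposed behavior: a ToTensor-like callable
# that also accepts a uint8 tensor and scales it to the [0.0, 1.0] range.
def to_tensor_passthrough(img):
    if isinstance(img, torch.Tensor):
        if img.dtype == torch.uint8:
            return img.to(torch.float32) / 255.0
        return img  # already a float tensor: pass through unchanged
    raise TypeError("PIL/ndarray handling omitted from this sketch")

img = torch.randint(0, 256, (3, 4, 4), dtype=torch.uint8)
out = to_tensor_passthrough(img)  # float32 tensor in [0.0, 1.0]
```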

@jgbradley1 jgbradley1 changed the title Make ToTensor() compatible with torchvision.io Allow torchvision.io to pass through ToTensor() Nov 4, 2020
@vfdev-5
Collaborator

vfdev-5 commented Nov 4, 2020

@jgbradley1 img = torchvision.io.read_image() gives a tensor, right? Why would you like to transform a tensor to a tensor?
If you would like to scale values, please use ConvertImageDtype. What do you think ?

@jgbradley1
Contributor Author

jgbradley1 commented Nov 4, 2020

ConvertImageDtype would work. The actual issue I'm raising here is a conflict with that design choice and the impact it will have going forward. The first question I should ask, for clarification, is whether or not you all want torchvision.io to be treated as a drop-in replacement for PIL image loading?

A long time ago, someone decided that ToTensor would convert PIL to tensor AND scale values down to the 0-1 range (despite the function name not being explicit about it). For better or worse, the standard practice by many in the community is now to regularly place ToTensor in a composition of transforms and assume the scaling is being performed.

So do you let ToTensor accept tensors, so people aren't forced to update multiple parts of their DL pipelines when switching to torchvision.io...or do you cause a breaking change in the design and force people to add a new transform, all because they switched to torchvision.io?

Since most of the transforms accept PIL or tensor type, it makes sense to me under the current design to allow tensors to pass through ToTensor.

@jgbradley1
Contributor Author

Perhaps I'm just way off-base with this line of thinking. ToTensor is documented as a conversion transform, so I might be overstating how often people depend on ToTensor for value scaling purposes.

I'm okay if the decision is

people should read the documentation and make necessary changes to use torchvision.io correctly

@fmassa
Member

fmassa commented Nov 4, 2020

Hi,

Thanks for sparking this discussion.

Even though the current default is to rely on ToTensor, it has multiple issues due to its implicit conversion to the 0-1 range.
This is even more pronounced when using numpy arrays internally, and has led to a number of issues; see #546 and #1595 for some examples.
I believe the core issue is that ToTensor tries to do too many things at once, and trying to fix all corner cases will lead us down a rabbit hole.

My current thinking is to move users away from ToTensor, and instead ask them to use more self-contained functionality such as ConvertImageDtype and PILToTensor if they want to use torchvision.io.
But we won't be removing ToTensor because it's such a widely used functionality, although we would like to encourage a different usage pattern.

If we were to change ToTensor to accept tensors and perform this implicit scaling (for the convenience of users), I would prefer that we raise a warning in this use-case, with a message, so that this behavior is not encouraged.
That way users could give it a try and see how it fits their pipeline, while the warning reminds them that this is not the usage we would encourage.

Thoughts?

@jgbradley1
Contributor Author

I like the suggestion to add a warning. While this request is not really a technical bug, it would help the community evolve and push everyone in the right direction, away from using ToTensor. This will only happen as people stop using a PIL/numpy image loader and torchvision.io matures as the de facto standard.

Just summarizing what this request involves, then. ToTensor shall:

  1. accept the output of torchvision.io.read_image, a uint8 tensor (torch.ByteTensor) with values in the range [0, 255]
  2. convert it to a torch.FloatTensor of shape (C x H x W) in the range [0.0, 1.0]
  3. warn and encourage users to switch to the more self-contained ConvertImageDtype for image conversion operations

@fmassa
Member

fmassa commented Nov 18, 2020

@jgbradley1 I think I would be fine with what you propose, but let's hear what @vfdev-5 and @datumbox think as well.

@datumbox
Contributor

I'm on the fence on this one. On one hand, the proposal looks good and extends existing functionality in a way that could be useful to users. On the other hand, implementing the improvement might reinforce use of this problematic function. Since we are actively talking about moving users off this specific method, is it worth considering just adding a warning to the existing code and avoiding any further extensions?

My concern stems from the fact that standardization/preprocessing is an important part of training/inference that should be handled explicitly by the user. This is particularly true in transfer learning, where if you mess this up you can get really poor results. I suspect that with the current setup, many users might be calling ToTensor() before passing images to a pre-trained model without realizing it scales down the values.

@vfdev-5
Collaborator

vfdev-5 commented Nov 19, 2020

I also think that passing a uint8 tensor into ToTensor in order to convert it to float and rescale it is somewhat unexpected, given the ToTensor method name.

@fmassa
Member

fmassa commented Nov 20, 2020

Ok, I'm convinced by @datumbox's and @vfdev-5's comments, so I think that in order to get users to try this out we should instead update the examples / tutorials with the new code.

As such, I'll be closing this issue, but thanks again for the discussion and let us know if you have any more thoughts / comments.

@fmassa fmassa closed this as completed Nov 20, 2020