Add adjustment operations for RGB Tensor Images. #1525


Merged
5 commits merged into pytorch:master on Oct 26, 2019

Conversation

pedrofreire (Contributor)

Right now, we have operations on PIL images, but we want to have a version of the operations that acts directly on Tensor images, as discussed in issue #1375.

Here, we add such operations for adjust_brightness, adjust_contrast and adjust_saturation.

In PIL, those functions are implemented by generating a degenerate image from the original and then interpolating between the two (a minimal sketch of this scheme follows the caveats below):
- https://github.com/python-pillow/Pillow/blob/master/src/PIL/ImageEnhance.py
- https://github.com/python-pillow/Pillow/blob/master/src/libImaging/Blend.c

A few caveats:

  • Since PIL operates on uint8 and the tensor operations might be on float, we can get slightly different values because of int truncation; in particular, in testing, we check that the images are equal up to a small difference.
  • We assume here that the images are RGB; in particular, to handle an alpha channel, we would need to check whether the image has one, in which case we would copy it into the degenerate image.
  • adjust_contrast and adjust_saturation rely on grayscale images; since the pull request [WIP] Add Scriptable Transform: Grayscale #1505 is working on that, I made a temporary function, _rgb_to_grayscale, to be replaced once #1505 is done.
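
For concreteness, here is a minimal sketch of the interpolation scheme described above, written against the PyTorch tensor API and assuming CHW RGB tensors; the helper names mirror the ones discussed in this PR, but the bodies are illustrative rather than the exact merged code.

import torch

def _rgb_to_grayscale(img):
    # Temporary luma transform on a [3, H, W] tensor (ITU-R 601-2 weights
    # assumed here); to be replaced once #1505 is done.
    return 0.299 * img[0] + 0.587 * img[1] + 0.114 * img[2]

def _blend(img1, img2, ratio):
    # Interpolate between the original image and the degenerate one:
    # ratio == 1 returns img1 unchanged, ratio == 0 returns img2.
    bound = 1.0 if img1.is_floating_point() else 255.0
    return (ratio * img1 + (1.0 - ratio) * img2).clamp(0, bound).to(img1.dtype)

def adjust_brightness(img, brightness_factor):
    # The degenerate image for brightness is all black.
    return _blend(img, torch.zeros_like(img), brightness_factor)

def adjust_saturation(img, saturation_factor):
    # The degenerate image for saturation is the grayscale image;
    # broadcasting handles the [3, H, W] vs [H, W] shapes inside _blend.
    return _blend(img, _rgb_to_grayscale(img), saturation_factor)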

codecov-io commented Oct 25, 2019

Codecov Report

Merging #1525 into master will increase coverage by <.01%.
The diff coverage is 66.66%.


@@            Coverage Diff             @@
##           master    #1525      +/-   ##
==========================================
+ Coverage   63.86%   63.87%   +<.01%     
==========================================
  Files          83       83              
  Lines        6507     6525      +18     
  Branches     1005     1008       +3     
==========================================
+ Hits         4156     4168      +12     
- Misses       2057     2060       +3     
- Partials      294      297       +3
Impacted Files Coverage Δ
torchvision/transforms/functional_tensor.py 62.5% <66.66%> (+5.35%) ⬆️


@fmassa (Member) left a comment:

This looks great, thanks a lot for the PR @pedrofreire!

I made a few comments; they are mostly minor (speed / memory optimizations), and I'd like to hear your thoughts about keeping the same dtype as the input tensor.

Comment on lines 59 to 60
l1_diff = (ft_img - f_img).norm(p=1)
self.assertLess(l1_diff, 0.01 * img.nelement())
@fmassa (Member) commented Oct 25, 2019:

What about, instead of computing the L1 norm of the difference, we take the max of the absolute value?
This gives a metric that is invariant to the number of elements, and it gives us a clearer idea of the error due to the uint8 rounding.
So something like

# find a reasonably good threshold
self.assertLess((ft_img - f_img).abs().max(), 0.001)

Plus, if we make the changes I mentioned about keeping the return type of the inputs, I believe we could have 0 error in this function. In that case, we could probably just keep torch.equal there.

@pedrofreire (Contributor, Author) replied:

This sounds reasonable. From running the tests thousands of times, the highest error I got was 4 / 255; it seems theoretically possible to have a truncation error of up to 5 / 255 (we have 5 multiplications, each followed by truncation), so I set that as the bound.
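
For reference, a minimal sketch of the resulting assertion (variable names follow the earlier test snippet; the exact form in the merged test may differ):

# Bound: up to 5 multiply-then-truncate steps, each off by at most 1/255.
self.assertLess((ft_img - f_img).abs().max(), 5.0 / 255)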

@pedrofreire (Contributor, Author):

So, I made the returned tensor have the same dtype, but as it is we still have truncations, since the internal additions still happen in float. The simplest way to resolve this seems to be to do a per-element tensor multiplication, convert the dtype, and then sum-reduce the tensor (sketched below).
That makes the code a bit less clear, though, and since reproducing the truncation doesn't seem very important, I kept it as is.
If you think it is worth having the exact truncation for uint8, I can still add it :)
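
A minimal sketch of the alternative described above (per-element multiplication, dtype conversion, then sum-reduction); the helper name and weights here are assumptions for illustration:

import torch

def _rgb_to_grayscale_truncating(img):
    # Hypothetical variant: weight each channel, truncate every term to the
    # input dtype first, then sum-reduce, so that uint8 inputs get truncated
    # per term instead of only once at the end.
    weights = torch.tensor([0.299, 0.587, 0.114])
    weighted = weights[:, None, None] * img        # [3, H, W], in float
    truncated = weighted.to(img.dtype)             # truncate each term
    return truncated.sum(dim=0, dtype=img.dtype)   # sum-reduce over channels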

@fmassa (Member):

This is fine, but I'd love to understand the truncations you are mentioning. In the Pillow implementation of Blend, there is some casting to float happening, but maybe that is not the truncation you mean.

@pedrofreire (Contributor, Author) commented Oct 26, 2019:

My impression was that the grayscale conversion was doing 3 truncations, and I assumed the interpolation was doing 2; looking at the code, the interpolation only does 1, which would explain why I only saw errors up to 4 / 255 in float (and up to 2 / 255 in uint8) :)

Pedro Freire added 4 commits October 25, 2019 17:24

- We make our operations have input.dtype == output.dtype, at the cost of adding a few type checks and branches.
- By using Tensor broadcasting, we can simplify the calls to _blend (illustrated below).
- It seems Python 2 does not support this type of unpacking, so it broke Python 2 builds. This should fix it.
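
To illustrate the broadcasting commit, a hypothetical before/after of a _blend call, reusing the sketch's helper names; the explicit-expand version is an assumption of what the pre-broadcast code looked like:

# Before: expand grayscale [H, W] to [3, H, W] explicitly before blending.
gray = _rgb_to_grayscale(img).unsqueeze(0).expand(3, -1, -1)
out = _blend(img, gray, saturation_factor)

# After: [3, H, W] and [H, W] broadcast over the channel dimension.
out = _blend(img, _rgb_to_grayscale(img), saturation_factor)
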
@pedrofreire pedrofreire requested a review from fmassa October 25, 2019 19:41
@fmassa (Member) left a comment:

Awesome job, thanks a lot!

@fmassa fmassa merged commit e79cadd into pytorch:master Oct 26, 2019