Fix hardcoded 255 #6830

Merged (24 commits into pytorch:main, Nov 3, 2022)
Conversation

@pmeier (Collaborator) commented Oct 24, 2022

@@ -226,19 +226,15 @@ def adjust_hue_image_tensor(image: torch.Tensor, hue_factor: float) -> torch.Tensor:
        return image

    orig_dtype = image.dtype
    if image.dtype == torch.uint8:
        image = image / 255.0

pmeier (Collaborator Author):
Instead of doing the conversion manually, I've opted to use our kernel for this. Note that this also implicitly converts to float32 since the divisor is a float.
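
A minimal sketch of the idea, using the stable convert_image_dtype kernel as a stand-in for the prototype kernel actually used in the PR:

    import torch
    from torchvision.transforms.functional import convert_image_dtype

    image = torch.randint(0, 256, (3, 8, 8), dtype=torch.uint8)

    # before: manual scaling with a hardcoded uint8 maximum; the float divisor
    # implicitly promotes the result to float32
    manual = image / 255.0

    # after: the conversion kernel picks the correct maximum for the input dtype
    converted = convert_image_dtype(image, torch.float32)

    assert torch.allclose(manual, converted)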

@@ -15,12 +15,6 @@ def _assert_image_tensor(img: Tensor) -> None:
        raise TypeError("Tensor is not a torch image.")


def _assert_threshold(img: Tensor, threshold: float) -> None:

pmeier (Collaborator Author):

This was only used once so I inlined it.
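
For illustration, a standalone sketch of what the inlined check amounts to (a hypothetical, simplified solarize, not the actual torchvision kernel):

    import torch
    from torch import Tensor

    def solarize(img: Tensor, threshold: float) -> Tensor:
        # the former _assert_threshold helper, now inlined at its single call site
        bound = 1.0 if img.is_floating_point() else 255
        if threshold > bound:
            raise TypeError("Threshold should be less than bound of img.")
        # invert every pixel at or above the threshold
        return torch.where(img >= threshold, bound - img, img)

    img = torch.randint(0, 256, (3, 12, 23), dtype=torch.uint8)
    out = solarize(img, threshold=128)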

    F_t.solarize(img, threshold)


@pytest.mark.parametrize("device", cpu_and_gpu())
@pytest.mark.parametrize("threshold", [260])
def test_solarize_threshold2_upper_bound(threshold, device):
    img = torch.randint(0, 256, (3, 12, 23)).to(device)
    img = torch.randint(0, 256, (3, 12, 23), dtype=torch.uint8, device=device)

pmeier (Collaborator Author):

torch.randint returns int64 by default, which will no longer trigger an error for a threshold of 260.
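
A short reproduction of the dtype issue (this only illustrates the default, it is not the test file itself):

    import torch

    img = torch.randint(0, 256, (3, 12, 23))
    print(img.dtype)  # torch.int64 -- randint defaults to int64

    # with int64 the valid maximum is far above 260, so a threshold of 260 is no
    # longer out of bounds and the expected error is never raised; pinning the
    # dtype restores the intended uint8 upper bound of 255
    img = torch.randint(0, 256, (3, 12, 23), dtype=torch.uint8)
    print(img.dtype)  # torch.uint8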

pmeier (Collaborator Author):
Instead of just fixing this here, I opted to make the tests more robust in e13613a.

@datumbox (Contributor) commented:
@pmeier Thanks for the PR. Have you made any measurements on the V2 changes to check if there is a performance degradation? There are a few more hardcoded 255 values left, but those are in places where we only support uint8. Is there a plan to update them?

Finally, this change, though correct, has the potential to break existing code. Before merging we might need to cherry-pick it into FBcode to see if there is any breakage.

@pmeier (Collaborator Author) commented Oct 25, 2022

Have you made any measurements on the V2 changes to check if there is a performance degradation?

Not yet. For all ops except adjust_hue we only replace one if-else with a function call that includes an if-elif-else. This change should be in the nanosecond range and thus well within our measuring tolerance. Do you still want me to benchmark all of them?

There are a few more 255 hardcoded values left but those are in places where we support only uint8. Is there a plan to update them?

Nope, I don't see a point. Or do you mean taking another look at the kernels to check whether we are doing the wrong thing, i.e. nothing, for other integer dtypes? In that case, yes, that would be a good idea.

@pmeier (Collaborator Author) commented Oct 25, 2022

There are two places with a hardcoded 255 left:

  1. if interpolation == "bicubic" and out_dtype == torch.uint8:
         img = img.clamp(min=0, max=255)

     Two things here:

     1. Although the 255 is fine for torch.uint8, all other integer images will not be clamped and thus might fail the following conversion back to an integer dtype due to overflow. Thus, this should be something like

        if interpolation == "bicubic" and not out_dtype.is_floating_point:
            img = img.clamp(min=0, max=_max_value(out_dtype))

     2. That being said, I'm a little confused why we are clamping only for integer dtypes in the first place. Bicubic interpolation can lead to overflowing values (see the sketch after this list), so shouldn't we clamp regardless of the dtype? Otherwise, the value range [0.0, 1.0] is no longer guaranteed after this operation. Thus, I think this should be

        if interpolation == "bicubic":
            img = img.clamp_(0, _max_value(out_dtype))

     Maybe @vfdev-5 can shed some light here.

  2. if img_chan.is_cuda:
         hist = torch.histc(img_chan.to(torch.float32), bins=256, min=0, max=255)
     else:
         hist = torch.bincount(img_chan.reshape(-1), minlength=256)

     which is guarded by

     if img.dtype != torch.uint8:
         raise TypeError(f"Only torch.uint8 image tensors are supported, but found {img.dtype}")

     Meaning, equalize does not work with any dtype other than torch.uint8, and thus the hardcoded 255 is fine.

     That being said, I think we need to have a discussion about whether or not we want kernels that categorically only work with a subset of dtypes; in the extreme case, like here, with only a single dtype. That means, for example, that the AA transforms that use equalize internally cannot be used with floating point images. Their docstring states as much:

     class AutoAugment(torch.nn.Module):
         r"""AutoAugment data augmentation method based on
         `"AutoAugment: Learning Augmentation Strategies from Data" <https://arxiv.org/pdf/1805.09501.pdf>`_.
         If the image is torch Tensor, it should be of type torch.uint8, and it is expected

     but we should discuss whether or not we are ok with this behavior going forward. This should not happen in this PR; I will open an issue about it soon.
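
Regarding point 1 above, a small standalone sketch (hypothetical shapes, calling torch.nn.functional.interpolate directly rather than the torchvision resize kernel) of why bicubic interpolation can leave the [0.0, 1.0] range:

    import torch
    import torch.nn.functional as F

    img = torch.zeros(1, 1, 4, 4)
    img[..., 1:3, 1:3] = 1.0  # a sharp edge in an otherwise constant image

    out = F.interpolate(img, size=(8, 8), mode="bicubic", align_corners=False)
    # bicubic ringing can undershoot 0.0 and overshoot 1.0 around the edge
    print(out.min().item(), out.max().item())

    # a dtype-agnostic clamp, as proposed above, restores the valid range
    out = out.clamp_(0.0, 1.0)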

@datumbox (Contributor) commented:
@pmeier As discussed offline, checking only the methods that change significantly will do. No need to benchmark those that just fetch the value from the dictionary. Let's wait for Victor's thoughts on this.

@fmassa I was wondering if you could chime in as well. I'm supportive of Philip's change; I just wanted to make sure we don't miss something important. The TL;DR is: we replace the hardcoded 255 values with the maximum value of each dtype. Floats and uint8 remain unaffected, but other integer types would change. This would align the behaviour of the kernels with convert_image_dtype, which already uses dtype-specific max values. I think this can be considered a bug, but I'm not 100% sure whether it was originally intentional.
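
For reference, an illustrative stand-in for the dtype-dependent maximum the kernels now use (not the exact torchvision _max_value implementation):

    import torch

    def max_value(dtype: torch.dtype):
        # floating point images are assumed to live in [0.0, 1.0]
        if dtype.is_floating_point:
            return 1.0
        if dtype == torch.bool:
            return 1
        # integer images use the full positive range of their dtype
        return torch.iinfo(dtype).max

    print(max_value(torch.uint8))    # 255
    print(max_value(torch.int16))    # 32767
    print(max_value(torch.float32))  # 1.0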

@pmeier (Collaborator Author) commented Oct 25, 2022

[--------------------------- adjust_hue ---------------------------]
                                       |  main  |  fix-hardcoded-255
1 threads: ---------------------------------------------------------
      (3, 512, 512), uint8, cpu        |   14   |          15       
      (3, 512, 512), uint8, cuda       |    1   |           1       
      (3, 512, 512), float32, cpu      |   14   |          14       
      (3, 512, 512), float32, cuda     |    1   |           1       
      (5, 3, 512, 512), uint8, cpu     |   94   |          90       
      (5, 3, 512, 512), uint8, cuda    |    7   |           8       
      (5, 3, 512, 512), float32, cpu   |   88   |          84       
      (5, 3, 512, 512), float32, cuda  |    7   |           7       

Times are in milliseconds (ms).

No changes apart from noise.

@datumbox (Contributor) commented:
I've imported this PR into FBcode to check if there are any breakages: D40752944

@datumbox (Contributor) commented:
I ran all the tests internally and it seems the change didn't break anything. There are a lot of pre-existing failures and skipped tests, so we can't be 100% sure. But it looks like it's mostly OK.

@pmeier Do we need to update the PR to cover your recent changes on the 2 kernels?

@fmassa (Member) commented Nov 3, 2022

Hi @datumbox

I'm supportive of this change. We didn't have good support for other dtypes before, so assuming either uint8 or float was okay-ish. Happy to see this being improved!

@datumbox (Contributor) commented Nov 3, 2022

@pmeier Looks like we should be good to go once you finish fixing the remaining hardcoded values. Ping me when you are ready for one final review and merge. I've already ported this internally and it looks like there are no issues.

@datumbox (Contributor) left a review:
LGTM. I highlighted 2 places where we should measure performance.

    return (1 if image.is_floating_point() else 255) - image  # type: ignore[no-any-return]
else:  # signed integer dtypes
    # We can't use `Tensor.bitwise_not` here, since we want to retain the leading zero bit that encodes the sign
    return image.bitwise_xor((1 << _num_value_bits(image.dtype)) - 1)
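
A hedged sketch of why the xor mask is needed for signed dtypes; num_value_bits below is a hypothetical stand-in for the PR's _num_value_bits helper:

    import torch

    def num_value_bits(dtype: torch.dtype) -> int:
        # number of bits carrying magnitude information (the sign bit, if any, is excluded)
        return torch.iinfo(dtype).bits - (0 if dtype == torch.uint8 else 1)

    # unsigned: flipping every bit is the same as 255 - x
    img_u8 = torch.tensor([0, 1, 254, 255], dtype=torch.uint8)
    assert torch.equal(img_u8.bitwise_not(), 255 - img_u8)

    # signed: bitwise_not would also flip the sign bit, so only the value bits are flipped
    img_i8 = torch.tensor([-128, -1, 0, 127], dtype=torch.int8)
    mask = (1 << num_value_bits(torch.int8)) - 1  # 0b0111_1111 == 127
    print(img_i8.bitwise_xor(mask))  # tensor([  -1, -128,  127,    0], dtype=torch.int8)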

datumbox (Contributor):
Can you provide benchmarks for this?

pmeier (Collaborator Author):
[--------------------------- invert_image_tensor ---------------------------]
                                      |       main       |  fix-hardcoded-255
1 threads: ------------------------------------------------------------------
      (3, 512, 512), float32, cpu     |   61 (+-  0) us  |     57 (+-  0) us 
      (3, 512, 512), uint8, cpu       |   17 (+-  0) us  |     17 (+-  0) us 
      (3, 512, 512), int32, cpu       |   78 (+-  0) us  |     63 (+-  0) us 
      (5, 3, 512, 512), float32, cpu  |  461 (+- 33) us  |    445 (+- 31) us 
      (5, 3, 512, 512), uint8, cpu    |   98 (+-  1) us  |     79 (+-  1) us 
      (5, 3, 512, 512), int32, cpu    |  538 (+- 67) us  |    514 (+-  8) us 

Times are in microseconds (us).

datumbox (Contributor):
Nice! I like it when bug/code-quality fixing leads to speed improvements. What more can we ask? 😄

@pmeier (Collaborator Author) commented Nov 3, 2022

There was some offline discussion about whether or not we want to remove the implicit assumption that floating point images have a maximum value of 1.0. Here are some benchmarks:

[-------------- convert_dtype_image_tensor float32 -> float64 ---------------]
                                      |        main       |  fix-hardcoded-255
1 threads: -------------------------------------------------------------------
      (3, 512, 512), float32, cpu     |    81 (+-  0) us  |    151 (+-  1) us 
      (5, 3, 512, 512), float32, cpu  |  1406 (+- 58) us  |   2183 (+-143) us 

Times are in microseconds (us).
  • Float-to-float conversion will be slower because we need to perform an additional (in-place) multiplication, whereas before a dtype conversion was sufficient. However, if we assume that every floating point dtype has the same value range, we don't need to touch this conversion at all.
[--------------- convert_dtype_image_tensor float32 -> uint8 ----------------]
                                      |        main       |  fix-hardcoded-255  
1 threads: -------------------------------------------------------------------
      (3, 512, 512), float32, cpu     |   414 (+- 37) us  |    408 (+-  3) us   
      (5, 3, 512, 512), float32, cpu  |  2326 (+-182) us  |   2208 (+- 34) us   

Times are in microseconds (us).
  • If anything, the new version should be slower since we need an additional (Python scalar) division. Not sure where the measured difference comes from.
[-------------- convert_dtype_image_tensor uint8 -> float32 --------------]
                                    |       main       |  fix-hardcoded-255
1 threads: ----------------------------------------------------------------
      (3, 512, 512), uint8, cpu     |  133 (+-  1) us  |     95 (+-  0) us 
      (5, 3, 512, 512), uint8, cpu  |  640 (+-  3) us  |    438 (+-  6) us 

Times are in microseconds (us).
  • If anything, the new version should be slower since we need an additional (Python scalar) division. The performance improvement here comes from a trick I found while implementing this patch that is independent of this change: instead of doing a tensor division, we can do a Python scalar division followed by a tensor multiplication. Effectively, this turns

    return image.to(dtype).div_(_FT._max_value(image.dtype))

    into

    return image.to(dtype).mul_(1 / _FT._max_value(image.dtype))

    Although it seems unrelated here, I've included this improvement in the benchmark, because the only thing that changes is that the 1 turns into _FT._max_value(dtype).
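
A self-contained check of the division-to-multiplication trick (the tensor shape and dtype here are just for illustration):

    import torch

    image = torch.randint(0, 256, (3, 512, 512), dtype=torch.uint8)

    # one tensor division: every element is divided by the dtype maximum
    a = image.to(torch.float32).div_(255)

    # one Python scalar division up front, then a single tensor multiplication
    b = image.to(torch.float32).mul_(1 / 255)

    # up to floating point rounding, both produce the same result
    assert torch.allclose(a, b)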

@datumbox (Contributor) commented Nov 3, 2022

There was some offline discussion whether or not we want to remove the implicit assumption that floating point images have the maximum value 1.0.

Let's not adopt anything that slows us down.

Instead of doing a tensor division, we can simply do a Python scalar division followed by a tensor multiplication.

What the heck! Well sounds good to me. Shall we try the trick in other places too?

Here are some places where we could apply it:

h = h.div_(6.0).add_(1.0).fmod_(1.0)

return image.mul(levels).floor_().clamp_(0, levels - 1).div_(levels)

return image.to(dtype).div_(_FT._max_value(image.dtype))

Finally, from what I understand, none of the changes you make here are expected to cause speed regressions. Can you confirm?

@pmeier (Collaborator Author) commented Nov 3, 2022

Let's not adopt anything that slows us down.

As discussed offline, there are probably a lot more implicit assumptions on the floating point range than what I detailed above. We agreed to just put a comment on the value inside the _FT._max_value function to indicate that this can't be changed easily.

What the heck! Well sounds good to me. Shall we try the trick in other places too?

Will do so in a follow-up, since it is unrelated to this PR.

Finally from what I understand, none of the changes you make here are expected to cause speed regressions, can you confirm?

Nope, perf should be the same. For some ops I posted benchmarks and they all show either no difference or even an improvement.

@pmeier pmeier merged commit cb4413a into pytorch:main Nov 3, 2022
@pmeier pmeier deleted the fix-hardcoded-255 branch November 3, 2022 17:11
facebook-github-bot pushed a commit that referenced this pull request Nov 4, 2022
Summary:
* fix prototype kernels

* fix stable kernels

* fix tests

* make test more robust

* improve invert for signed integers

* improve invert

* fix posterize

* Revert "assume that integer images are [0, 255] in equalize (#6859)"

This reverts commit 436ff9a.

* fix solarize in AA

* fix resize

* Revert "fix resize"

This reverts commit 5f33f4a.

* add comment to float max value

Reviewed By: datumbox

Differential Revision: D41020539

fbshipit-source-id: 1c618ead36a0ae4d93b4ebf07186fd39bd85d915

Co-authored-by: Vasilis Vryniotis <[email protected]>
Successfully merging this pull request may close these issues.

Don't hardcode 255 unless uint8 is enforced
4 participants