
[prototype] Speed up adjust_contrast_image_tensor #6933


Merged: 2 commits merged into pytorch:main from the perf/contrast branch on Nov 9, 2022

Conversation

@datumbox (Contributor) commented Nov 9, 2022

Related to #6818

Small performance improvement for uint8 images by avoiding the double casting from floats to ints and back to floats.
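For orientation, here is a minimal sketch of the idea, not the exact torchvision kernel: the grayscale intermediate stays in float and is only floored for integer inputs, instead of being cast to uint8 and back to float. The luma coefficients and the [0, 1] / [0, 255] value ranges are assumptions for the sketch.

import torch

def adjust_contrast_sketch(image: torch.Tensor, contrast_factor: float) -> torch.Tensor:
    # assumes a 3-channel image laid out as (..., C, H, W)
    fp = image.is_floating_point()

    # ITU-R 601-2 luma transform, computed directly in float
    r, g, b = image.unbind(dim=-3)
    gray = 0.2989 * r + 0.587 * g + 0.114 * b
    if not fp:
        # mimic the truncation that the old uint8 round trip performed
        gray = gray.floor_()

    # per-image grayscale mean, broadcast back over C, H, W
    mean = gray.mean(dim=(-2, -1), keepdim=True).unsqueeze(-3)

    out = image * contrast_factor + mean * (1.0 - contrast_factor)
    bound = 1.0 if fp else 255.0
    return out.clamp_(0.0, bound).to(image.dtype)

For a uint8 input this keeps a single float pass around the grayscale step rather than a float -> int -> float round trip.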

Floats remain unaffected; uint8s get roughly a 5% boost:

[----------- adjust_contrast_image_tensor cpu torch.float32 -----------]
                         |  adjust_contrast_image_tensor old  |  fn2 new
1 threads: -------------------------------------------------------------
      (16, 3, 400, 400)  |               13600                |   13500 
      (3, 400, 400)      |                 538                |     535 
6 threads: -------------------------------------------------------------
      (16, 3, 400, 400)  |               14300                |   14300 
      (3, 400, 400)      |                 838                |     834 

Times are in microseconds (us).

[---------- adjust_contrast_image_tensor cuda torch.float32 -----------]
                         |  adjust_contrast_image_tensor old  |  fn2 new
1 threads: -------------------------------------------------------------
      (16, 3, 400, 400)  |                196                 |    196  
      (3, 400, 400)      |                 64                 |     62  
6 threads: -------------------------------------------------------------
      (16, 3, 400, 400)  |                197                 |    197  
      (3, 400, 400)      |                 64                 |     62  

Times are in microseconds (us).

[------------ adjust_contrast_image_tensor cpu torch.uint8 ------------]
                         |  adjust_contrast_image_tensor old  |  fn2 new
1 threads: -------------------------------------------------------------
      (16, 3, 400, 400)  |               26100                |   25000 
      (3, 400, 400)      |                1109                |     999 
6 threads: -------------------------------------------------------------
      (16, 3, 400, 400)  |               28300                |   30000 
      (3, 400, 400)      |                1700                |    1550 

Times are in microseconds (us).

[----------- adjust_contrast_image_tensor cuda torch.uint8 ------------]
                         |  adjust_contrast_image_tensor old  |  fn2 new
1 threads: -------------------------------------------------------------
      (16, 3, 400, 400)  |               257.3                |    242  
      (3, 400, 400)      |                89.7                |     79  
6 threads: -------------------------------------------------------------
      (16, 3, 400, 400)  |               258.1                |    242  
      (3, 400, 400)      |                90.2                |     80  

Times are in microseconds (us).

cc @vfdev-5 @bjuncek @pmeier

if c == 3:
    grayscale_image = _rgb_to_gray(image, cast=False)
    if not fp:
        grayscale_image = grayscale_image.floor_()
@datumbox (Contributor, Author) commented:
@pmeier In an early iteration of the PR I had missed this floor_() call, which is necessary to reproduce results identical to stable. Unfortunately the tests passed without catching the issue. It might be worth checking that our reference tests cover both floats and ints.
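A toy illustration of the effect, with made-up values: without floor_(), the float grayscale keeps the fractional part that the old uint8 cast used to drop, which shifts the per-image mean that the contrast blend uses.

import torch

gray_float = torch.tensor([100.7, 53.2, 201.9])  # hypothetical grayscale values
print(gray_float.mean())          # tensor(118.6000)
print(gray_float.floor().mean())  # tensor(118.)  <- matches the old int-cast path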

Collaborator:
This is properly tested with

def reference_inputs_adjust_contrast_image_tensor():
    for image_loader, contrast_factor in itertools.product(
        make_image_loaders(color_spaces=(features.ColorSpace.GRAY, features.ColorSpace.RGB), extra_dims=[()]),
        _ADJUST_CONTRAST_FACTORS,
    ):
        yield ArgsKwargs(image_loader, contrast_factor=contrast_factor)

but it seems our tolerances are too high to pick up on this:

closeness_kwargs=DEFAULT_PIL_REFERENCE_CLOSENESS_KWARGS,

DEFAULT_PIL_REFERENCE_CLOSENESS_KWARGS = {
    (("TestKernels", "test_against_reference"), torch.float32, "cpu"): dict(atol=1e-5, rtol=0, agg_method="mean"),
    (("TestKernels", "test_against_reference"), torch.uint8, "cpu"): dict(atol=1e-5, rtol=0, agg_method="mean"),
}

However, we can't remove the tolerances completely here, since the output differs even with the current implementation.

@datumbox (Contributor, Author) replied:
Hmm... 99% of the performance optimizations were tested for training and we know they are correct, so there might be rounding-error issues. I think we should adjust the tolerances to a degree that catches BC issues but doesn't go ballistic over the 8th decimal. That might be a hard exercise though.
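An alternative that is independent of the PIL tolerances would be a direct consistency check against the stable kernel, which should flag a missing floor_() since the uint8 outputs are expected to match stable exactly. A hypothetical sketch, assuming the prototype import path and kernel name:

import torch
from torchvision.transforms import functional as F_stable
from torchvision.prototype.transforms import functional as F_proto  # assumed prototype namespace

def check_uint8_matches_stable():
    torch.manual_seed(0)
    image = torch.randint(0, 256, (3, 400, 400), dtype=torch.uint8)
    expected = F_stable.adjust_contrast(image, contrast_factor=0.5)
    actual = F_proto.adjust_contrast_image_tensor(image, contrast_factor=0.5)
    # uint8 outputs are expected to be bit-identical to stable
    torch.testing.assert_close(actual, expected, atol=0, rtol=0)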

@vfdev-5 (Collaborator) left a comment:
Looks good, thanks @datumbox

@datumbox (Contributor, Author) commented Nov 9, 2022:

@vfdev-5 thanks. It seems there is an unrelated flaky test about resizing bboxes:
https://github.com/pytorch/vision/actions/runs/3427555648/jobs/5710704975

Could you have a look?

FAILED test/test_prototype_transforms_functional.py::TestKernels::test_against_reference[resize_bounding_box-55] - AssertionError: Tensor-likes are not equal!

Mismatched elements: 2 / 16 (12.5%)
Greatest absolute difference: 1 at index (2, 2)
Greatest relative difference: 0.25 at index (2, 3)

The failure occurred for item [0]
FAILED test/test_prototype_transforms_functional.py::TestKernels::test_against_reference[resize_bounding_box-56] - AssertionError: Tensor-likes are not equal!

Mismatched elements: 2 / 16 (12.5%)
Greatest absolute difference: 1 at index (2, 2)
Greatest relative difference: 0.25 at index (2, 3)

The failure occurred for item [0]
FAILED test/test_prototype_transforms_functional.py::TestKernels::test_against_reference[resize_bounding_box-57] - AssertionError: Tensor-likes are not equal!

Mismatched elements: 2 / 16 (12.5%)
Greatest absolute difference: 1 at index (2, 2)
Greatest relative difference: 0.25 at index (2, 3)

The failure occurred for item [0]

@datumbox datumbox merged commit 10d47a6 into pytorch:main Nov 9, 2022
@datumbox datumbox deleted the perf/contrast branch November 9, 2022 16:04
facebook-github-bot pushed a commit that referenced this pull request Nov 14, 2022
Summary:
* Avoid double casting on adjust_contrast

* Handle properly ints.

Reviewed By: NicolasHug

Differential Revision: D41265198

fbshipit-source-id: 4bacdc743b3cdf55a43e0ca6185b9c9b2ab12160