[proto] Small improvement for tensor equalize op #6738

vfdev-5 · 2022-10-11T08:58:59Z

Time benchmark: RandomEqualize (1.0,) None
V2: RandomEqualize(p=1.0) torchvision.prototype.transforms._color
Stable: RandomEqualize(p=1.0) torchvision.transforms.transforms

Main:

[- Classification transforms measurements -]
                         |  stable  |    v2
1 threads: ---------------------------------
      Tensor Image data  |  2.875   |  3.184

Times are in milliseconds (ms).

This PR:

[- Classification transforms measurements -]
                         |  stable  |    v2
1 threads: ---------------------------------
      Tensor Image data  |  2.883   |  2.874

Times are in milliseconds (ms).

Here are the cProfile logs showing the reduction in the number of calls:

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
-     6000    3.918    0.001    6.354    0.001 /vision/torchvision/transforms/functional_tensor.py:870(_scale_channel)
-     6000    1.216    0.000    1.216    0.000 {built-in method torch.bincount}
-    12000    0.957    0.000    0.957    0.000 {method 'to' of 'torch._C._TensorBase' objects}
-     4000    0.360    0.000    0.360    0.000 {built-in method torch.stack}
+     6000    3.433    0.001    5.380    0.001 /vision/torchvision/prototype/transforms/functional/_color.py:186(_scale_channel)
+     6000    1.229    0.000    1.229    0.000 {built-in method torch.bincount}
+    12000    0.454    0.000    0.454    0.000 {method 'to' of 'torch._C._TensorBase' objects}
+     2000    0.187    0.000    0.187    0.000 {built-in method torch.stack}

Main (12adc54):

   660002 function calls in 7.247 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
     6000    3.918    0.001    6.354    0.001 /vision/torchvision/transforms/functional_tensor.py:870(_scale_channel)
     6000    1.216    0.000    1.216    0.000 {built-in method torch.bincount}
    12000    0.957    0.000    0.957    0.000 {method 'to' of 'torch._C._TensorBase' objects}
     4000    0.360    0.000    0.360    0.000 {built-in method torch.stack}
    18000    0.080    0.000    0.080    0.000 {built-in method torch.div}
     2000    0.069    0.000    6.424    0.003 /vision/torchvision/transforms/functional_tensor.py:892(<listcomp>)
     6000    0.061    0.000    0.061    0.000 {built-in method torch.nn.functional.pad}
    10000    0.043    0.000    0.043    0.000 {method 'view' of 'torch._C._TensorBase' objects}
     2000    0.040    0.000    6.984    0.003 /vision/torchvision/prototype/transforms/_transform.py:66(forward)
     2000    0.038    0.000    0.141    0.000 /usr/lib/python3.8/traceback.py:321(extract)
    10000    0.037    0.000    0.037    0.000 {built-in method posix.stat}
     6000    0.036    0.000    0.036    0.000 {method 'clamp' of 'torch._C._TensorBase' objects}
     6000    0.034    0.000    0.034    0.000 {method 'sum' of 'torch._C._TensorBase' objects}
     6000    0.031    0.000    0.031    0.000 {built-in method torch.cumsum}
     2000    0.030    0.000    0.053    0.000 /usr/lib/python3.8/traceback.py:388(format)
     2000    0.029    0.000    6.642    0.003 /vision/torchvision/transforms/functional_tensor.py:891(_equalize_single_image)
     2000    0.023    0.000    0.023    0.000 {built-in method torch.rand}
     2000    0.017    0.000    6.880    0.003 /vision/torchvision/prototype/transforms/functional/_color.py:186(equalize_image_tensor)
    54000    0.014    0.000    0.033    0.000 /usr/lib/python3.8/traceback.py:285(line)
    36000    0.012    0.000    0.012    0.000 {method 'format' of 'str' objects}

This PR:

   652002 function calls in 6.011 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
     6000    3.433    0.001    5.380    0.001 /vision/torchvision/prototype/transforms/functional/_color.py:186(_scale_channel)
     6000    1.229    0.000    1.229    0.000 {built-in method torch.bincount}
    12000    0.454    0.000    0.454    0.000 {method 'to' of 'torch._C._TensorBase' objects}
     2000    0.187    0.000    0.187    0.000 {built-in method torch.stack}
    18000    0.085    0.000    0.085    0.000 {built-in method torch.div}
     6000    0.058    0.000    0.058    0.000 {built-in method torch.nn.functional.pad}
    10000    0.046    0.000    0.046    0.000 {method 'view' of 'torch._C._TensorBase' objects}
     2000    0.040    0.000    5.752    0.003 /vision/torchvision/prototype/transforms/_transform.py:66(forward)
     2000    0.038    0.000    0.138    0.000 /usr/lib/python3.8/traceback.py:321(extract)
     6000    0.036    0.000    0.036    0.000 {method 'sum' of 'torch._C._TensorBase' objects}
    10000    0.036    0.000    0.036    0.000 {built-in method posix.stat}
     6000    0.032    0.000    0.032    0.000 {built-in method torch.cumsum}
     2000    0.031    0.000    0.053    0.000 /usr/lib/python3.8/traceback.py:388(format)
     6000    0.028    0.000    0.028    0.000 {method 'clamp_' of 'torch._C._TensorBase' objects}
     2000    0.022    0.000    0.022    0.000 {built-in method torch.rand}
     2000    0.018    0.000    5.648    0.003 /vision/torchvision/prototype/transforms/functional/_color.py:209(equalize_image_tensor)
     2000    0.015    0.000    5.395    0.003 /vision/torchvision/prototype/transforms/functional/_color.py:223(<listcomp>)
    54000    0.013    0.000    0.033    0.000 /usr/lib/python3.8/traceback.py:285(line)
     2000    0.012    0.000    0.012    0.000 {method 'unbind' of 'torch._C._TensorBase' objects}
    36000    0.012    0.000    0.012    0.000 {method 'format' of 'str' objects}
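
Listings like the two above, with their "Ordered by: internal time" header, are what Python's cProfile and pstats modules print when the stats are sorted by tottime. The script used for this PR is not shown in the thread, so the following is only a minimal sketch of how such a log could be collected. The names profile_calls and workload are hypothetical, workload is a stand-in for applying the transform to an image, and the default of 2000 iterations is an assumption taken from the 2000 calls to forward in the listings.

```python
import cProfile
import io
import pstats


def profile_calls(fn, n_iters=2000, top=20):
    """Run fn() n_iters times under cProfile and return the report text."""
    profiler = cProfile.Profile()
    profiler.enable()
    for _ in range(n_iters):
        fn()
    profiler.disable()

    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Sorting by "tottime" produces the "Ordered by: internal time" listing.
    stats.sort_stats("tottime").print_stats(top)
    return buffer.getvalue()


def workload():
    # Hypothetical stand-in for something like transform(image).
    return sum(i * i for i in range(100))


report = profile_calls(workload, n_iters=50)
print(report)
```

Running the same helper on main and on the PR branch and comparing the ncalls and tottime columns would give a before/after view like the one shown here.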

datumbox

LGTM, thanks! Feel free to merge on green CI.

datumbox · 2022-10-11T10:38:54Z

torchvision/prototype/transforms/functional/_color.py

+    lut.clamp_(0, 255)
+    lut = lut.to(torch.uint8)


Could you add a comment here explaining what we discussed offline, namely why moving clamp and to here leads to a faster result?
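
The offline discussion is not recorded in this thread. One reading that is consistent with the profile (time spent in to drops from 0.957 s to 0.454 s at the same 12000 calls) is that clamping and casting the 256-entry lookup table is cheaper than casting the image-sized result after indexing. The sketch below illustrates that assumption only. The function names are hypothetical, the table is made-up data rather than a real equalization LUT, and neither function is the code from the PR.

```python
import torch


def apply_lut_cast_after(img_chan: torch.Tensor, lut: torch.Tensor) -> torch.Tensor:
    # Clamp the table, index with it, then cast the image-sized result to uint8.
    lut = lut.clamp(0, 255)
    return lut[img_chan.to(torch.int64)].to(torch.uint8)


def apply_lut_cast_before(img_chan: torch.Tensor, lut: torch.Tensor) -> torch.Tensor:
    # Clamp in place and cast the small table first, so indexing already yields uint8.
    lut = lut.clone()  # keep the caller's tensor untouched in this sketch
    lut.clamp_(0, 255)
    lut = lut.to(torch.uint8)
    return lut[img_chan.to(torch.int64)]


torch.manual_seed(0)
img_chan = torch.randint(0, 256, (224, 224), dtype=torch.uint8)
# Made-up table with out-of-range values so that the clamp has an effect.
lut = torch.randint(-10, 300, (256,), dtype=torch.int64)

out_after = apply_lut_cast_after(img_chan, lut)
out_before = apply_lut_cast_before(img_chan, lut)
print(torch.equal(out_after, out_before))
```

Both orderings produce the same uint8 image. The difference is only in how many elements the clamp and the dtype conversion touch.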

vfdev-5 · 2022-10-11T21:00:45Z

Further improvement is possible if we vectorize the histogram computation with scatter_add_ (cc @lezcano).
Since Vasilis suggests proceeding iteratively, let's merge this PR first and leave the other improvements to follow-up PRs.
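
The follow-up idea can be sketched as below: instead of calling torch.bincount once per channel, all channel histograms are accumulated in a single scatter_add_ call. This is a minimal illustration under the assumption of a uint8 input of shape (..., H, W). The name batched_histogram is hypothetical, and this is not the implementation from any follow-up PR.

```python
import torch


def batched_histogram(img: torch.Tensor, bins: int = 256) -> torch.Tensor:
    """Per-channel histograms of a uint8 tensor of shape (..., H, W)."""
    # Flatten the spatial dims; scatter_add_ needs an int64 index tensor.
    flat = img.flatten(start_dim=-2).to(torch.int64)  # (..., H*W)
    hist = flat.new_zeros((*flat.shape[:-1], bins))  # (..., bins), int64
    # Each pixel value v adds 1 to bin v of its own channel.
    hist.scatter_add_(dim=-1, index=flat, src=torch.ones_like(flat))
    return hist


torch.manual_seed(0)
img = torch.randint(0, 256, (3, 32, 32), dtype=torch.uint8)
hist = batched_histogram(img)

# Reference: one bincount call per channel, as in the per-channel loop.
expected = torch.stack([torch.bincount(c.reshape(-1), minlength=256) for c in img])
print(torch.equal(hist, expected))
```

Whether this is actually faster than the per-channel loop would need to be measured with the same benchmark as above.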

Summary:
* [proto] Small improvement for tensor equalize op
* Fix code formatting
* Added a comment on the ops

Reviewed By: NicolasHug

Differential Revision: D40427464

fbshipit-source-id: f40623c83cebe269717151ae52f1fe9af47a3bde

[proto] Small improvement for tensor equalize op

24eb457

facebook-github-bot added the cla signed label Oct 11, 2022

vfdev-5 added 2 commits October 11, 2022 09:58

Merge branch 'main' of github.com:pytorch/vision into proto-small-opt…

be633a4

…im-equalize

Fix code formatting

77b7990

vfdev-5 marked this pull request as ready for review October 11, 2022 09:59

datumbox approved these changes Oct 11, 2022

datumbox added the module: transforms, Perf (For performance improvements), and prototype labels Oct 11, 2022

vfdev-5 added 2 commits October 11, 2022 20:56

Merge branch 'main' of github.com:pytorch/vision into proto-small-opt…

3754bd2

…im-equalize

Added a comment on the ops

84c5978

vfdev-5 merged commit 11a2eed into pytorch:main Oct 11, 2022

vfdev-5 deleted the proto-small-optim-equalize branch October 11, 2022 21:47

pmeier mentioned this pull request Oct 24, 2022

Performance improvements for transforms v2 vs. v1 #6818

Closed

31 tasks
