[GraphOptimizer] Fix wrong layout assumption in OptimizeReduceMean? #3499

Open
shajrawi opened this issue Sep 10, 2019 · 3 comments

@shajrawi (Contributor)

I see the following code in the optimization:

      // In Glow, AvgPool expects NHWC.
      // Transpose the (assumed NCHW) input into NHWC.
      auto *TR1 = F->createTranspose(
          RM->getName().str() + ".transposeNCHW2NHWC", in, NCHW2NHWC);
      // Average over the spatial dimensions in place of the ReduceMean.
      auto *AP = F->createAvgPool(RM->getName().str() + ".avgPool", TR1,
                                  kernels, strides, pads);
      // Transpose the result back to NCHW.
      auto *TR2 = F->createTranspose(
          RM->getName().str() + ".transposeNHWC2NCHW", AP, NHWC2NCHW);

This looks like a bug to me: the canonical tensor layout in Glow is NHWC. Why do we expect the input to be in NCHW format and insert the two transposes?
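For contrast, a minimal sketch (reusing the variable names from the snippet above) of what the rewrite would look like if the input were treated as NHWC, Glow's canonical layout; no transposes would be needed at all:

      // Sketch only: if `in` is already NHWC, the ReduceMean over the
      // spatial dimensions maps directly onto a single AvgPool, with no
      // layout-conversion transposes around it.
      auto *AP = F->createAvgPool(RM->getName().str() + ".avgPool", in,
                                  kernels, strides, pads);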

@shajrawi (Contributor, Author)

cc @vuzelac-cadence

shajrawi added a commit to shajrawi/glow that referenced this issue Nov 6, 2019
Fixes pytorch#3452

Also Fixes pytorch#3493 and Fixes pytorch#3500, GraphOptimizer bugs that were found after adding the layout verifier.

Provides a workaround for pytorch#3499, which was also found via the verifier.

Note: I did not want to break the `enum ConvolutionLayout` introduced in 5074a72. As such, I used it in the verifier and did not change the creation of said nodes.
HOWEVER: We should use the more generic string-based layout, which I introduce for the Transpose node in this commit: it is basically an extensible enum that backends can use without touching the generic code base. As a bonus, it makes differentiation easier: see how it is now done for Transpose in `Function *glow::differentiate`.
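For illustration, a rough sketch of the string-based idea (the exact signature added in this commit may differ, and the layout name here is made up):

      // Sketch: the layout travels as a string at node-creation time, so a
      // backend can introduce its own layout names ("N2HW8C" is a made-up
      // backend-specific example) without touching the generic code base.
      auto *TR = F->createTranspose("toBackendLayout", in, shuffle,
                                    "N2HW8C");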

Getting rid of said enum is a proposed TODO / follow-up.

Also note that some nodes *need* layout requirements, which have been added: namely, we need to know the layout for placeholders and constants (obviously) and for reshapes (in case we optimized a transpose into a reshape).
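For intuition on the reshape case, a hypothetical example (assuming the reshape can carry an optional layout annotation):

      // An NCHW -> NHWC transpose of a {1, 3, 1, 1} tensor only moves
      // size-1 dimensions, so the optimizer may rewrite it as a reshape to
      // {1, 1, 1, 3}; the reshape must then carry the "NHWC" layout, or
      // the verifier can no longer tell what the result's layout is.
      auto *RN = F->createReshape("asNHWC", in, {1, 1, 1, 3}, "NHWC");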

An additional nice-to-have feature of the string-based layout is the wildcard / any-layout option: some operations, such as data-parallel nodes, might accept any layout.
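A hedged sketch of how the wildcard could be checked (`layoutsMatch` is a hypothetical helper, not necessarily the verifier's actual code):

      // Hypothetical helper: "*" is the any-layout wildcard, so a node
      // annotated with "*" (e.g. a data-parallel one) matches any concrete
      // layout string.
      static bool layoutsMatch(llvm::StringRef expected,
                               llvm::StringRef observed) {
        return expected == "*" || observed == "*" || expected == observed;
      }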

A potential follow-up is to create a "Solver" that automatically inserts transposes if the layouts do not match. This might greatly simplify the loader: we would no longer need to insert transposes based on whether we are importing NHWC or NCHW (for example); we would just annotate the placeholder with the layout information we have at load time, and "forget" it afterwards.
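Sketched in pseudo-Glow terms (the placeholder layout accessor below is an assumption for illustration, not the current API), the solver step might look like:

      // Hypothetical solver step: if a loader-annotated NCHW placeholder
      // feeds a node that requires the canonical NHWC layout, insert the
      // transpose automatically instead of hard-coding it in every loader.
      NodeValue input = PH->getOutput();
      if (getLayout(PH) == "NCHW" && requiredLayout == "NHWC") {
        input = F->createTranspose(PH->getName().str() + ".toNHWC", input,
                                   NCHW2NHWC);
      }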

The verifier is useful even without said solver: it already exposed a couple of bugs, mentioned in this commit, so a proposed solver is not a must-have to demonstrate this commit's usefulness.
facebook-github-bot pushed a commit that referenced this issue Nov 13, 2019
…#3503)

Pull Request resolved: #3503

Test Plan: `ninja test`

Differential Revision: D18357369

Pulled By: shajrawi

fbshipit-source-id: 45f91fbe120b234c2a85879cee9ee0de6c100b50
vdantu pushed a commit to vdantu/glow that referenced this issue Jul 12, 2020
…pytorch#3503)

@dspmihai commented Aug 4, 2021

Will this code ever be executed? I see in lowerBatchedReduceMeanNode that ReduceMean can only have one axis. @jfix71, @mciprian13, any idea how best to approach this again? I am thinking of lowering ReduceMean to AvgPool in Lower.cpp, if possible. Or maybe lowering to multiple ReduceMean nodes, then finding the pattern in OptimizeReduceMean (possibly error-prone after future optimizations)? A rough sketch of the latter idea is below.
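The multi-ReduceMean route, sketched under the assumption that `createBatchedReduceMean` reduces one axis at a time (the node names are illustrative):

      // Sketch: mean over both spatial axes of an NHWC tensor
      // {N, H, W, C}, decomposed into two single-axis reductions. After
      // reducing axis 1 (H), the result is {N, W, C}, so W becomes axis 1.
      auto *meanH = F->createBatchedReduceMean("rm.meanH", in, {1});
      auto *meanHW = F->createBatchedReduceMean("rm.meanHW", meanH, {1});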

@vuzelac-cadence (Contributor)

@dspmihai, lowering is backend-controlled. For example, we do use this optimization.
