Skip to content

[pytorch][mobile] turn off autograd mode in android JNI wrapper #26477

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

ljk53
Copy link
Contributor

@ljk53 ljk53 commented Sep 19, 2019

Stack from ghstack:

Summary:

  • At inference time we need turn off autograd mode and turn on no-variable
    mode since we strip out these modules for inference-only mobile build.
  • Both flags are stored in thread-local variables so we cannot simply
    set them to false glboally.
  • Add "autograd/grad_mode.h" header to all-in-one header 'torch/script.h'
    to reduce friction for iOS engs who might need do this manually in their
    project.

P.S. I tried to hide AutoNonVariableTypeMode in codegen but figured it's not
very trivial (e.g. there are manually written part not covered by codegen).
Might try it again later.

Test Plan:

  • Integrate with Android demo app to confirm inference runs correctly.

Differential Revision: D17484259

Summary:
- At inference time we need turn off autograd mode and turn on no-variable
  mode since we strip out these modules for inference-only mobile build.
- Both flags are stored in thread-local variables so we cannot simply
  set them to false glboally.
- Add "autograd/grad_mode.h" header to all-in-one header 'torch/script.h'
  to reduce friction for iOS engs who might need do this manually in their
  project.

P.S. I tried to hide AutoNonVariableTypeMode in codegen but figured it's not
very trivial (e.g. there are manually written part not covered by codegen).
Might try it again later.

Test Plan:
- Integrate with Android demo app to confirm inference runs correctly.

[ghstack-poisoned]
@pytorchbot pytorchbot added oncall: jit Add this issue/PR to JIT oncall triage queue module: android Related to Android support labels Sep 19, 2019
ljk53 added a commit that referenced this pull request Sep 19, 2019
Summary:
- At inference time we need turn off autograd mode and turn on no-variable
  mode since we strip out these modules for inference-only mobile build.
- Both flags are stored in thread-local variables so we cannot simply
  set them to false glboally.
- Add "autograd/grad_mode.h" header to all-in-one header 'torch/script.h'
  to reduce friction for iOS engs who might need do this manually in their
  project.

P.S. I tried to hide AutoNonVariableTypeMode in codegen but figured it's not
very trivial (e.g. there are manually written part not covered by codegen).
Might try it again later.

Test Plan:
- Integrate with Android demo app to confirm inference runs correctly.

ghstack-source-id: 94e1d54
Pull Request resolved: #26477
@facebook-github-bot
Copy link
Contributor

@ljk53 merged this pull request in 956b708.

ljk53 added a commit that referenced this pull request Sep 26, 2019
Summary:
- This PR attempts to address issue #26764 (`Issue 1` mentioned below).

- Current flow without USE_STATIC_DISPATCH (for server build):
```
S1. jit::load()
  a. JIT calls variable_factories.h methods to instantiate variable instances.

  b. JIT calls some ATen methods during intitalization, e.g.: conv_prepack, q_scale.
    b.1 First runs corresponding `Operation` in generated register_aten_ops_xxx.cpp, which calls `at::` functions, then calls ATen dispatcher.
    b.2 ATen dispatcher dispatches to corresponding VariableType methods.
    b.3 VariableType method uses `AutoNonVariableTypeMode` guard before calling into ATen implementation, as ATen generally expects `CHECK(!is_variable())`.
    b.4 VariableType method uses `as_variable` to wrap the results.

  x. Somewhere in JIT it expects `CHECK(is_variable())` - not sure before/after S1.a / S1.b.

S2. module::forward()
  a. JIT interpreter calls some ATen methods (via JIT registry).
    a.1 - a.4: same as S1.b.1 - S1.b.4.

  x. Different from S1.x, seems JIT doesn't expect `CHECK(is_variable())` during the entire `forward()` call.
```

- Current flow with USE_STATIC_DISPATCH (for mobile build):
```
M1. jit::load()
  a. JIT calls variable_factories.h methods to instantiate variable instances.

  b. JIT calls some ATen methods during intitalization, e.g.: conv_prepack, q_scale.
    b.1 First runs corresponding `Operation` in generated register_aten_ops_xxx.cpp, which calls `at::` functions, then calls ATen dispatcher.
    b.2 ATen dispatcher dispatches to corresponding ATen implementation directly.
      // Issue 1: NO VariableType methods / `AutoNonVariableTypeMode` so `CHECK(!is_variable())` in ATen will fail!
      // (Hypothetical) Issue 2: NO `as_variable()` to wrap result as variable. M1.x will fail if is ever used to check this result.

  x. Somewhere in JIT it expects `CHECK(is_variable())` - not sure before/after M1.a / M1.b.

M2. module::forward() // PR #26477 wraps this call with `AutoNonVariableTypeMode` guard.
  a. JIT interpreter calls some ATen methods (via JIT registry).
    a.1 same as M1.b.1, calls into register_aten_ops_xxx.cpp.
    a.2 same as M1.b.2, calls ATen implementation directly.
      // `CHECK(!is_variable())` in ATen won't fail thanks to the outer scope `AutoNonVariableTypeMode` guard.

  x. Same as above, seems JIT never expects `CHECK(is_variable())` during the entire `forward()` call.
```

- Wrong solution: if we wrap M1 with `AutoNonVariableTypeMode`, it will solve `Issue 1` for some models but will fail M1.x for some other models.

- Proposed solution:
I feel the root cause is that mobile build doesn't have `VariableType` as a barrier sitting between JIT and ATen to convert between is_variable() and !is_variable().
Without `VariableType` the best alternative place to put a barrier is M2.a.1 - `register_aten_ops_xxx.cpp`.
So this PR adds `AutoNonVariableTypeMode` guard to register_aten_ops_xxx.cpp (for ops registered via codegen) and register_c10_ops.cpp (for ops registered via new C10 register API).
This PR doesn't try to address (Hypothetical) Issue 2 as I haven't seen it.
PR #26477 can be replaced by this PR but we can keep it until M2.x is no longer true.

- Ultimate solution:
After Variable and Tensor are completely merged: #23032 then is_variable() checks can be changed to requires_grad() checks and all problems will be solved. We can clean up these hacks by then.

- References:
* Effect of `AutoNonVariableTypeMode`: all `is_variable()` inside current thread scope returns false:
  https://github.com/pytorch/pytorch/blob/master/c10/core/TensorImpl.h#L811

* Effect of `as_variable`: https://github.com/pytorch/pytorch/blob/master/torch/csrc/autograd/VariableTypeUtils.h#L159
  It calls `make_variable`: https://github.com/pytorch/pytorch/blob/master/torch/csrc/autograd/variable.h#L539

Test Plan:
- Load and run MobileNetV2 fp32 & int8 models.

[ghstack-poisoned]
ljk53 added a commit that referenced this pull request Sep 26, 2019
Summary:
- This PR attempts to address issue #26764 (`Issue 1` mentioned below).

- Current flow without USE_STATIC_DISPATCH (for server build):
```
S1. jit::load()
  a. JIT calls variable_factories.h methods to instantiate variable instances.

  b. JIT calls some ATen methods during intitalization, e.g.: conv_prepack, q_scale.
    b.1 First runs corresponding `Operation` in generated register_aten_ops_xxx.cpp, which calls `at::` functions, then calls ATen dispatcher.
    b.2 ATen dispatcher dispatches to corresponding VariableType methods.
    b.3 VariableType method uses `AutoNonVariableTypeMode` guard before calling into ATen implementation, as ATen generally expects `CHECK(!is_variable())`.
    b.4 VariableType method uses `as_variable` to wrap the results.

  x. Somewhere in JIT it expects `CHECK(is_variable())` - not sure before/after S1.a / S1.b.

S2. module::forward()
  a. JIT interpreter calls some ATen methods (via JIT registry).
    a.1 - a.4: same as S1.b.1 - S1.b.4.

  x. Different from S1.x, seems JIT doesn't expect `CHECK(is_variable())` during the entire `forward()` call.
```

- Current flow with USE_STATIC_DISPATCH (for mobile build):
```
M1. jit::load()
  a. JIT calls variable_factories.h methods to instantiate variable instances.

  b. JIT calls some ATen methods during intitalization, e.g.: conv_prepack, q_scale.
    b.1 First runs corresponding `Operation` in generated register_aten_ops_xxx.cpp, which calls `at::` functions, then calls ATen dispatcher.
    b.2 ATen dispatcher dispatches to corresponding ATen implementation directly.
      // Issue 1: NO VariableType methods / `AutoNonVariableTypeMode` so `CHECK(!is_variable())` in ATen will fail!
      // (Hypothetical) Issue 2: NO `as_variable()` to wrap result as variable. M1.x will fail if is ever used to check this result.

  x. Somewhere in JIT it expects `CHECK(is_variable())` - not sure before/after M1.a / M1.b.

M2. module::forward() // PR #26477 wraps this call with `AutoNonVariableTypeMode` guard.
  a. JIT interpreter calls some ATen methods (via JIT registry).
    a.1 same as M1.b.1, calls into register_aten_ops_xxx.cpp.
    a.2 same as M1.b.2, calls ATen implementation directly.
      // `CHECK(!is_variable())` in ATen won't fail thanks to the outer scope `AutoNonVariableTypeMode` guard.

  x. Same as above, seems JIT never expects `CHECK(is_variable())` during the entire `forward()` call.
```

- Wrong solution: if we wrap M1 with `AutoNonVariableTypeMode`, it will solve `Issue 1` for some models but will fail M1.x for some other models.

- Proposed solution:
I feel the root cause is that mobile build doesn't have `VariableType` as a barrier sitting between JIT and ATen to convert between is_variable() and !is_variable().
Without `VariableType` the best alternative place to put a barrier is M2.a.1 - `register_aten_ops_xxx.cpp`.
So this PR adds `AutoNonVariableTypeMode` guard to register_aten_ops_xxx.cpp (for ops registered via codegen) and register_c10_ops.cpp (for ops registered via new C10 register API).
This PR doesn't try to address (Hypothetical) Issue 2 as I haven't seen it.
PR #26477 can be replaced by this PR but we can keep it until M2.x is no longer true.

- Ultimate solution:
After Variable and Tensor are completely merged: #23032 then is_variable() checks can be changed to requires_grad() checks and all problems will be solved. We can clean up these hacks by then.

- References:
* Effect of `AutoNonVariableTypeMode`: all `is_variable()` inside current thread scope returns false:
  https://github.com/pytorch/pytorch/blob/master/c10/core/TensorImpl.h#L811

* Effect of `as_variable`: https://github.com/pytorch/pytorch/blob/master/torch/csrc/autograd/VariableTypeUtils.h#L159
  It calls `make_variable`: https://github.com/pytorch/pytorch/blob/master/torch/csrc/autograd/variable.h#L539

Test Plan:
- Load and run MobileNetV2 fp32 & int8 models.

ghstack-source-id: a9e5b2d
Pull Request resolved: #26868
ljk53 added a commit that referenced this pull request Sep 27, 2019
…oundary"

Summary:
- This PR together with #26908 attempt to address issue #26764 (`Issue 1` mentioned below).

- Current flow without USE_STATIC_DISPATCH (for server build):
```
S1. jit::load()
  a. JIT calls variable_factories.h methods to instantiate variable instances.

  b. JIT calls some ATen methods during intitalization, e.g.: conv_prepack, q_scale.
    b.1 First runs corresponding `Operation` in generated register_aten_ops_xxx.cpp, which calls `at::` functions, then calls ATen dispatcher.
    b.2 ATen dispatcher dispatches to corresponding VariableType methods.
    b.3 VariableType method uses `AutoNonVariableTypeMode` guard before calling into ATen implementation, as ATen generally expects `CHECK(!is_variable())`.
    b.4 VariableType method uses `as_variable` to wrap the results.

  x. Somewhere in JIT it expects `CHECK(is_variable())` - not sure before/after S1.a / S1.b.

S2. module::forward()
  a. JIT interpreter calls some ATen methods (via JIT registry).
    a.1 - a.4: same as S1.b.1 - S1.b.4.

  x. Different from S1.x, seems JIT doesn't expect `CHECK(is_variable())` during the entire `forward()` call.
```

- Current flow with USE_STATIC_DISPATCH (for mobile build):
```
M1. jit::load()
  a. JIT calls variable_factories.h methods to instantiate variable instances.

  b. JIT calls some ATen methods during intitalization, e.g.: conv_prepack, q_scale.
    b.1 First runs corresponding `Operation` in generated register_aten_ops_xxx.cpp, which calls `at::` functions, then calls ATen dispatcher.
    b.2 ATen dispatcher dispatches to corresponding ATen implementation directly.
      // Issue 1: NO VariableType methods / `AutoNonVariableTypeMode` so `CHECK(!is_variable())` in ATen will fail!
      // (Hypothetical) Issue 2: NO `as_variable()` to wrap result as variable. M1.x will fail if is ever used to check this result.

  x. Somewhere in JIT it expects `CHECK(is_variable())` - not sure before/after M1.a / M1.b.

M2. module::forward() // PR #26477 wraps this call with `AutoNonVariableTypeMode` guard.
  a. JIT interpreter calls some ATen methods (via JIT registry).
    a.1 same as M1.b.1, calls into register_aten_ops_xxx.cpp.
    a.2 same as M1.b.2, calls ATen implementation directly.
      // `CHECK(!is_variable())` in ATen won't fail thanks to the outer scope `AutoNonVariableTypeMode` guard.

  x. Same as above, seems JIT never expects `CHECK(is_variable())` during the entire `forward()` call.
```

- Wrong solution: if we wrap M1 with `AutoNonVariableTypeMode`, it will solve `Issue 1` for some models but will fail M1.x for some other models.

- Proposed solution:
I feel the root cause is that mobile build doesn't have `VariableType` as a barrier sitting between JIT and ATen to convert between is_variable() and !is_variable().

Without `VariableType` the best alternative place to put a barrier is M1.b.2 as Edward did in #26908.

For some reason we also need toggle variable state for c10 ops: this is what this PR does. We haven't figured how non-mobile build works without this logic so it's kinda bandaid for now.

This PR doesn't try to address (Hypothetical) Issue 2 as I haven't seen it. PR #26477 can be replaced by #26908 + this PR but we can keep it until M2.x is no longer true.

- Ultimate solution:
After Variable and Tensor are completely merged: #23032 then is_variable() checks can be changed to requires_grad() checks and all problems will be solved. We can clean up these hacks by then.

- References:
* Effect of `AutoNonVariableTypeMode`: all `is_variable()` inside current thread scope returns false:
  https://github.com/pytorch/pytorch/blob/master/c10/core/TensorImpl.h#L811

* Effect of `as_variable`: https://github.com/pytorch/pytorch/blob/master/torch/csrc/autograd/VariableTypeUtils.h#L159
  It calls `make_variable`: https://github.com/pytorch/pytorch/blob/master/torch/csrc/autograd/variable.h#L539

Test Plan:
- Patch PR #26908 then load and run MobileNetV2 fp32 & int8 models.

Differential Revision: [D17595179](https://our.internmc.facebook.com/intern/diff/D17595179)

[ghstack-poisoned]
ljk53 added a commit that referenced this pull request Sep 27, 2019
Summary:
- This PR together with #26908 attempt to address issue #26764 (`Issue 1` mentioned below).

- Current flow without USE_STATIC_DISPATCH (for server build):
```
S1. jit::load()
  a. JIT calls variable_factories.h methods to instantiate variable instances.

  b. JIT calls some ATen methods during intitalization, e.g.: conv_prepack, q_scale.
    b.1 First runs corresponding `Operation` in generated register_aten_ops_xxx.cpp, which calls `at::` functions, then calls ATen dispatcher.
    b.2 ATen dispatcher dispatches to corresponding VariableType methods.
    b.3 VariableType method uses `AutoNonVariableTypeMode` guard before calling into ATen implementation, as ATen generally expects `CHECK(!is_variable())`.
    b.4 VariableType method uses `as_variable` to wrap the results.

  x. Somewhere in JIT it expects `CHECK(is_variable())` - not sure before/after S1.a / S1.b.

S2. module::forward()
  a. JIT interpreter calls some ATen methods (via JIT registry).
    a.1 - a.4: same as S1.b.1 - S1.b.4.

  x. Different from S1.x, seems JIT doesn't expect `CHECK(is_variable())` during the entire `forward()` call.
```

- Current flow with USE_STATIC_DISPATCH (for mobile build):
```
M1. jit::load()
  a. JIT calls variable_factories.h methods to instantiate variable instances.

  b. JIT calls some ATen methods during intitalization, e.g.: conv_prepack, q_scale.
    b.1 First runs corresponding `Operation` in generated register_aten_ops_xxx.cpp, which calls `at::` functions, then calls ATen dispatcher.
    b.2 ATen dispatcher dispatches to corresponding ATen implementation directly.
      // Issue 1: NO VariableType methods / `AutoNonVariableTypeMode` so `CHECK(!is_variable())` in ATen will fail!
      // (Hypothetical) Issue 2: NO `as_variable()` to wrap result as variable. M1.x will fail if is ever used to check this result.

  x. Somewhere in JIT it expects `CHECK(is_variable())` - not sure before/after M1.a / M1.b.

M2. module::forward() // PR #26477 wraps this call with `AutoNonVariableTypeMode` guard.
  a. JIT interpreter calls some ATen methods (via JIT registry).
    a.1 same as M1.b.1, calls into register_aten_ops_xxx.cpp.
    a.2 same as M1.b.2, calls ATen implementation directly.
      // `CHECK(!is_variable())` in ATen won't fail thanks to the outer scope `AutoNonVariableTypeMode` guard.

  x. Same as above, seems JIT never expects `CHECK(is_variable())` during the entire `forward()` call.
```

- Wrong solution: if we wrap M1 with `AutoNonVariableTypeMode`, it will solve `Issue 1` for some models but will fail M1.x for some other models.

- Proposed solution:
I feel the root cause is that mobile build doesn't have `VariableType` as a barrier sitting between JIT and ATen to convert between is_variable() and !is_variable().

Without `VariableType` the best alternative place to put a barrier is M1.b.2 as Edward did in #26908.

For some reason we also need toggle variable state for c10 ops: this is what this PR does. We haven't figured how non-mobile build works without this logic so it's kinda bandaid for now.

This PR doesn't try to address (Hypothetical) Issue 2 as I haven't seen it. PR #26477 can be replaced by #26908 + this PR but we can keep it until M2.x is no longer true.

- Ultimate solution:
After Variable and Tensor are completely merged: #23032 then is_variable() checks can be changed to requires_grad() checks and all problems will be solved. We can clean up these hacks by then.

- References:
* Effect of `AutoNonVariableTypeMode`: all `is_variable()` inside current thread scope returns false:
  https://github.com/pytorch/pytorch/blob/master/c10/core/TensorImpl.h#L811

* Effect of `as_variable`: https://github.com/pytorch/pytorch/blob/master/torch/csrc/autograd/VariableTypeUtils.h#L159
  It calls `make_variable`: https://github.com/pytorch/pytorch/blob/master/torch/csrc/autograd/variable.h#L539

Test Plan:
- Patch PR #26908 then load and run MobileNetV2 fp32 & int8 models.

ghstack-source-id: e2e50bb
Pull Request resolved: #26868
facebook-github-bot pushed a commit that referenced this pull request Sep 27, 2019
Summary:
- This PR together with #26908 attempt to address issue #26764 (`Issue 1` mentioned below).

- Current flow without USE_STATIC_DISPATCH (for server build):
```
S1. jit::load()
  a. JIT calls variable_factories.h methods to instantiate variable instances.

  b. JIT calls some ATen methods during intitalization, e.g.: conv_prepack, q_scale.
    b.1 First runs corresponding `Operation` in generated register_aten_ops_xxx.cpp, which calls `at::` functions, then calls ATen dispatcher.
    b.2 ATen dispatcher dispatches to corresponding VariableType methods.
    b.3 VariableType method uses `AutoNonVariableTypeMode` guard before calling into ATen implementation, as ATen generally expects `CHECK(!is_variable())`.
    b.4 VariableType method uses `as_variable` to wrap the results.

  x. Somewhere in JIT it expects `CHECK(is_variable())` - not sure before/after S1.a / S1.b.

S2. module::forward()
  a. JIT interpreter calls some ATen methods (via JIT registry).
    a.1 - a.4: same as S1.b.1 - S1.b.4.

  x. Different from S1.x, seems JIT doesn't expect `CHECK(is_variable())` during the entire `forward()` call.
```

- Current flow with USE_STATIC_DISPATCH (for mobile build):
```
M1. jit::load()
  a. JIT calls variable_factories.h methods to instantiate variable instances.

  b. JIT calls some ATen methods during intitalization, e.g.: conv_prepack, q_scale.
    b.1 First runs corresponding `Operation` in generated register_aten_ops_xxx.cpp, which calls `at::` functions, then calls ATen dispatcher.
    b.2 ATen dispatcher dispatches to corresponding ATen implementation directly.
      // Issue 1: NO VariableType methods / `AutoNonVariableTypeMode` so `CHECK(!is_variable())` in ATen will fail!
      // (Hypothetical) Issue 2: NO `as_variable()` to wrap result as variable. M1.x will fail if is ever used to check this result.

  x. Somewhere in JIT it expects `CHECK(is_variable())` - not sure before/after M1.a / M1.b.

M2. module::forward() // PR #26477 wraps this call with `AutoNonVariableTypeMode` guard.
  a. JIT interpreter calls some ATen methods (via JIT registry).
    a.1 same as M1.b.1, calls into register_aten_ops_xxx.cpp.
    a.2 same as M1.b.2, calls ATen implementation directly.
      // `CHECK(!is_variable())` in ATen won't fail thanks to the outer scope `AutoNonVariableTypeMode` guard.

  x. Same as above, seems JIT never expects `CHECK(is_variable())` during the entire `forward()` call.
```

- Wrong solution: if we wrap M1 with `AutoNonVariableTypeMode`, it will solve `Issue 1` for some models but will fail M1.x for some other models.

- Proposed solution:
I feel the root cause is that mobile build doesn't have `VariableType` as a barrier sitting between JIT and ATen to convert between is_variable() and !is_variable().

Without `VariableType` the best alternative place to put a barrier is M1.b.2 as Edward did in #26908.

For some reason we also need toggle variable state for c10 ops: this is what this PR does. We haven't figured how non-mobile build works without this logic so it's kinda bandaid for now.

This PR doesn't try to address (Hypothetical) Issue 2 as I haven't seen it. PR #26477 can be replaced by #26908 + this PR but we can keep it until M2.x is no longer true.

- Ultimate solution:
After Variable and Tensor are completely merged: #23032 then is_variable() checks can be changed to requires_grad() checks and all problems will be solved. We can clean up these hacks by then.

- References:
* Effect of `AutoNonVariableTypeMode`: all `is_variable()` inside current thread scope returns false:
  https://github.com/pytorch/pytorch/blob/master/c10/core/TensorImpl.h#L811

* Effect of `as_variable`: https://github.com/pytorch/pytorch/blob/master/torch/csrc/autograd/VariableTypeUtils.h#L159
  It calls `make_variable`: https://github.com/pytorch/pytorch/blob/master/torch/csrc/autograd/variable.h#L539

Test Plan: - Load and run MobileNetV2 fp32 & int8 models.

Differential Revision: D17595179

Pulled By: ljk53

fbshipit-source-id: ed417ba6b696d722ea04fe18adf6b38ababa6b7c
@facebook-github-bot facebook-github-bot deleted the gh/ljk53/50/head branch October 28, 2019 22:16
pdlive215 pushed a commit to pdlive215/pytorch that referenced this pull request Nov 27, 2019
Summary:
- This PR together with pytorch#26908 attempt to address issue pytorch#26764 (`Issue 1` mentioned below).

- Current flow without USE_STATIC_DISPATCH (for server build):
```
S1. jit::load()
  a. JIT calls variable_factories.h methods to instantiate variable instances.

  b. JIT calls some ATen methods during intitalization, e.g.: conv_prepack, q_scale.
    b.1 First runs corresponding `Operation` in generated register_aten_ops_xxx.cpp, which calls `at::` functions, then calls ATen dispatcher.
    b.2 ATen dispatcher dispatches to corresponding VariableType methods.
    b.3 VariableType method uses `AutoNonVariableTypeMode` guard before calling into ATen implementation, as ATen generally expects `CHECK(!is_variable())`.
    b.4 VariableType method uses `as_variable` to wrap the results.

  x. Somewhere in JIT it expects `CHECK(is_variable())` - not sure before/after S1.a / S1.b.

S2. module::forward()
  a. JIT interpreter calls some ATen methods (via JIT registry).
    a.1 - a.4: same as S1.b.1 - S1.b.4.

  x. Different from S1.x, seems JIT doesn't expect `CHECK(is_variable())` during the entire `forward()` call.
```

- Current flow with USE_STATIC_DISPATCH (for mobile build):
```
M1. jit::load()
  a. JIT calls variable_factories.h methods to instantiate variable instances.

  b. JIT calls some ATen methods during intitalization, e.g.: conv_prepack, q_scale.
    b.1 First runs corresponding `Operation` in generated register_aten_ops_xxx.cpp, which calls `at::` functions, then calls ATen dispatcher.
    b.2 ATen dispatcher dispatches to corresponding ATen implementation directly.
      // Issue 1: NO VariableType methods / `AutoNonVariableTypeMode` so `CHECK(!is_variable())` in ATen will fail!
      // (Hypothetical) Issue 2: NO `as_variable()` to wrap result as variable. M1.x will fail if is ever used to check this result.

  x. Somewhere in JIT it expects `CHECK(is_variable())` - not sure before/after M1.a / M1.b.

M2. module::forward() // PR pytorch#26477 wraps this call with `AutoNonVariableTypeMode` guard.
  a. JIT interpreter calls some ATen methods (via JIT registry).
    a.1 same as M1.b.1, calls into register_aten_ops_xxx.cpp.
    a.2 same as M1.b.2, calls ATen implementation directly.
      // `CHECK(!is_variable())` in ATen won't fail thanks to the outer scope `AutoNonVariableTypeMode` guard.

  x. Same as above, seems JIT never expects `CHECK(is_variable())` during the entire `forward()` call.
```

- Wrong solution: if we wrap M1 with `AutoNonVariableTypeMode`, it will solve `Issue 1` for some models but will fail M1.x for some other models.

- Proposed solution:
I feel the root cause is that mobile build doesn't have `VariableType` as a barrier sitting between JIT and ATen to convert between is_variable() and !is_variable().

Without `VariableType` the best alternative place to put a barrier is M1.b.2 as Edward did in pytorch#26908.

For some reason we also need toggle variable state for c10 ops: this is what this PR does. We haven't figured how non-mobile build works without this logic so it's kinda bandaid for now.

This PR doesn't try to address (Hypothetical) Issue 2 as I haven't seen it. PR pytorch#26477 can be replaced by pytorch#26908 + this PR but we can keep it until M2.x is no longer true.

- Ultimate solution:
After Variable and Tensor are completely merged: pytorch#23032 then is_variable() checks can be changed to requires_grad() checks and all problems will be solved. We can clean up these hacks by then.

- References:
* Effect of `AutoNonVariableTypeMode`: all `is_variable()` inside current thread scope returns false:
  https://github.com/pytorch/pytorch/blob/master/c10/core/TensorImpl.h#L811

* Effect of `as_variable`: https://github.com/pytorch/pytorch/blob/master/torch/csrc/autograd/VariableTypeUtils.h#L159
  It calls `make_variable`: https://github.com/pytorch/pytorch/blob/master/torch/csrc/autograd/variable.h#L539

Test Plan: - Load and run MobileNetV2 fp32 & int8 models.

Differential Revision: D17595179

Pulled By: ljk53

fbshipit-source-id: ed417ba6b696d722ea04fe18adf6b38ababa6b7c
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Merged module: android Related to Android support oncall: jit Add this issue/PR to JIT oncall triage queue
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants