
Add ROCm5.2.3/AMDGPU support for PyTorch #2


Closed · wants to merge 1,008 commits

Conversation

@WBobby WBobby (Owner) commented Aug 17, 2022

Fixes #ISSUE_NUMBER

bigfootjon and others added 30 commits April 4, 2022 22:28
Summary:
X-link: pytorch/pytorch-canary#82

This will allow us to enable co-development merges between phabricator and GitHub

Pull Request resolved: pytorch#75226

Reviewed By: malfet, seemethere

Differential Revision: D35375458

Pulled By: bigfootjon

fbshipit-source-id: e25f35e02b404850132c3972744202d27a18d8aa
(cherry picked from commit 957c313)
Prerequisite change for enabling `-Werror=sign-compare` across PyTorch repo

Pull Request resolved: pytorch#75081

Approved by: https://github.com/atalman
Prerequisite change for enabling `-Werror=sign-compare` across PyTorch repo

Pull Request resolved: pytorch#75080

Approved by: https://github.com/atalman
We were noticing longer-than-average queuing times for linux.2xlarge and
also found out that we're hitting our max limit more often than not, so
this bumps the limit to 750 to give us more capacity to play around with.

Signed-off-by: Eli Uriegas <[email protected]>

<details>
<summary> Number of times we've hit this in the last week </summary>

![Screen Shot 2022-04-04 at 3 44 30 PM](https://user-images.githubusercontent.com/1700823/161644454-eda8d3af-2e62-4e66-aea3-13ec37a41d7d.png)

Query: https://fburl.com/6cst46y0

</details>
Pull Request resolved: pytorch#75234
Approved by: https://github.com/kit1980, https://github.com/osalpekar, https://github.com/malfet
…positional args (pytorch#75146)

Summary:
Pull Request resolved: pytorch#75146

Previously we assumed `to` must be called with positional args, but this may not be the case;
e.g., we can do `to(dtype=?)` or `to(memory_format=?)`
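For illustration, a minimal sketch of the two call styles the matcher now has to cover (plain eager tensors here; the FX graph-mode handling itself is not shown):

```
import torch

x = torch.randn(2, 3)

# Positional form, which the old matching code assumed:
y_pos = x.to(torch.float16)

# Keyword forms that also need to be recognized:
y_dtype = x.to(dtype=torch.float16)
y_mem = x.to(memory_format=torch.contiguous_format)
```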

Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps

Imported from OSS

Reviewed By: ejguan

Differential Revision: D35342088

fbshipit-source-id: 22bfe78ae84e74141ae6560285c5c38bc068c999
(cherry picked from commit a3593c0)
Signed-off-by: Edward Z. Yang <ezyangfb.com>

Pull Request resolved: pytorch#75187

Approved by: https://github.com/zou3519
Prerequisite change for enabling `-Werror=sign-compare` across PyTorch repo

Pull Request resolved: pytorch#75083

Approved by: https://github.com/ngimel
Prerequisite change for enabling `-Werror=sign-compare` across PyTorch repo

Pull Request resolved: pytorch#75079

Approved by: https://github.com/albanD
Prerequisite change for enabling `-Werror=sign-compare` across PyTorch repo

Pull Request resolved: pytorch#75082

Approved by: https://github.com/ngimel
Prerequisite change for enabling `-Werror=sign-compare` across PyTorch repo

Pull Request resolved: pytorch#75084

Approved by: https://github.com/ngimel
Summary:
Pull Request resolved: pytorch#75237

Applies the 'OVRSOURCE' logic to one more place missed in D35331263 (pytorch@8b7e2bf) so that the lazy TS backend is not compiled in internal builds

Test Plan: CI

Reviewed By: malfet, shunting314

Differential Revision: D35377758

fbshipit-source-id: 5dcd3d36e50a8917470a917f2120353972dc31ba
(cherry picked from commit 8b8ed7b)
Summary:
Pull Request resolved: pytorch#74845

This PR adds support for the quantization flow to detect
parametrized modules and match them using their original module types.
This mainly involved using the new type_before_parametrizations function rather than
type to check for module matching
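A minimal sketch of why plain `type()` is insufficient once a module is parametrized, assuming `type_before_parametrizations` lives in `torch.nn.utils.parametrize` (the commit message only names the function):

```
import torch.nn as nn
from torch.nn.utils import parametrize

class Identity(nn.Module):
    def forward(self, x):
        return x

linear = nn.Linear(4, 4)
parametrize.register_parametrization(linear, "weight", Identity())

# type() now reports a generated class (e.g. ParametrizedLinear), so matching
# against nn.Linear by type() alone would fail:
print(type(linear).__name__)

# type_before_parametrizations recovers the original type used for matching:
print(parametrize.type_before_parametrizations(linear))
```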

Test Plan:
python test/test_ao_sparsity.py TestComposability

Imported from OSS

Reviewed By: jerryzh168

Differential Revision: D35240274

fbshipit-source-id: 7294d89c9c2e069e51d8b9bafa45c15f92bed124
(cherry picked from commit ed5cdb7)
Summary:
Pull Request resolved: pytorch#74560

This PR adds support for quantized tensors with "unknown quantizer",
which means that we can use standard APIs like torch.empty to allocate
quantized tensors, with the understanding that we will set the
quantizer later.  This makes meta functions applicable to quantized
tensors (they will allocate with unknown quantizer and the kernel
will set the quantizer later) and fixes a bug David Dang reported
where structured kernels give a weird error message when you call them
with quantized inputs.

This is not a complete support for quantized structured kernels because
I haven't actually tried porting any of the quantized implementations
to structured; qadd is probably a good choice to try first as it
does its broadcasting implementation using TensorIterator.  My goal
here is just to show that the error message is better.

See also pytorch#52680
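As a rough sketch of the behavior described above (not a guaranteed public API; the exact factory behavior for quantized dtypes may differ by version):

```
import torch

# Today, allocating a quantized tensor goes through a quantizer-aware factory:
q = torch._empty_affine_quantized((2, 3), scale=0.1, zero_point=0, dtype=torch.qint8)

# With "unknown quantizer" support, the commit describes torch.empty being able to
# allocate a quantized tensor whose quantizer is attached later by the kernel:
x = torch.empty(2, 3, dtype=torch.qint8)
print(x.is_quantized)  # expected: True
```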

Signed-off-by: Edward Z. Yang <ezyangfb.com>

Test Plan: Imported from OSS

Reviewed By: mruberry

Differential Revision: D35317441

Pulled By: dzdang

fbshipit-source-id: ffb85b0e06ccbcc2b01052ca6760517684048b39
(cherry picked from commit 2a54b8b)
…ytorch#74878)

Summary:
Pull Request resolved: pytorch#74878

Previously we recorded the matched node as a list of nodes (`List[Node]`), but this does not generalize
to a graph, which is needed for future use cases. In this PR we change the recorded node to a
NodePattern instead, currently defined as
```
NodePattern = Union[Tuple[Node, Node], Tuple[Node, Tuple[Node, Node]], Any]
```
but can be more general.

This will allow us to support more general patterns with the backend_config_dict API, and is also needed
for the BinaryOpQuantizeHandler refactor
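For illustration only, a sketch of the shapes a `NodePattern` can take (the node names in the comments are hypothetical; in the real matcher they are `torch.fx.Node` objects produced by tracing):

```
from typing import Any, Tuple, Union

from torch.fx import Node

NodePattern = Union[Tuple[Node, Node], Tuple[Node, Tuple[Node, Node]], Any]

# A matched conv + bn fusion could be recorded as:       (bn_node, conv_node)
# A matched conv + bn + relu fusion could be recorded as: (relu_node, (bn_node, conv_node))
# The trailing Any leaves room for deeper nesting in future patterns.
```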

Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D35203616

fbshipit-source-id: f4bf5b056cfc0955455eea9c2bf1ac9f6dde3974
(cherry picked from commit b290c04)
…on)"

Summary:
Original commit changeset: 426a07808035

Original Phabricator Diff: D34943147 (pytorch@8d7242a)

Since D34943147 (pytorch@8d7242a) landed, Adfinder push candidates have consistently shown push-blocking red counters for getAds C CPU main thread and getAds NC CPU main thread.

The AF auto prod canary for D34943147 (pytorch@8d7242a), c1-c2, does show a 1.19% regression for counter 'getAds C CPU main thread' and a ~1% regression for counter 'getAds C CPU main thread': https://www.internalfb.com/intern/experiment_store/experiment/27487791896054/#commit1-commit2
To help unblock the adfinder push, this reverts D34943147 (pytorch@8d7242a)

Test Plan:
Canary: https://our.intern.facebook.com/intern/ads/canary/442677925633895915
Canary completed: https://www.internalfb.com/intern/experiment_store/experiment/25288768753864/#commit1-commit2
Counter 'getAds C CPU main thread' moves in the opposite direction by -0.75.

Differential Revision: D35370901

fbshipit-source-id: b2e89f5976eb3fa2c2b22f120c0e32e380f5bc52
(cherry picked from commit 1eb14fe)
As pointed out in pytorch#71205, `torch.hub.load` assumes that the user trusts the repo from which the code is gathered and executed. We propose a solution to make sure that the user is aware of the security threat that this can represent.

**Solution**: Adds a `trust_repo` parameter to the `load`, `list` and `help` functions in torch.hub.
For now, the default `trust_repo=None` warns that, in the future, the user will need to explicitly authorize every repo before downloading it.
Once the repo has been trusted (via `trust_repo=True` or via a command prompt input) it will be added to the list of trusted repositories.
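A short usage sketch of the new parameter (the hub repo and entrypoint below are just examples):

```
import torch

# Explicitly trust the repo: it is added to the trusted list and later calls
# will not prompt for it again.
model = torch.hub.load("pytorch/vision", "resnet18", trust_repo=True)

# list() and help() accept the same parameter.
entrypoints = torch.hub.list("pytorch/vision", trust_repo=True)
print(torch.hub.help("pytorch/vision", "resnet18", trust_repo=True))
```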

Pull Request resolved: pytorch#72060
Approved by: https://github.com/NicolasHug
Summary:
Pull Request resolved: pytorch#75244

Original commit changeset: d653a5af662a

Original Phabricator Diff: D35060736 (pytorch@d9d3492)

Test Plan: Model loading test, verified that D35060736 (pytorch@d9d3492) will cause the torch::save => torch::load failure.

Reviewed By: yinghai, jianyuh

Differential Revision: D35387009

fbshipit-source-id: 9d176992d402d57779e2af3d905b3c1538335298
(cherry picked from commit 6c8cc0d)
When start_val == 0, the comparison `start_val > self[dim]` can be folded easily (0 is never strictly greater than the result of `self[dim]`), but `start_val >= self[dim]` cannot. Since we assign `start_val = self[dim]` in the body anyway, the two are equivalent.
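A plain-Python sketch of the clamping logic in question (simplified; the actual change is in the shape-computation code):

```
def clamp_start(start_val, dim_size):
    # Old guard: start_val >= dim_size; new guard: start_val > dim_size.
    # When start_val == dim_size both versions end up with dim_size (the body
    # assigns it anyway), so they are equivalent; but with start_val == 0 the
    # strict form `0 > dim_size` is trivially false and can be folded away.
    if start_val > dim_size:
        start_val = dim_size
    return start_val
```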

Pull Request resolved: pytorch#74980
Approved by: https://github.com/eellison
It caused a number of internal only compilation failures, for example
see:
pytorch#74425 (comment)
and pytorch#74542 (comment)

Pull Request resolved: pytorch#75085

Approved by: https://github.com/ngimel, https://github.com/albanD
Summary:
Pull Request resolved: pytorch#74946

Warn instead of hard-failing when we fail to clone state_dict, as this
param might not be managed by FSDP and thus we do not expect to clone it.
ghstack-source-id: 152978204
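A hedged sketch of the warn-instead-of-raise pattern the summary describes (the helper name and exact exception type are illustrative, not the actual FSDP internals):

```
import warnings

import torch

def _clone_or_warn(name, tensor):
    # Sketch: return a clone when possible; otherwise warn and fall back to the
    # original tensor, since the param may not be managed by FSDP.
    try:
        return tensor.clone()
    except RuntimeError as err:
        warnings.warn(
            f"Failed to clone state_dict entry {name!r}; it may not be managed "
            f"by FSDP, returning it as-is ({err})"
        )
        return tensor
```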

Test Plan: CI

Reviewed By: mrshenli

Differential Revision: D35242306

fbshipit-source-id: d9eb58a2993341040e4a9f36fa388f423bd2ddc5
(cherry picked from commit 6b0d080)
Hoping to fix regression from https://hud.pytorch.org/minihud#1bcae0d10e1c4eddf07f9e60ced9b4f3c2c04b1f

Adding quantized::softmax to list until 4/15/22.
Pull Request resolved: pytorch#75254
Approved by: https://github.com/albanD
Summary: Pull Request resolved: pytorch#75243

Test Plan: Imported from OSS

Reviewed By: rohan-varma

Differential Revision: D35384883

Pulled By: awgu

fbshipit-source-id: 8dfc12035b79861df093d5921ed7b36050c9f3a0
(cherry picked from commit 6991467)
Updates our S3 actions for uploading and downloading artifacts to versions
that include runAttempt in the artifact prefix. This change is mostly
so that subsequent re-runs of a workflow do not attempt to
grab artifacts from previous runs

Coincides with:
* seemethere/upload-artifact-s3#4
* seemethere/download-artifact-s3#1

Signed-off-by: Eli Uriegas <[email protected]>

Pull Request resolved: pytorch#74576
Approved by: https://github.com/malfet, https://github.com/janeyx99
pruthvistony and others added 22 commits April 22, 2022 09:27
Skipped the failing tests on ROCm during IFU-master-2022-04-11
As per pytorch#74995, the tests
need to be skipped for an odd WORLD_SIZE

Signed-off-by: Jagadish Krishnamoorthy <[email protected]>

Fixes pytorch#74995

Pull Request resolved: pytorch#76136
Approved by: https://github.com/kumpera, https://github.com/wayi1
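A minimal sketch of skipping a test for an odd WORLD_SIZE (the environment variable and the unittest-style decorator here are illustrative; the distributed test suite has its own helpers):

```
import os
import unittest

WORLD_SIZE = int(os.environ.get("WORLD_SIZE", "1"))

class DistributedCheckpointTest(unittest.TestCase):
    @unittest.skipIf(WORLD_SIZE % 2 == 1, "skipped for odd WORLD_SIZE (see pytorch#74995)")
    def test_requires_even_world_size(self):
        self.assertEqual(WORLD_SIZE % 2, 0)

if __name__ == "__main__":
    unittest.main()
```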
SortImpl.cu needs to include <thrust/execution_policy.h> for
thrust::host.  Depending on the nvidia/thrust or rocThrust version,
transitive inclusion of this header is not guaranteed.
Change the rtol level

Signed-off-by: Jagadish Krishnamoorthy <[email protected]>
…el_test

[ROCm] Disable TestDataParallelDeviceType tests
To protect CI from sudden version updates that are not compatible with other packages

Fixes pytorch#78362

Pull Request resolved: pytorch#78369
Approved by: https://github.com/suo, https://github.com/atalman
Signed-off-by: Edward Z. Yang <ezyangfb.com>

Pull Request resolved: pytorch#78804

Approved by: https://github.com/janeyx99
Increase system memory requirement for TestShapeOpsCUDA.test_flip_large_tensor_cuda

Signed-off-by: Jagadish Krishnamoorthy <[email protected]>
* Fix baseurl link in CentOS for ROCm5.2

* Add ROCm5.2.1/AMDGPU support

Co-authored-by: Wang, Yanyao <[email protected]>
@WBobby WBobby closed this Aug 17, 2022
@WBobby WBobby (Owner, Author) commented Aug 17, 2022

No use for it.

WBobby pushed a commit that referenced this pull request Aug 18, 2022
…78136) (pytorch#78204)

This prevents `import torch` from accidentally crashing on machines with no Metal devices

Should prevent crashes reported in pytorch#77662 (comment) and https://github.com/pytorch/functorch/runs/6560056366?check_suite_focus=true

Backtrace to the crash:
```
(lldb) bt
* thread #1, stop reason = signal SIGSTOP
  * frame #0: 0x00007fff7202be57 libobjc.A.dylib`objc_msgSend + 23
    frame #1: 0x000000010fd9f524 libtorch_cpu.dylib`at::mps::HeapAllocator::MPSHeapAllocatorImpl::MPSHeapAllocatorImpl() + 436
    frame #2: 0x000000010fda011d libtorch_cpu.dylib`_GLOBAL__sub_I_MPSAllocator.mm + 125
    frame #3: 0x000000010ada81e3 dyld`ImageLoaderMachO::doModInitFunctions(ImageLoader::LinkContext const&) + 535
    frame #4: 0x000000010ada85ee dyld`ImageLoaderMachO::doInitialization(ImageLoader::LinkContext const&) + 40(lldb) up
frame #1: 0x000000010fd9f524 libtorch_cpu.dylib`at::mps::HeapAllocator::MPSHeapAllocatorImpl::MPSHeapAllocatorImpl() + 436
libtorch_cpu.dylib`at::mps::HeapAllocator::MPSHeapAllocatorImpl::MPSHeapAllocatorImpl:
->  0x10fd9f524 <+436>: movq   %rax, 0x1b0(%rbx)
    0x10fd9f52b <+443>: movw   $0x0, 0x1b8(%rbx)
    0x10fd9f534 <+452>: addq   $0x8, %rsp
    0x10fd9f538 <+456>: popq   %rbx
(lldb) disassemble
 ...
    0x10fd9f514 <+420>: movq   0xf19ad15(%rip), %rsi     ; "maxBufferLength"
    0x10fd9f51b <+427>: movq   %r14, %rdi
    0x10fd9f51e <+430>: callq  *0xeaa326c(%rip)          ; (void *)0x00007fff7202be40: objc_msgSend
```

which corresponds to the `[m_device maxBufferLength]` call, where `m_device` is not initialized in
https://github.com/pytorch/pytorch/blob/2ae3c59e4bcb8e6e75b4a942cacc2d338c88e609/aten/src/ATen/mps/MPSAllocator.h#L171

Pull Request resolved: pytorch#78136
Approved by: https://github.com/seemethere

Co-authored-by: Nikita Shulga <[email protected]>
@WBobby WBobby deleted the rocm5.2_internal_testing branch August 18, 2022 14:19
WBobby pushed a commit that referenced this pull request Jan 3, 2023
This makes the ROCm jobs run on master only. We've been battling queue
times for a few months now
(pytorch#73039). So far we have tried
or investigated:
1. Moving distributed builds to master
2. Moving distributed builds to periodic
3. Only running rocm on a specific set of paths
4. Running multiple jobs on a single rocm host.

Unfortunately, we haven't been able to reduce queuing times to good
levels. As a result, ROCm jobs are the "weightiest" job in PR CI, with
an average TTS of 3.3h (see https://hud.pytorch.org/metrics, panel name
"Job time-to-signal, all branches").

There are two things we haven't tried so far:
1. Running "smoke tests" only on PR
2. Switching rocm builds to master

Since #2 is the easiest, let's give it a try. For now, the policy would be
the same as what we do for other capacity-constrained configurations
(Win and Mac): run on master only, but revert if a breakage is
introduced.

[skip ci]

Pull Request resolved: pytorch#77989

Approved by: https://github.com/malfet, https://github.com/janeyx99
WBobby pushed a commit that referenced this pull request Jan 3, 2023
…78136)

This prevents `import torch` from accidentally crashing on machines with no Metal devices

Should prevent crashes reported in pytorch#77662 (comment) and https://github.com/pytorch/functorch/runs/6560056366?check_suite_focus=true

Backtrace to the crash:
```
(lldb) bt
* thread #1, stop reason = signal SIGSTOP
  * frame #0: 0x00007fff7202be57 libobjc.A.dylib`objc_msgSend + 23
    frame #1: 0x000000010fd9f524 libtorch_cpu.dylib`at::mps::HeapAllocator::MPSHeapAllocatorImpl::MPSHeapAllocatorImpl() + 436
    frame #2: 0x000000010fda011d libtorch_cpu.dylib`_GLOBAL__sub_I_MPSAllocator.mm + 125
    frame #3: 0x000000010ada81e3 dyld`ImageLoaderMachO::doModInitFunctions(ImageLoader::LinkContext const&) + 535
    frame #4: 0x000000010ada85ee dyld`ImageLoaderMachO::doInitialization(ImageLoader::LinkContext const&) + 40(lldb) up
frame #1: 0x000000010fd9f524 libtorch_cpu.dylib`at::mps::HeapAllocator::MPSHeapAllocatorImpl::MPSHeapAllocatorImpl() + 436
libtorch_cpu.dylib`at::mps::HeapAllocator::MPSHeapAllocatorImpl::MPSHeapAllocatorImpl:
->  0x10fd9f524 <+436>: movq   %rax, 0x1b0(%rbx)
    0x10fd9f52b <+443>: movw   $0x0, 0x1b8(%rbx)
    0x10fd9f534 <+452>: addq   $0x8, %rsp
    0x10fd9f538 <+456>: popq   %rbx
(lldb) disassemble
 ...
    0x10fd9f514 <+420>: movq   0xf19ad15(%rip), %rsi     ; "maxBufferLength"
    0x10fd9f51b <+427>: movq   %r14, %rdi
    0x10fd9f51e <+430>: callq  *0xeaa326c(%rip)          ; (void *)0x00007fff7202be40: objc_msgSend
```

which corresponds to the `[m_device maxBufferLength]` call, where `m_device` is not initialized in
https://github.com/pytorch/pytorch/blob/2ae3c59e4bcb8e6e75b4a942cacc2d338c88e609/aten/src/ATen/mps/MPSAllocator.h#L171

Pull Request resolved: pytorch#78136
Approved by: https://github.com/seemethere
WBobby pushed a commit that referenced this pull request Jan 3, 2023
… of libtorch_python (pytorch#78028)

Summary:
This moves torch::class_<WorkerInfo> into `rpc_agent.cpp` so it gets registered in libtorch instead of libtorch_python. This is intermediate work toward getting torch::deploy to load an unmodified copy of libtorch. Current RPC is incompatible due to duplicate registrations.

```
unknown file: Failure
C++ exception with description "Exception Caught inside torch::deploy embedded library:
Custom class with name __torch__.torch.classes.dist_rpc.WorkerInfo is already registered. Ensure that registration with torch::class_ is only called once.
Exception raised from registerCustomClass at ../aten/src/ATen/core/custom_class.cpp:61 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x3e (0x7f3bd9adb92e in /home/tristanr/venvs/multipy/lib/python3.8/site-packages/torch/lib/libc10.so)
frame #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::string const&) + 0x5c (0x7f3bd9ab7068 in /home/tristanr/venvs/multipy/lib/python3.8/site-packages/torch/lib/libc10.so)
frame #2: torch::registerCustomClass(std::shared_ptr<c10::ClassType>) + 0x110 (0x7f3bc2258980 in /home/tristanr/venvs/multipy/lib/python3.8/site-packages/torch/lib/libtorch_cpu.so)
frame #3: torch::detail::class_base::class_base(std::string const&, std::string const&, std::string, std::type_info const&, std::type_info const&) + 0x3b9 (0x7f3bc225a419 in /home/tristanr/venvs/multipy/lib/python3.8/site-packages/torch/lib/libtorch_cpu.so)
frame #4: [0x7f3ba45cfea1]
frame #5: <unknown function> + 0x1b5334 (0x5652bdab9334 in ./test_deploy)
frame #6: <unknown function> + 0x1b4f3e (0x5652bdab8f3e in ./test_deploy)
frame #7: <unknown function> + 0x1b519b (0x5652bdab919b in ./test_deploy)
frame #8: loadSearchFile(char const*) + 0x23e (0x7f3ba62f37f8 in /tmp/torch_deploy9ATEFg)
frame #9: deploy_set_self + 0x51 (0x7f3ba62f38f9 in /tmp/torch_deploy9ATEFg)
frame #10: torch::deploy::Interpreter::Interpreter(torch::deploy::InterpreterManager*, std::shared_ptr<torch::deploy::Environment>) + 0x274 (0x5652bdaaa790 in ./test_deploy)
frame #11: void __gnu_cxx::new_allocator<torch::deploy::Interpreter>::construct<torch::deploy::Interpreter, torch::deploy::InterpreterManager*, std::shared_ptr<torch::deploy::Environment>&>(torch::deploy::Interpreter*, torch::deploy::InterpreterManager*&&, std::shared_ptr<torch::deploy::Environment>&) + 0x81 (0x5652bdaaf58b in ./test_deploy)
frame #12: void std::allocator_traits<std::allocator<torch::deploy::Interpreter> >::construct<torch::deploy::Interpreter, torch::deploy::InterpreterManager*, std::shared_ptr<torch::deploy::Environment>&>(std::allocator<torch::deploy::Interpreter>&, torch::deploy::Interpreter*, torch::deploy::InterpreterManager*&&, std::shared_ptr<torch::deploy::Environment>&) + 0x4a (0x5652bdaae320 in ./test_deploy)
frame #13: void std::vector<torch::deploy::Interpreter, std::allocator<torch::deploy::Interpreter> >::_M_realloc_insert<torch::deploy::InterpreterManager*, std::shared_ptr<torch::deploy::Environment>&>(__gnu_cxx::__normal_iterator<torch::deploy::Interpreter*, std::vector<torch::deploy::Interpreter, std::allocator<torch::deploy::Interpreter> > >, torch::deploy::InterpreterManager*&&, std::shared_ptr<torch::deploy::Environment>&) + 0xee (0x5652bdaae4a0 in ./test_deploy)
frame #14: void std::vector<torch::deploy::Interpreter, std::allocator<torch::deploy::Interpreter> >::emplace_back<torch::deploy::InterpreterManager*, std::shared_ptr<torch::deploy::Environment>&>(torch::deploy::InterpreterManager*&&, std::shared_ptr<torch::deploy::Environment>&) + 0xb6 (0x5652bdaad258 in ./test_deploy)
frame #15: torch::deploy::InterpreterManager::InterpreterManager(unsigned long, std::shared_ptr<torch::deploy::Environment>) + 0x123 (0x5652bdaa83b1 in ./test_deploy)
frame #16: TorchpyTest_InitTwice_Test::TestBody() + 0x65 (0x5652bda075a9 in ./test_deploy)
frame #17: void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) + 0x65 (0x5652bda944b7 in ./test_deploy)
frame #18: void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) + 0x5a (0x5652bda8cfe7 in ./test_deploy)
frame #19: testing::Test::Run() + 0x100 (0x5652bda68622 in ./test_deploy)
frame #20: testing::TestInfo::Run() + 0x10f (0x5652bda68fb3 in ./test_deploy)
frame #21: testing::TestSuite::Run() + 0x121 (0x5652bda6980d in ./test_deploy)
frame #22: testing::internal::UnitTestImpl::RunAllTests() + 0x38e (0x5652bda756e6 in ./test_deploy)
frame #23: bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) + 0x65 (0x5652bda9586b in ./test_deploy)
frame #24: bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) + 0x5a (0x5652bda8e0f7 in ./test_deploy)
frame #25: testing::UnitTest::Run() + 0xc9 (0x5652bda73fd1 in ./test_deploy)
frame #26: RUN_ALL_TESTS() + 0x11 (0x5652bda169fa in ./test_deploy)
frame #27: main + 0x27 (0x5652bda10ce2 in ./test_deploy)
frame #28: <unknown function> + 0x2d310 (0x7f3bc0431310 in /usr/lib/libc.so.6)
frame #29: __libc_start_main + 0x81 (0x7f3bc04313c1 in /usr/lib/libc.so.6)
frame #30: _start + 0x25 (0x5652bda063b5 in ./test_deploy)
```

Test Plan: CI

Differential Revision: D36564258

Pull Request resolved: pytorch#78028
Approved by: https://github.com/rohan-varma
WBobby pushed a commit that referenced this pull request Jan 3, 2023
… to conform with non-quantized counterpart filenames

Summary:
Names of analogous files in the quantized directory (previously snake case) were inconsistent with
their non-quantized filename counterparts (Pascal case). This is the first of a series of PRs that changes
all files in the quantized directory (and sub-directories) to Pascal case.

`aten/src/ATen/native/quantized/qconv_unpack.cpp` has not been renamed yet
because (for reasons currently unknown) after making the name change, `import torch` produces the below error (`qlinear_unpack.cpp` renaming also seems to fail some phabricator CI tests for similar reasons). We suspect that these may be undefined errors and will revisit naming these files in a future PR.

```
terminate called after throwing an instance of 'c10::Error'
  what():  Type c10::intrusive_ptr<ConvPackedParamsBase<2> > could not be converted to any of the known types.
Exception raised from operator() at ../aten/src/ATen/core/jit_type.h:1735 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) + 0x55 (0x7f26745c0c65 in /data/users/dzdang/pytorch/torch/lib/libc10.so)
frame #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) + 0xb1 (0x7f26745bdcd1 in /data/users/dzdang/pytorch/torch/lib/libc10.so)
frame #2: <unknown function> + 0x1494e24 (0x7f2663b14e24 in /data/users/dzdang/pytorch/torch/lib/libtorch_cpu.so)
frame #3: <unknown function> + 0xfed0bc (0x7f266366d0bc in /data/users/dzdang/pytorch/torch/lib/libtorch_cpu.so)
frame #4: c10::detail::infer_schema::make_function_schema(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&&, c10::ArrayRef<c10::detail::infer_schema::ArgumentDef>, c10::ArrayRef<c10::detail::infer_schema::ArgumentDef>) + 0x5a (0x7f266366d71a in /data/users/dzdang/pytorch/torch/lib/libtorch_cpu.so)
frame #5: c10::detail::infer_schema::make_function_schema(c10::ArrayRef<c10::detail::infer_schema::ArgumentDef>, c10::ArrayRef<c10::detail::infer_schema::ArgumentDef>) + 0x7b (0x7f266366e06b in /data/users/dzdang/pytorch/torch/lib/libtorch_cpu.so)
frame #6: <unknown function> + 0x1493f32 (0x7f2663b13f32 in /data/users/dzdang/pytorch/torch/lib/libtorch_cpu.so)
frame #7: <unknown function> + 0xe227dd (0x7f26634a27dd in /data/users/dzdang/pytorch/torch/lib/libtorch_cpu.so)
frame #8: <unknown function> + 0x14e0a (0x7f268c934e0a in /lib64/ld-linux-x86-64.so.2)
..........................truncated.............
```

Test Plan:
```
python test/test_quantization.py
```

Pull Request resolved: pytorch#77037

Approved by: https://github.com/jerryzh168

Successfully merging this pull request may close these issues.

Debug with ssh instructions incorrect on CONTRIBUTING.md