forked from ROCm/pytorch
Add ROCm5.2.3/AMDGPU support for PyTorch #2
Closed
Conversation
Summary: X-link: pytorch/pytorch-canary#82 This will allow us to enable co-development merges between phabricator and GitHub Pull Request resolved: pytorch#75226 Reviewed By: malfet, seemethere Differential Revision: D35375458 Pulled By: bigfootjon fbshipit-source-id: e25f35e02b404850132c3972744202d27a18d8aa (cherry picked from commit 957c313)
Prerequisite change for enabling `-Werror=sign-compare` across PyTorch repo Pull Request resolved: pytorch#75081 Approved by: https://github.com/atalman
Prerequisite change for enabling `-Werror=sign-compare` across PyTorch repo Pull Request resolved: pytorch#75080 Approved by: https://github.com/atalman
Was noticing longer-than-average queuing times for linux.2xlarge and also found out that we're hitting our max limit more often than not, so bumping this to 750 to give us more capacity to play around with. Signed-off-by: Eli Uriegas <[email protected]> Number of times we've hit this in the last week: https://fburl.com/6cst46y0 Pull Request resolved: pytorch#75234 Approved by: https://github.com/kit1980, https://github.com/osalpekar, https://github.com/malfet
Fixes #ISSUE_NUMBER Pull Request resolved: pytorch#75229 Approved by: https://github.com/seemethere, https://github.com/bigfootjon
…positional args (pytorch#75146) Summary: Pull Request resolved: pytorch#75146 Previously we assumed `to` must be called with positional args, but this may not be the case; e.g., we can do `to(dtype=?)` or `to(memory_format=?)`. Test Plan: python test/test_quantization.py TestQuantizeFx python test/test_quantization.py TestQuantizeFxOps Imported from OSS Reviewed By: ejguan Differential Revision: D35342088 fbshipit-source-id: 22bfe78ae84e74141ae6560285c5c38bc068c999 (cherry picked from commit a3593c0)
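For reference, a minimal eager-mode illustration of the call patterns involved (the fix itself lives in the FX lowering pass, not in eager `Tensor.to`):
```
import torch

x = torch.randn(2, 2)

# Positional form -- the only case the pass previously assumed:
y1 = x.to(torch.float16)

# Keyword forms -- the cases this fix adds handling for:
y2 = x.to(dtype=torch.float16)
y3 = x.to(memory_format=torch.contiguous_format)
```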
Signed-off-by: Edward Z. Yang <ezyangfb.com> Pull Request resolved: pytorch#75187 Approved by: https://github.com/zou3519
Prerequisite change for enabling `-Werror=sign-compare` across PyTorch repo Pull Request resolved: pytorch#75083 Approved by: https://github.com/ngimel
Prerequisite change for enabling `-Werror=sign-compare` across PyTorch repo Pull Request resolved: pytorch#75079 Approved by: https://github.com/albanD
Prerequisite change for enabling `-Werror=sign-compare` across PyTorch repo Pull Request resolved: pytorch#75082 Approved by: https://github.com/ngimel
Prerequisite change for enabling `-Werror=sign-compare` across PyTorch repo Pull Request resolved: pytorch#75084 Approved by: https://github.com/ngimel
Summary: Pull Request resolved: pytorch#75237 applies 'OVRSOURCE' logic to one more place missed in D35331263 (pytorch@8b7e2bf) so that lazy TS backend is not compiled in internal builds Test Plan: CI Reviewed By: malfet, shunting314 Differential Revision: D35377758 fbshipit-source-id: 5dcd3d36e50a8917470a917f2120353972dc31ba (cherry picked from commit 8b8ed7b)
Pull Request resolved: pytorch#75214 Approved by: https://github.com/albanD
Summary: Pull Request resolved: pytorch#74845 This PR adds support for the quantization flow to detect parametrized modules and match them using their original module types. This mainly involved using the new type_before_parametrizations function rather than type to check for module matching. Test Plan: python test/test_ao_sparsity.py TestComposability Imported from OSS Reviewed By: jerryzh168 Differential Revision: D35240274 fbshipit-source-id: 7294d89c9c2e069e51d8b9bafa45c15f92bed124 (cherry picked from commit ed5cdb7)
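A minimal sketch of why the original type is needed, assuming type_before_parametrizations is exposed from torch.nn.utils.parametrize as in recent PyTorch:
```
import torch.nn as nn
from torch.nn.utils import parametrize

class Double(nn.Module):
    def forward(self, w):
        return 2.0 * w

lin = nn.Linear(4, 4)
parametrize.register_parametrization(lin, "weight", Double())

# Parametrizing swaps in a generated subclass, so a plain type() check
# no longer matches nn.Linear directly...
print(type(lin).__name__)  # e.g. 'ParametrizedLinear'

# ...which is why matching falls back to the original type:
print(parametrize.type_before_parametrizations(lin))  # <class '...Linear'>
```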
Summary: Pull Request resolved: pytorch#74560 This PR add support for quantized tensors with "unknown quantizer", which means that we can use standard APIs like torch.empty to allocate quantized tensors, with the understanding that we will set the quantizer later. This makes meta functions applicable to quantized tensors (they will allocate with unknown quantizer and the kernel will set the quantizer later) and fixes a bug David Dang reported where structured kernels give a weird error message when you call them with quantized inputs. This is not a complete support for quantized structured kernels because I haven't actually tried porting any of the quantized implementations to structured; qadd is probably a good choice to try first as it does its broadcasting implementation using TensorIterator. My goal here is just to show that the error message is better. See also pytorch#52680 Signed-off-by: Edward Z. Yang <ezyangfb.com> Test Plan: Imported from OSS Reviewed By: mruberry Differential Revision: D35317441 Pulled By: dzdang fbshipit-source-id: ffb85b0e06ccbcc2b01052ca6760517684048b39 (cherry picked from commit 2a54b8b)
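A hedged sketch of the allocation paths described in the summary above (the `torch.empty` path assumes a build that includes pytorch#74560):
```
import torch

# Per this change, a standard factory can allocate a quantized tensor
# with an "unknown quantizer" that the kernel attaches later
# (assumes a build that includes pytorch#74560):
q = torch.empty(2, 2, dtype=torch.qint8)
print(q.is_quantized)  # True

# The usual path, where the quantizer is known at allocation time:
qt = torch.quantize_per_tensor(
    torch.randn(2, 2), scale=0.1, zero_point=0, dtype=torch.qint8
)
```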
…ytorch#74878) Summary: Pull Request resolved: pytorch#74878 Previously we recorded the matched node as a list of nodes (`List[Node]`); this does not generalize to a graph, which is needed for future use cases. In this PR we changed the recorded node to a NodePattern instead, currently defined as
```
NodePattern = Union[Tuple[Node, Node], Tuple[Node, Tuple[Node, Node]], Any]
```
but it can be more general. This will allow us to support more general patterns with the backend_config_dict api, and is also needed for the BinaryOpQuantizeHandler refactor. Test Plan: python test/test_quantization.py TestQuantizeFx python test/test_quantization.py TestQuantizeFxOps Imported from OSS Reviewed By: vkuzo Differential Revision: D35203616 fbshipit-source-id: f4bf5b056cfc0955455eea9c2bf1ac9f6dde3974 (cherry picked from commit b290c04)
…on)" Summary: Original commit changeset: 426a07808035 Original Phabricator Diff: D34943147 (pytorch@8d7242a) Since D34943147 (pytorch@8d7242a) landed, Adfinder push candidates show consistently push blocking red counters for getAds C CPU main thread and getAds NC CPU main thread. AF auto prod canary for D34943147 (pytorch@8d7242a), c1-c2 does shows 1.19% regression for counter 'getAds C CPU main thread' and ~1% regression for counter 'getAds C CPU main thread': https://www.internalfb.com/intern/experiment_store/experiment/27487791896054/#commit1-commit2 To help unblock adfinder push, reverting D34943147 (pytorch@8d7242a) Test Plan: Canary: https://our.intern.facebook.com/intern/ads/canary/442677925633895915 Canary completed: https://www.internalfb.com/intern/experiment_store/experiment/25288768753864/#commit1-commit2 Counter 'getAds C CPU main thread' moves in the opposite direction by -0.75. Differential Revision: D35370901 fbshipit-source-id: b2e89f5976eb3fa2c2b22f120c0e32e380f5bc52 (cherry picked from commit 1eb14fe)
As pointed out by pytorch#71205, `torch.hub.load` assumes that the user trusts the repo from where the code is gathered and executed. We propose a solution to make sure that the user is aware of the security threat that this can represent. **Solution**: Adds a `trust_repo` parameter to the `load`, `list` and `help` functions in torch.hub. For now, the default `trust_repo=None` warns that, in the future, the user will need to authorize explicitly every repo before downloading it. Once the repo has been trusted (via `trust_repo=True` or via a command prompt input) it will be added to the list of trusted repositories. Pull Request resolved: pytorch#72060 Approved by: https://github.com/NicolasHug
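A short usage sketch of the new parameter (repo and entrypoint names are just illustrative):
```
import torch

# Trust the repo explicitly up front; once trusted (here or via the
# interactive prompt), it is remembered in the trusted-repo list:
model = torch.hub.load("pytorch/vision", "resnet18", trust_repo=True)

# With the default trust_repo=None, list/help/load warn that explicit
# authorization will be required in the future:
entrypoints = torch.hub.list("pytorch/vision", trust_repo=None)
```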
Summary: Pull Request resolved: pytorch#75244 Original commit changeset: d653a5af662a Original Phabricator Diff: D35060736 (pytorch@d9d3492) Test Plan: Model loading test, verified that D35060736 (pytorch@d9d3492) will cause the torch::save => torch::load failure. Reviewed By: yinghai, jianyuh Differential Revision: D35387009 fbshipit-source-id: 9d176992d402d57779e2af3d905b3c1538335298 (cherry picked from commit 6c8cc0d)
When start_val == 0, the comparison `start_val > self[dim]` can be folded easily (0 is never strictly greater than the result of `self[dim]`), but `start_val >= self[dim]` can't. Since we assign `start_val = self[dim]` in the body anyway, both of these are equivalent. Pull Request resolved: pytorch#74980 Approved by: https://github.com/eellison
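A schematic restatement of the folding argument in plain Python (hypothetical helper, not the actual JIT code):
```
def clamp_start(start_val, dim_size):
    # With start_val == 0, the strict guard `0 > dim_size` is a constant
    # False for every valid size (sizes are >= 0), so the branch folds
    # away at compile time, while `0 >= dim_size` still depends on
    # whether the dimension is empty.
    if start_val > dim_size:   # previously: start_val >= dim_size
        start_val = dim_size   # this assignment makes both guards equivalent
    return start_val
```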
It caused a number of internal only compilation failures, for example see: pytorch#74425 (comment) and pytorch#74542 (comment) Pull Request resolved: pytorch#75085 Approved by: https://github.com/ngimel, https://github.com/albanD
Summary: Pull Request resolved: pytorch#74946 Warn instead of hard-failing when we fail to clone the state_dict, as this param might not be managed by FSDP and thus we do not expect to clone it. ghstack-source-id: 152978204 Test Plan: CI Reviewed By: mrshenli Differential Revision: D35242306 fbshipit-source-id: d9eb58a2993341040e4a9f36fa388f423bd2ddc5 (cherry picked from commit 6b0d080)
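A hypothetical sketch of the behavior change (not the actual FSDP code): warn and return the tensor as-is instead of hard-failing when the clone does not succeed.
```
import warnings
import torch

def _clone_for_state_dict(name: str, tensor: torch.Tensor) -> torch.Tensor:
    # The param may simply not be managed by FSDP, so a failed clone
    # should not abort the whole state_dict pass.
    try:
        return tensor.clone()
    except RuntimeError as err:
        warnings.warn(f"Failed to clone() {name}: {err}")
        return tensor
```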
Hoping to fix regression from https://hud.pytorch.org/minihud#1bcae0d10e1c4eddf07f9e60ced9b4f3c2c04b1f Adding quantized::softmax to list until 4/15/22. Pull Request resolved: pytorch#75254 Approved by: https://github.com/albanD
Summary: Pull Request resolved: pytorch#75243 Test Plan: Imported from OSS Reviewed By: rohan-varma Differential Revision: D35384883 Pulled By: awgu fbshipit-source-id: 8dfc12035b79861df093d5921ed7b36050c9f3a0 (cherry picked from commit 6991467)
Pull Request resolved: pytorch#74607 Approved by: https://github.com/cpuhrsch
Fixes pytorch#68621 Pull Request resolved: pytorch#73686 Approved by: https://github.com/IvanYashchuk, https://github.com/malfet
Reference: pytorch#71108 Pull Request resolved: pytorch#75013 Approved by: https://github.com/anjali411
Pull Request resolved: pytorch#75233 Approved by: https://github.com/ezyang, https://github.com/larryliu0820
Updates our s3 actions to upload and download artifacts to versions that include runAttempt in the prefix for the artifact. This change is mostly to make it so that subsequent re-runs of a workflow do not attempt to grab artifacts from previous runs Coincides with: * seemethere/upload-artifact-s3#4 * seemethere/download-artifact-s3#1 Signed-off-by: Eli Uriegas <[email protected]> Pull Request resolved: pytorch#74576 Approved by: https://github.com/malfet, https://github.com/janeyx99
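A hypothetical sketch of the keying scheme described above (names and layout are illustrative, not the action's actual code): including the run attempt in the S3 prefix isolates each re-run's artifacts from earlier attempts.
```
def artifact_prefix(repo: str, run_id: str, run_attempt: str, name: str) -> str:
    # Each re-run gets its own attempt number, so re-runs never read
    # artifacts written by a previous attempt of the same workflow run.
    return f"{repo}/{run_id}/{run_attempt}/artifact/{name}"

print(artifact_prefix("pytorch/pytorch", "12345", "2", "test-reports"))
# pytorch/pytorch/12345/2/artifact/test-reports
```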
Pull Request resolved: pytorch#75212 Approved by: https://github.com/cpuhrsch
…-04-11 IFU-master-2022-04-11
Skipped the failing tests on ROCm during IFU-master-2022-04-11
As per pytorch#74995, the tests need to be skipped for odd WORLD_SIZE Signed-off-by: Jagadish Krishnamoorthy <[email protected]> Fixes pytorch#74995 Pull Request resolved: pytorch#76136 Approved by: https://github.com/kumpera, https://github.com/wayi1
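A minimal sketch of the kind of guard described above (hypothetical test name, not the actual distributed test):
```
import os
import unittest

WORLD_SIZE = int(os.environ.get("WORLD_SIZE", "2"))

@unittest.skipIf(WORLD_SIZE % 2 != 0,
                 "skipped for odd WORLD_SIZE (see pytorch#74995)")
class ShardingTest(unittest.TestCase):
    def test_even_world_size_only(self):
        self.assertEqual(WORLD_SIZE % 2, 0)
```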
SortImpl.cu needs to include <thrust/execution_policy.h> for thrust::host. Depending on the nvidia/thrust or rocThrust version, transitive inclusion of this header is not guaranteed.
Signed-off-by: Jagadish Krishnamoorthy <[email protected]>
Change the rtol level Signed-off-by: Jagadish Krishnamoorthy <[email protected]>
…el_test [ROCm] Disable TestDataParallelDeviceType tests
To protect CI from sudden version updates that are not compatible with other packages Fixes pytorch#78362 Pull Request resolved: pytorch#78369 Approved by: https://github.com/suo, https://github.com/atalman
Co-authored-by: Wang, Yanyao <[email protected]>
Signed-off-by: Edward Z. Yang <ezyangfb.com> Pull Request resolved: pytorch#78804 Approved by: https://github.com/janeyx99
Co-authored-by: Wang, Yanyao <[email protected]>
Increase system memory requirement for TestShapeOpsCUDA.test_flip_large_tensor_cuda Signed-off-by: Jagadish Krishnamoorthy <[email protected]>
…ROCm#1032) Signed-off-by: Jagadish Krishnamoorthy <[email protected]>
* Fix baseurl link in CentOS for ROCm5.2 * Add ROCm5.2.1/AMDGPU support Co-authored-by: Wang, Yanyao <[email protected]>
…m5.2_internal_testing
no use it.
WBobby pushed a commit that referenced this pull request on Aug 18, 2022
…78136) (pytorch#78204)

This prevents `import torch` from accidentally crashing on machines with no metal devices. Should prevent crashes reported in pytorch#77662 (comment) and https://github.com/pytorch/functorch/runs/6560056366?check_suite_focus=true

Backtrace to the crash:
```
(lldb) bt
* thread #1, stop reason = signal SIGSTOP
  * frame #0: 0x00007fff7202be57 libobjc.A.dylib`objc_msgSend + 23
    frame #1: 0x000000010fd9f524 libtorch_cpu.dylib`at::mps::HeapAllocator::MPSHeapAllocatorImpl::MPSHeapAllocatorImpl() + 436
    frame #2: 0x000000010fda011d libtorch_cpu.dylib`_GLOBAL__sub_I_MPSAllocator.mm + 125
    frame #3: 0x000000010ada81e3 dyld`ImageLoaderMachO::doModInitFunctions(ImageLoader::LinkContext const&) + 535
    frame #4: 0x000000010ada85ee dyld`ImageLoaderMachO::doInitialization(ImageLoader::LinkContext const&) + 40
(lldb) up
frame #1: 0x000000010fd9f524 libtorch_cpu.dylib`at::mps::HeapAllocator::MPSHeapAllocatorImpl::MPSHeapAllocatorImpl() + 436
libtorch_cpu.dylib`at::mps::HeapAllocator::MPSHeapAllocatorImpl::MPSHeapAllocatorImpl:
->  0x10fd9f524 <+436>: movq %rax, 0x1b0(%rbx)
    0x10fd9f52b <+443>: movw $0x0, 0x1b8(%rbx)
    0x10fd9f534 <+452>: addq $0x8, %rsp
    0x10fd9f538 <+456>: popq %rbx
(lldb) disassemble
...
    0x10fd9f514 <+420>: movq 0xf19ad15(%rip), %rsi ; "maxBufferLength"
    0x10fd9f51b <+427>: movq %r14, %rdi
    0x10fd9f51e <+430>: callq *0xeaa326c(%rip) ; (void *)0x00007fff7202be40: objc_msgSend
```
which corresponds to the `[m_device maxBufferLength]` call, where `m_device` is not initialized in https://github.com/pytorch/pytorch/blob/2ae3c59e4bcb8e6e75b4a942cacc2d338c88e609/aten/src/ATen/mps/MPSAllocator.h#L171

Pull Request resolved: pytorch#78136
Approved by: https://github.com/seemethere
Co-authored-by: Nikita Shulga <[email protected]>
WBobby pushed a commit that referenced this pull request on Jan 3, 2023
This makes the rocm jobs run on master-only. We've been battling queue times for a few months now (pytorch#73039). So far we have tried or investigated:
1. Moving distributed builds to master
2. Moving distributed builds to periodic
3. Only running rocm on a specific set of paths
4. Running multiple jobs on a single rocm host

Unfortunately, we haven't been able to reduce queuing times to good levels. As a result, ROCm jobs are the "weightiest" job in PR CI, with an average TTS of 3.3h (see https://hud.pytorch.org/metrics, panel name "Job time-to-signal, all branches"). There are two things we haven't tried so far:
1. Running "smoke tests" only on PR
2. Switching rocm builds to master

Since #2 is easiest, let's give it a try. For now, the policy would be the same as what we do for other capacity-constrained configurations (Win and Mac): run on master only, but revert if there is a breakage introduced.

[skip ci]

Pull Request resolved: pytorch#77989
Approved by: https://github.com/malfet, https://github.com/janeyx99
WBobby pushed a commit that referenced this pull request on Jan 3, 2023
…78136)

This prevents `import torch` from accidentally crashing on machines with no metal devices. Should prevent crashes reported in pytorch#77662 (comment) and https://github.com/pytorch/functorch/runs/6560056366?check_suite_focus=true

Backtrace to the crash:
```
(lldb) bt
* thread #1, stop reason = signal SIGSTOP
  * frame #0: 0x00007fff7202be57 libobjc.A.dylib`objc_msgSend + 23
    frame #1: 0x000000010fd9f524 libtorch_cpu.dylib`at::mps::HeapAllocator::MPSHeapAllocatorImpl::MPSHeapAllocatorImpl() + 436
    frame #2: 0x000000010fda011d libtorch_cpu.dylib`_GLOBAL__sub_I_MPSAllocator.mm + 125
    frame #3: 0x000000010ada81e3 dyld`ImageLoaderMachO::doModInitFunctions(ImageLoader::LinkContext const&) + 535
    frame #4: 0x000000010ada85ee dyld`ImageLoaderMachO::doInitialization(ImageLoader::LinkContext const&) + 40
(lldb) up
frame #1: 0x000000010fd9f524 libtorch_cpu.dylib`at::mps::HeapAllocator::MPSHeapAllocatorImpl::MPSHeapAllocatorImpl() + 436
libtorch_cpu.dylib`at::mps::HeapAllocator::MPSHeapAllocatorImpl::MPSHeapAllocatorImpl:
->  0x10fd9f524 <+436>: movq %rax, 0x1b0(%rbx)
    0x10fd9f52b <+443>: movw $0x0, 0x1b8(%rbx)
    0x10fd9f534 <+452>: addq $0x8, %rsp
    0x10fd9f538 <+456>: popq %rbx
(lldb) disassemble
...
    0x10fd9f514 <+420>: movq 0xf19ad15(%rip), %rsi ; "maxBufferLength"
    0x10fd9f51b <+427>: movq %r14, %rdi
    0x10fd9f51e <+430>: callq *0xeaa326c(%rip) ; (void *)0x00007fff7202be40: objc_msgSend
```
which corresponds to the `[m_device maxBufferLength]` call, where `m_device` is not initialized in https://github.com/pytorch/pytorch/blob/2ae3c59e4bcb8e6e75b4a942cacc2d338c88e609/aten/src/ATen/mps/MPSAllocator.h#L171

Pull Request resolved: pytorch#78136
Approved by: https://github.com/seemethere
WBobby pushed a commit that referenced this pull request on Jan 3, 2023
… of libtorch_python (pytorch#78028)

Summary: This moves torch::class_<WorkerInfo> into `rpc_agent.cpp` so it gets registered in libtorch instead of libtorch_python. This is intermediate work toward getting torch::deploy to load an unmodified copy of libtorch. Current RPC is incompatible due to duplicate registrations.
```
unknown file: Failure
C++ exception with description "Exception Caught inside torch::deploy embedded library:
Custom class with name __torch__.torch.classes.dist_rpc.WorkerInfo is already registered. Ensure that registration with torch::class_ is only called once.
Exception raised from registerCustomClass at ../aten/src/ATen/core/custom_class.cpp:61 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x3e (0x7f3bd9adb92e in /home/tristanr/venvs/multipy/lib/python3.8/site-packages/torch/lib/libc10.so)
frame #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::string const&) + 0x5c (0x7f3bd9ab7068 in /home/tristanr/venvs/multipy/lib/python3.8/site-packages/torch/lib/libc10.so)
frame #2: torch::registerCustomClass(std::shared_ptr<c10::ClassType>) + 0x110 (0x7f3bc2258980 in /home/tristanr/venvs/multipy/lib/python3.8/site-packages/torch/lib/libtorch_cpu.so)
frame #3: torch::detail::class_base::class_base(std::string const&, std::string const&, std::string, std::type_info const&, std::type_info const&) + 0x3b9 (0x7f3bc225a419 in /home/tristanr/venvs/multipy/lib/python3.8/site-packages/torch/lib/libtorch_cpu.so)
frame #4: [0x7f3ba45cfea1]
frame #5: <unknown function> + 0x1b5334 (0x5652bdab9334 in ./test_deploy)
frame #6: <unknown function> + 0x1b4f3e (0x5652bdab8f3e in ./test_deploy)
frame #7: <unknown function> + 0x1b519b (0x5652bdab919b in ./test_deploy)
frame #8: loadSearchFile(char const*) + 0x23e (0x7f3ba62f37f8 in /tmp/torch_deploy9ATEFg)
frame #9: deploy_set_self + 0x51 (0x7f3ba62f38f9 in /tmp/torch_deploy9ATEFg)
frame #10: torch::deploy::Interpreter::Interpreter(torch::deploy::InterpreterManager*, std::shared_ptr<torch::deploy::Environment>) + 0x274 (0x5652bdaaa790 in ./test_deploy)
frame #11: void __gnu_cxx::new_allocator<torch::deploy::Interpreter>::construct<torch::deploy::Interpreter, torch::deploy::InterpreterManager*, std::shared_ptr<torch::deploy::Environment>&>(torch::deploy::Interpreter*, torch::deploy::InterpreterManager*&&, std::shared_ptr<torch::deploy::Environment>&) + 0x81 (0x5652bdaaf58b in ./test_deploy)
frame #12: void std::allocator_traits<std::allocator<torch::deploy::Interpreter> >::construct<torch::deploy::Interpreter, torch::deploy::InterpreterManager*, std::shared_ptr<torch::deploy::Environment>&>(std::allocator<torch::deploy::Interpreter>&, torch::deploy::Interpreter*, torch::deploy::InterpreterManager*&&, std::shared_ptr<torch::deploy::Environment>&) + 0x4a (0x5652bdaae320 in ./test_deploy)
frame #13: void std::vector<torch::deploy::Interpreter, std::allocator<torch::deploy::Interpreter> >::_M_realloc_insert<torch::deploy::InterpreterManager*, std::shared_ptr<torch::deploy::Environment>&>(__gnu_cxx::__normal_iterator<torch::deploy::Interpreter*, std::vector<torch::deploy::Interpreter, std::allocator<torch::deploy::Interpreter> > >, torch::deploy::InterpreterManager*&&, std::shared_ptr<torch::deploy::Environment>&) + 0xee (0x5652bdaae4a0 in ./test_deploy)
frame #14: void std::vector<torch::deploy::Interpreter, std::allocator<torch::deploy::Interpreter> >::emplace_back<torch::deploy::InterpreterManager*, std::shared_ptr<torch::deploy::Environment>&>(torch::deploy::InterpreterManager*&&, std::shared_ptr<torch::deploy::Environment>&) + 0xb6 (0x5652bdaad258 in ./test_deploy)
frame #15: torch::deploy::InterpreterManager::InterpreterManager(unsigned long, std::shared_ptr<torch::deploy::Environment>) + 0x123 (0x5652bdaa83b1 in ./test_deploy)
frame #16: TorchpyTest_InitTwice_Test::TestBody() + 0x65 (0x5652bda075a9 in ./test_deploy)
frame #17: void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) + 0x65 (0x5652bda944b7 in ./test_deploy)
frame #18: void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) + 0x5a (0x5652bda8cfe7 in ./test_deploy)
frame #19: testing::Test::Run() + 0x100 (0x5652bda68622 in ./test_deploy)
frame #20: testing::TestInfo::Run() + 0x10f (0x5652bda68fb3 in ./test_deploy)
frame #21: testing::TestSuite::Run() + 0x121 (0x5652bda6980d in ./test_deploy)
frame #22: testing::internal::UnitTestImpl::RunAllTests() + 0x38e (0x5652bda756e6 in ./test_deploy)
frame #23: bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) + 0x65 (0x5652bda9586b in ./test_deploy)
frame #24: bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) + 0x5a (0x5652bda8e0f7 in ./test_deploy)
frame #25: testing::UnitTest::Run() + 0xc9 (0x5652bda73fd1 in ./test_deploy)
frame #26: RUN_ALL_TESTS() + 0x11 (0x5652bda169fa in ./test_deploy)
frame #27: main + 0x27 (0x5652bda10ce2 in ./test_deploy)
frame #28: <unknown function> + 0x2d310 (0x7f3bc0431310 in /usr/lib/libc.so.6)
frame #29: __libc_start_main + 0x81 (0x7f3bc04313c1 in /usr/lib/libc.so.6)
frame #30: _start + 0x25 (0x5652bda063b5 in ./test_deploy)
```
Test Plan: CI
Differential Revision: D36564258
Pull Request resolved: pytorch#78028
Approved by: https://github.com/rohan-varma
WBobby pushed a commit that referenced this pull request on Jan 3, 2023
… to conform with non-quantized counterpart filenames

Summary: Names of analogous files in the quantized directory (previously snake case) were inconsistent with their non-quantized filename counterparts (pascal case). This is the first of a series of PRs that changes all files in the quantized dir (and sub-directories) to have pascal case. `aten/src/ATen/native/quantized/qconv_unpack.cpp` has not been renamed yet because (for reasons currently unknown) after making the name change, `import torch` produces the below error (`qlinear_unpack.cpp` renaming also seems to fail some phabricator CI tests for similar reasons). We suspect that these may be undefined errors and will revisit naming these files in a future PR.
```
terminate called after throwing an instance of 'c10::Error'
  what(): Type c10::intrusive_ptr<ConvPackedParamsBase<2> > could not be converted to any of the known types.
Exception raised from operator() at ../aten/src/ATen/core/jit_type.h:1735 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) + 0x55 (0x7f26745c0c65 in /data/users/dzdang/pytorch/torch/lib/libc10.so)
frame #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) + 0xb1 (0x7f26745bdcd1 in /data/users/dzdang/pytorch/torch/lib/libc10.so)
frame #2: <unknown function> + 0x1494e24 (0x7f2663b14e24 in /data/users/dzdang/pytorch/torch/lib/libtorch_cpu.so)
frame #3: <unknown function> + 0xfed0bc (0x7f266366d0bc in /data/users/dzdang/pytorch/torch/lib/libtorch_cpu.so)
frame #4: c10::detail::infer_schema::make_function_schema(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&&, c10::ArrayRef<c10::detail::infer_schema::ArgumentDef>, c10::ArrayRef<c10::detail::infer_schema::ArgumentDef>) + 0x5a (0x7f266366d71a in /data/users/dzdang/pytorch/torch/lib/libtorch_cpu.so)
frame #5: c10::detail::infer_schema::make_function_schema(c10::ArrayRef<c10::detail::infer_schema::ArgumentDef>, c10::ArrayRef<c10::detail::infer_schema::ArgumentDef>) + 0x7b (0x7f266366e06b in /data/users/dzdang/pytorch/torch/lib/libtorch_cpu.so)
frame #6: <unknown function> + 0x1493f32 (0x7f2663b13f32 in /data/users/dzdang/pytorch/torch/lib/libtorch_cpu.so)
frame #7: <unknown function> + 0xe227dd (0x7f26634a27dd in /data/users/dzdang/pytorch/torch/lib/libtorch_cpu.so)
frame #8: <unknown function> + 0x14e0a (0x7f268c934e0a in /lib64/ld-linux-x86-64.so.2)
..........................truncated.............
```
Test Plan:
```
python test/test_quantization.py
```
Pull Request resolved: pytorch#77037
Approved by: https://github.com/jerryzh168
* [FSDP] Add grad accumulation without `no_sync()` (pytorch/pytorch#73535)
* arc lint --take CLANGFORMAT
* [Static Runtime] Add native op support for `aten::len` (pytorch/pytorch#73899)
* Fix `lkj_cholesky` device error (pytorch/pytorch#73980)
* arc lint --take CLANGFORMAT
* arc lint --take CLANGFORMAT
* arc lint --take CLANGFORMAT
* [quant] Fix implementation for `output_quantized_idxs` in convert (pytorch/pytorch#74140)
* `output_quantized_idxs` in convert
* `torch.nn` importable on Python-3.7.0
* Fix `test_reduce_add_coalesced` failure (pytorch/pytorch#74027)
* [Static Runtime] Fix a bug that `aten::full` reuses a tensor that does not match requested one (pytorch/pytorch#73990)
* Improve numerical stability of `torch.distributions.wishart.Wishart` (pytorch/pytorch#72993)
* Replace `get_all_` type macros with the ATen dispatch macros. (pytorch/pytorch#71561)
* `get_all_` type macros with the ATen dispatch macros.
* [reland][quant] Fix implementation for `output_quantized_idxs` in convert (#74140) (pytorch/pytorch#74229)
* [fix] `torch.amax` and `torch.amin` for empty tensors if dim arg not provided. (pytorch/pytorch#73914)
* [torch::deploy] Remove `c10::errors` from torch::deploy (pytorch/pytorch#74283)
* arc lint --take BLACK
* [FSDP] Override `named_parameters()` for clean names in `summon_full_params()` (pytorch/pytorch#74333)
* [structured kernels] Port `amin` to structured kernels. (pytorch/pytorch#73581)
* `GITHUB_DIR` in path generate_ci_workflow.py
* arc lint --take CLANGFORMAT
* `GitHubPR.get_last_comment`
* only `pickle` and `pickle + flatbuffer` for migration (Extend _save_for_mobile and _load_for_mobile to support flatbuffer format; Default format is pickle, pytorch/pytorch#74209)
* `is_train` flag for onnx pass deduplicate initializers
* only `pickle` and `pickle + flatbuffer` for migration
* Virtualize `<type>Storage` classes (pytorch/pytorch#66970)
* `OpMathType` tensor for intermediate results
* `--force` option
* [SR] Eliminate extra permute ops before `aten::sum` (pytorch/pytorch#74481)
* `Lint` to the list of mandatory checks
* arc lint --take CLANGFORMAT
* only `pickle` and `pickle + flatbuffer` for migration" (Extend _save_for_mobile and _load_for_mobile to work with flatbuffer format, pytorch/pytorch#74594)
* arc lint --take CLANGFORMAT
* `int[]?` arguments to new OptionalIntArrayRef class
* `asarray` docs + add test case.
* `torch.ravel`
* [Profiler] Limit calls to `recordThreadInfo` (pytorch/pytorch#74888)
* arc lint --take CLANGFORMAT
* [Deploy] Change `numModules` type to `unsigned` (pytorch/pytorch#74978)
* Make all `.pyi.in` files exportable from torch/_C/ folder (pytorch/pytorch#74962)
* arc lint --take GOOGLEJAVAFORMAT
* arc lint --take CLANGFORMAT
* Use the same checks in all `grid_sampler` functions (pytorch/pytorch#74635)
* `grid_sampler` functions
* `cholesky_inverse`: complex autograd, forward AD and correct tests.
* `Tensor[]` for structured kernel codegen.
* `grid_sampler` functions
* `c10d/Utils.hpp`
* [quant][fx] Fix lowering pass for cases when `to` is not called with positional args (pytorch/pytorch#75146)
* `-Wsign-compare` to list of clang flags
* [Static Runtime] Fix a bug that `aten::full_like` reuses a tensor that does not match arguments (pytorch/pytorch#74255)
* `__torch_function__` as instance method in C++
* arc lint --take CLANGFORMAT
* `MultiMarginLoss` on CUDA
* arc lint --take CLANGFORMAT
* `log_target` example in kl divergence
* `isIntegral`
* `rank0_only` to `full_optim_state_dict()`
* arc lint --take CLANGFORMAT
* Replace `internal::GRAIN_SIZE` by `grain_size` (parameter). (pytorch/pytorch#53177)
* "[CI] Make `install_user.sh` compatible with Focal (pytorch/pytorch#77622)" commit 6aea0b1
* Fixes #ISSUE_NUMBER