IFU-master-2022-04-11 #993
Conversation
If __torch_function__ is disabled, this TLS state should propagate to other threads. Although I was thinking about pytorch#73942 when I did this, it doesn't actually help solve that problem: when I disable __torch_function__ as part of the disabled __torch_function__ implementation, this happens prior to snapshotting (and snapshotting only happens for Python tensors anyway). I intend to add more TLS to this struct soon, which is why it's a struct and not just a bool. Testing is not so easy because on CPU there isn't an easy way to get Python code running in another thread. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: pytorch#75110 Approved by: https://github.com/albanD
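As a hedged, Python-side illustration of the TLS in question: `torch._C.DisableTorchFunction` is the guard that flips this state, and the change makes the disabled state ride along in PyTorch's ThreadLocalState so work scheduled on PyTorch-managed threads observes it too. The `Logged` subclass below is hypothetical, purely for illustration:
```py
import torch

class Logged(torch.Tensor):  # hypothetical subclass, for illustration only
    @classmethod
    def __torch_function__(cls, func, types, args=(), kwargs=None):
        print(f"intercepted {getattr(func, '__name__', func)}")
        return super().__torch_function__(func, types, args, kwargs or {})

t = Logged([1.0, 2.0])
_ = t + 1                          # intercepted by __torch_function__
with torch._C.DisableTorchFunction():
    _ = t + 1                      # no interception while the TLS is set
```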
Summary: Pull Request resolved: pytorch#75138 (Note: this ignores all push blocking failures!) Test Plan: CI Reviewed By: wconstab Differential Revision: D35331263 fbshipit-source-id: e426c4017359c9f98188c0df5226775be7b1f700 (cherry picked from commit bf1768f)
Partially fixes: pytorch#66328 This PR introduces a templated class `IList<T>`: a wrapper container for boxed (`c10::List<T>`) and unboxed (`at::ArrayRef<T>`) containers. For now it was created with `T = Tensor` in mind, but it also aims to support `T = OptionalTensorRef`. Pull Request resolved: pytorch#67964 Approved by: https://github.com/ezyang
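A rough Python analogue of the design, purely illustrative (the real class is C++ and the names below are invented): one wrapper carrying a boxed/unboxed tag, so callers index elements uniformly while only the boxed path pays the unwrapping cost.
```py
class IListAnalogue:  # invented name; sketch of the C++ IList<T> idea
    def __init__(self, payload, boxed: bool):
        self._payload, self._boxed = payload, boxed

    def __len__(self):
        return len(self._payload)

    def __getitem__(self, i):
        elem = self._payload[i]
        # A boxed container stores wrapped elements (simulated here with
        # ("box", x) pairs); the unboxed one stores them directly.
        return elem[1] if self._boxed else elem

boxed = IListAnalogue([("box", 1), ("box", 2)], boxed=True)
unboxed = IListAnalogue([1, 2], boxed=False)
assert boxed[0] == unboxed[0] == 1  # one interface, two storage kinds
```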
…74843) Summary: Pull Request resolved: pytorch#74843 is_output_quantized is used to check whether we should quantize the op based on the dtype configuration in qconfig and what is supported by the backend; we skip inserting an observer if the dtype configuration is not supported by the backend. This is now handled by backend_config_dict, so we can remove this function. Also, we previously supported fp16 static quantization for some ops for one of our internal use cases; it is no longer required, so we can remove that as well. Test Plan: python test/test_quantization.py TestQuantizeFx python test/test_quantization.py TestQuantizeFxOps Imported from OSS Reviewed By: andrewor14 Differential Revision: D35190541 fbshipit-source-id: 623d961810737ec01e1f8b269ec48a6a99bb284a (cherry picked from commit a405998)
This PR enables jit-compiled reductions and moves `prod` to be jit-compiled. Currently, only reductions that can use `func_wrapper` for automatic implementation of the `reduce/project/translate_idx` ops are supported; there are a few TODOs for supporting more complex reductions such as norms and max, which typically require a full-fledged ReduceOps functor. Similarly, only reductions with a single input are supported. The number of inputs is hardcoded to 1, which is true for our current reductions but can be relaxed in the future. Pull Request resolved: pytorch#74446 Approved by: https://github.com/mruberry
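As a sketch of what the message means by the `reduce/project/translate_idx` triad, here is a hedged Python rendering for `prod` (the real implementations are CUDA functors generated by `func_wrapper`; the names below merely mirror the message):
```py
from functools import reduce as fold

def make_prod_reduction():
    combine = lambda acc, val: acc * val              # "reduce": fold a value in
    project = lambda acc: acc                          # finalize the accumulator
    translate_idx = lambda idx, offset: idx + offset   # remap indices across splits
    identity = 1.0                                     # neutral element for prod
    return combine, project, translate_idx, identity

combine, project, _, identity = make_prod_reduction()
print(project(fold(combine, [2.0, 3.0, 4.0], identity)))  # 24.0
```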
Pull Request resolved: pytorch#75105 Approved by: https://github.com/albanD
References: pytorch#13918 Add more test cases for lists of NumPy array inputs. Pull Request resolved: pytorch#72249 Approved by: https://github.com/mruberry
Fixes pytorch#74122 This re-enables TestTorchFunctionOverride and fixes a bunch of test failures that had crept in while it was disabled. Pull Request resolved: pytorch#74202 Approved by: https://github.com/ezyang
…orch#75149) Summary: Pull Request resolved: pytorch#75149 https://github.com/pytorch/rfcs/blob/master/RFC-0017-PyTorch-Operator-Versioning.md ghstack-source-id: 152906910 Test Plan: CI Reviewed By: qihqi Differential Revision: D35338681 fbshipit-source-id: 03cb699696af2c946d67ece95bdc019fc4a4cb11 (cherry picked from commit b72737e)
Summary: Add BFloat16 support for smooth_l1_loss on CPU. Pull Request resolved: pytorch#62558 Reviewed By: H-Huang Differential Revision: D34897859 Pulled By: frank-wei fbshipit-source-id: a52138c89852642db78f5f3083d05873f3cdec3a (cherry picked from commit 71908ee)
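Roughly, the newly supported call looks like this (a minimal sketch; the point is that the loss now runs natively in bfloat16 on CPU rather than requiring an upcast to float32):
```py
import torch
import torch.nn.functional as F

x = torch.randn(8, dtype=torch.bfloat16)
y = torch.randn(8, dtype=torch.bfloat16)
loss = F.smooth_l1_loss(x, y)  # dispatches a native bfloat16 CPU kernel
print(loss.dtype)              # torch.bfloat16
```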
Summary: Pull Request resolved: pytorch#75176 Switch to python resources to fix build on buck2 https://www.internalfb.com/intern/wiki/Buck-users/Python_Resources_in_fbcode/ Reviewed By: r-barnes Differential Revision: D35352705 fbshipit-source-id: f85043ebbcfbb30d287c802ff7401c89155a024a (cherry picked from commit 35e7a98)
Test Plan: revert-hammer Differential Revision: D35352705 (pytorch@152489a) Original commit changeset: f85043ebbcfb Original Phabricator Diff: D35352705 (pytorch@152489a) fbshipit-source-id: 901e28dd17150c6300b2d263aba1a8b0651d3020 (cherry picked from commit ab91a2a)
…torch#74636) Summary: Pull Request resolved: pytorch#74636 This commit changes how quantization patterns for linear and conv are set up in prepare. Previously, these were set up through ConvReluQuantizeHandler and LinearReLUQuantizeHandler. After this commit, they are set up through the corresponding entries in the native backend_config_dict, rendering the above quantize handlers unnecessary. In future commits, we will do the same for the remaining ops. Test Plan: python test/test_quantization.py TestQuantizeFx python test/test_quantization.py TestQuantizeFxOps Imported from OSS Reviewed By: jerryzh168, ngimel Differential Revision: D35225680 fbshipit-source-id: 4a79f63a11fce46701eb17aaf3619c1e827d72a4 (cherry picked from commit 475f599)
I also took the opportunity to update the documentation a little for clarity. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: pytorch#75141 Approved by: https://github.com/zou3519
The pattern of a PyObject* bundled with a PyInterpreter* is pretty useful in many contexts (e.g., TorchDispatchTypeObject), so I have turned it into a dedicated class SafePyObject. In the process I fixed a bug with the old TorchDispatchTypeObject (the copy constructor/assignment was not deleted), made the API safer (retrieving the PyObject* pointer requires verifying that the PyInterpreter* matches) and fixed some minor inefficiencies in the C++ code. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: pytorch#75142 Approved by: https://github.com/zou3519
Now there is truly only one way to call __torch_function__, and that is via handle_torch_function_no_python_arg_parser. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: pytorch#75159 Approved by: https://github.com/zou3519
Fixes pytorch#73187. Pull Request resolved: pytorch#75164 Approved by: https://github.com/albanD
Pull Request resolved: pytorch#73441 Approved by: https://github.com/ezyang, https://github.com/albanD
Pull Request resolved: pytorch#75165 Approved by: https://github.com/seemethere
This PR updates the documentation for CosineEmbeddingLoss. The loss function uses cosine similarity, but the documentation used the term `cosine distance`; the term is therefore changed to `cosine similarity`. Fixes pytorch#75104 Pull Request resolved: pytorch#75188 Approved by: https://github.com/cpuhrsch
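A minimal example of why `cosine similarity` is the right term: for a target of 1, the per-pair loss is exactly 1 - cos(x1, x2).
```py
import torch
import torch.nn.functional as F

x1, x2 = torch.randn(4, 16), torch.randn(4, 16)
target = torch.ones(4)  # 1: pairs should be similar; -1: dissimilar

loss = torch.nn.CosineEmbeddingLoss()(x1, x2, target)
manual = (1 - F.cosine_similarity(x1, x2)).mean()
assert torch.allclose(loss, manual)  # the loss is built on cosine similarity
```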
Make it accept the origin environment variable. Add an explicit skip for pytorch@0a6a1b2 as it's a rare case of the same commit being landed/reverted twice. Pull Request resolved: pytorch#75209 Approved by: https://github.com/atalman, https://github.com/bigfootjon
Pull Request resolved: pytorch#72633 Approved by: https://github.com/cpuhrsch
- Fix: _Demux cannot be pickled with dill, as reported in pytorch#74958 (comment) - Add a cache to the traverse function to prevent infinite recursion on circular DataPipe references (Fixes pytorch/data#237) Pull Request resolved: pytorch#75034 Approved by: https://github.com/wenleix
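A hedged sketch of the cycle guard (the attribute name and signature below are illustrative, not the real `torch.utils.data.graph.traverse` internals): memoizing visited pipes by id makes a circular DataPipe graph terminate instead of recursing forever.
```py
def traverse(dp, cache=None):
    cache = set() if cache is None else cache
    if id(dp) in cache:
        return {dp: {}}  # already visited: cut the cycle here
    cache.add(id(dp))
    children = getattr(dp, "source_datapipes", [])  # hypothetical attribute
    graph = {}
    for child in children:
        graph.update(traverse(child, cache))
    return {dp: graph}
```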
Fixes pytorch#74404 Pull Request resolved: pytorch#74997 Approved by: https://github.com/albanD
Recently, @cpuhrsch noticed that going to viable/strict still didn't resolve upstream failures for lint. This is because we didn't check out the head SHA for those GHA (we missed it last time). This PR attempts to do some consolidation and fix that problem to make viable/strict more reliable. Pull Request resolved: pytorch#75199 Approved by: https://github.com/cpuhrsch, https://github.com/seemethere, https://github.com/malfet
…ink, hardswish and softplus on CPU (pytorch#63134) Summary: Add BFloat16 support for logsigmoid, hardsigmoid, hardshrink, softshrink, hardswish and softplus on CPU, and optimize the performance of softshrink. Pull Request resolved: pytorch#63134 Reviewed By: yinghai Differential Revision: D34897992 Pulled By: frank-wei fbshipit-source-id: 4c778f5271d6fa54dd78158258941def8d9252f5 (cherry picked from commit decda0e)
Summary: Pull Request resolved: pytorch#73219 Saw a report that this elementwise add is causing overhead. IIUC this is easy to fuse? ghstack-source-id: 152549975 Test Plan: CI, review. Ran benchmark_transformers.par mha --batch-size 64 --max-sequence-length 128 --avg-sequence-length 256 --large --use-real-data-distribution --use-mask and looked at the PT time number:
```
before:
B=64, T=128, Half=True, GPU=True, Seed=1234, Padded tokens=54.92%, Use Mask=True PT Time: 1.24ms, NativePT Time: 1000000000.00ms, HF Time: 1.10ms, PT FLOPS: 59.07TFLOP/s, NativePT FLOPS: 0.00TFLOP/s, HF FLOPS: 66.46TFLOP/s
B=64, T=128, Half=True, GPU=True, Seed=1234, Padded tokens=54.92%, Use Mask=True PT Time: 1.23ms, NativePT Time: 1000000000.00ms, HF Time: 1.09ms, PT FLOPS: 59.57TFLOP/s, NativePT FLOPS: 0.00TFLOP/s, HF FLOPS: 66.75TFLOP/s
B=64, T=128, Half=True, GPU=True, Seed=1234, Padded tokens=54.92%, Use Mask=True PT Time: 1.24ms, NativePT Time: 1000000000.00ms, HF Time: 1.09ms, PT FLOPS: 58.87TFLOP/s, NativePT FLOPS: 0.00TFLOP/s, HF FLOPS: 66.77TFLOP/s

after:
B=64, T=128, Half=True, GPU=True, Seed=1234, Padded tokens=54.92%, Use Mask=True PT Time: 1.22ms, NativePT Time: 1000000000.00ms, HF Time: 1.10ms, PT FLOPS: 60.07TFLOP/s, NativePT FLOPS: 0.00TFLOP/s, HF FLOPS: 66.51TFLOP/s
B=64, T=128, Half=True, GPU=True, Seed=1234, Padded tokens=54.92%, Use Mask=True PT Time: 1.22ms, NativePT Time: 1000000000.00ms, HF Time: 1.09ms, PT FLOPS: 59.80TFLOP/s, NativePT FLOPS: 0.00TFLOP/s, HF FLOPS: 66.69TFLOP/s
B=64, T=128, Half=True, GPU=True, Seed=1234, Padded tokens=54.92%, Use Mask=True PT Time: 1.21ms, NativePT Time: 1000000000.00ms, HF Time: 1.09ms, PT FLOPS: 60.21TFLOP/s, NativePT FLOPS: 0.00TFLOP/s, HF FLOPS: 66.86TFLOP/s
```
Inspected a Kineto trace and confirmed that an elementwise add was fused into baddbmm. Additional opportunity: I see a copy_ inside baddbmm that wasn't happening with the bmm path and I'm not sure why. Perhaps something went wrong with the structured kernels port by ezyang? Reviewed By: ezyang Differential Revision: D34160547 fbshipit-source-id: 78d406fb035e6f3bf13af2c9443a886eada35ac4 (cherry picked from commit aaffc39)
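The fusion being leaned on, roughly: `torch.baddbmm` folds the bias add into the batched matmul, replacing a separate `bmm` plus elementwise add (the overhead flagged above; shapes below are illustrative).
```py
import torch

B, T, H = 64, 128, 64
bias = torch.randn(T, T)   # broadcast across the batch dimension
q = torch.randn(B, T, H)
k = torch.randn(B, H, T)

fused = torch.baddbmm(bias, q, k)      # add fused into the matmul
unfused = torch.bmm(q, k) + bias       # the pattern flagged as overhead
assert torch.allclose(fused, unfused, atol=1e-5)
```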
pytorch#75007) Summary: Previously the highest-level process group in `period_process_group_dict` could be `None`, indicating the global group. Now `period_process_group_dict` cannot contain `None` as a process group, so the function `_find_process_group` can return a process group directly instead of a tuple; when not found it simply returns `None`, which is unambiguous because a stored process group can no longer be `None`. Proposal: pytorch#71325 Pull Request resolved: pytorch#75007 Reviewed By: awgu Differential Revision: D35357816 Pulled By: rohan-varma fbshipit-source-id: 4522dba49797df7140227bfd822d668b7e118a66 (cherry picked from commit 77ca01b)
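A heavily hedged sketch of the simplified lookup (the dict layout and selection rule are illustrative, not the exact averager internals): with `None` banned as a stored value, a plain return replaces the old tuple.
```py
def _find_process_group(step, period_process_group_dict):
    # Pick the largest period that divides the current step (illustrative rule).
    for period in sorted(period_process_group_dict, reverse=True):
        if step % period == 0:
            return period_process_group_dict[period]  # always a real group now
    return None  # unambiguous: no group applies at this step
```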
This reverts commit ad028e5. Reverted pytorch#74446 on behalf of https://github.com/seemethere
This reverts commit cda3f58. Reverted pytorch#72633 on behalf of https://github.com/janeyx99
This reverts commit 709fcc8. Reverted pytorch#75293 on behalf of https://github.com/janeyx99
Fixes pytorch#74264 (comment). The shape check works with or without the extras added in pytorch#74264.
```py
>>> a = torch.rand(2, 2).to_sparse_csr()
>>> b = torch.rand(2, 3).to_sparse_csr()
>>> torch.testing.assert_close(a, b)
AssertionError: The values for attribute 'shape' do not match: torch.Size([2, 2]) != torch.Size([2, 3]).
```
Tensor comparison is split into two parts: 1. attribute comparison and 2. value comparison (https://github.com/pytorch/pytorch/blob/bcf6974c207ac0339bfb8bdfdb0b0ec348f7a22f/torch/testing/_comparison.py#L611-L616). The attribute comparison happens in https://github.com/pytorch/pytorch/blob/bcf6974c207ac0339bfb8bdfdb0b0ec348f7a22f/torch/testing/_comparison.py#L618 The check for the matching shape (https://github.com/pytorch/pytorch/blob/bcf6974c207ac0339bfb8bdfdb0b0ec348f7a22f/torch/testing/_comparison.py#L647-L648) is one of the few checks that cannot be disabled through keyword arguments. Thus, there is no need for this check in `_compare_sparse_csr_values`, since the comparison will already have failed if the shapes mismatch. Pull Request resolved: pytorch#75593 Approved by: https://github.com/cpuhrsch
…erge (pytorch#75542) Summary: Pull Request resolved: pytorch#75542 Reviewed By: malfet Differential Revision: D35513051 fbshipit-source-id: adf59359fcf2410fa8a61746533c896ec22d5ed3 (cherry picked from commit ab65394)
Fixes pytorch#75482 There are several random exceptions when adding exclusions in Windows Defender: https://github.com/pytorch/pytorch/runs/5953410781?check_suite_focus=true It looks like Add/Set-MpPreference (added in pytorch#75313) might be unstable. Since it is a defensive step, the exception can be ignored so that the workflow continues even if the command fails. Reference: https://docs.microsoft.com/en-us/powershell/module/microsoft.powershell.core/about/about_commonparameters?view=powershell-7.2 Verification: in the test PR https://github.com/pytorch/pytorch/runs/5966277521?check_suite_focus=true#step:3:54 it tries to delete 2 non-existent processes, but the workflow continues to run; `-ErrorAction Ignore` works on the runner. Pull Request resolved: pytorch#75588 Approved by: https://github.com/suo
Summary: The primary issue with making sparsity work with QAT convert (unlike normal quantization convert) is that when the parametrized module undergoes the QAT convert, the parametrizations need to be maintained. If the parametrizations don't get transferred during the convert, the sparsifier loses its connection to the model. In practice this is handled using the transfer_parametrizations_and_params function to move the weight, the bias, and any associated parametrizations to the new module. This PR also adds tests for transfer_parametrizations_and_params and type_before_parametrizations to test_nn.py, and adds comments to the test code for composability. Test Plan: python test/test_ao_sparsity.py TestComposability python test/test_nn.py TestNN Pull Request resolved: pytorch#74848 Approved by: https://github.com/vkuzo, https://github.com/Lezcano
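A minimal sketch of the transfer, assuming the `torch.nn.utils.parametrize.transfer_parametrizations_and_params` API this PR tests; the `Abs` parametrization is a toy stand-in for a sparsifier's mask.
```py
import torch
from torch import nn
from torch.nn.utils import parametrize

class Abs(nn.Module):      # toy parametrization standing in for a sparsity mask
    def forward(self, w):
        return w.abs()

src = nn.Linear(4, 4)
parametrize.register_parametrization(src, "weight", Abs())

dst = nn.Linear(4, 4)      # stand-in for the module produced by convert
parametrize.transfer_parametrizations_and_params(src, dst)
print(parametrize.is_parametrized(dst, "weight"))  # True: the hook survives
```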
test pytorch/apex/torchvision please
As the CI is on ROCm 5.0, the PyTorch unit tests were tested locally on the ROCm 5.1 release image. The below 4 tests fail because of a packaging error. They also fail in the rocm5.1 release image with PyTorch built from the current rocm fork master before this IFU. (https://github.com/ROCmSoftwarePlatform/frameworks-internal/issues/1530)
The below 2 unit tests fail with proxy-related errors. They also fail in the rocm5.1 release image with PyTorch built from the current rocm fork master before this IFU. (https://github.com/ROCmSoftwarePlatform/frameworks-internal/issues/1531)
The below 3 tests fail, but they are skipped for now (in #1001) as we are aware of this issue (https://ontrack-internal.amd.com/browse/SWDEV-332522):
Among the distributed unit tests, the below two tests fail but can be ignored as they are disabled in the upstream CI:
http://rocmhead.amd.com:8080/job/pytorch/job/pytorch-ci/22/ Apex unit tests look good. Torchvision unit tests look good. They used 0e487f5.
test pytorch/apex/torchvision please
@rraminen, can you push an empty commit to trigger the CI?
Hi @pruthvistony, I pushed an empty commit.
No conflict