Bad import in test_torchinductor and skip torchvision related UT #1374

Merged

@pragupta pragupta commented Mar 19, 2024

Collaborator

@pruthvistony pruthvistony left a comment

OK to skip this test for now.
Since this is an upstream problem, should we raise it upstream and work on a fix?

@pruthvistony pruthvistony merged commit c838806 into ROCm:rocm6.2_internal_testing Mar 20, 2024
@pragupta
Author

Opened an issue in my backlog to track torchvision UT https://github.com/ROCm/frameworks-internal/issues/7549
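
For context on what skipping a torchvision-related UT can look like: the diff is not shown in this conversation, so the following is only a generic sketch, not the change made in this PR. It assumes the test module should still import cleanly when torchvision is absent. `ExampleTests` and both test names are hypothetical.

```python
import importlib.util
import unittest

# Detect torchvision without importing it, so a missing package cannot
# break collection of the whole test module.
HAS_TORCHVISION = importlib.util.find_spec("torchvision") is not None


class ExampleTests(unittest.TestCase):  # hypothetical test class
    @unittest.skipIf(not HAS_TORCHVISION, "requires torchvision")
    def test_needs_torchvision(self):
        import torchvision  # imported lazily, only when the test runs

        self.assertTrue(hasattr(torchvision, "__version__"))

    def test_independent_of_torchvision(self):
        # Unrelated tests keep running whether or not torchvision exists.
        self.assertEqual(1 + 1, 2)
```

Keeping the import inside the test body means an environment without torchvision reports one skipped test instead of an import error for the entire file.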

jithunnair-amd pushed a commit that referenced this pull request Oct 3, 2024
========================================

Temporarily skip test_conv3d_64bit_indexing

- Rocblas API support is requested
- SWDEV-383635 & sub task - SWDEV-390218

Skip ddp apply_optim_in_bwd tests for gloo (#1302)

To resolve https://ontrack-internal.amd.com/browse/SWDEV-403530 and https://ontrack-internal.amd.com/browse/SWDEV-419837. For more context check upstream issue pytorch#111834

Add skipIfRocmArch decorator for Navi skips (#1356)

Converted NAVI check as a function (#1364)

* Moved NAVI check to the test file

* Revised NAVI check as a function
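
The two commits above describe an architecture-based skip decorator and a NAVI check written as a function, but their code is not shown here. A minimal sketch of that general pattern follows. Every name in it (`current_gfx_arch`, `is_navi_arch`, `skip_if_arch`, the `EXAMPLE_GFX_ARCH` variable) is hypothetical, and the architecture is read from an environment variable purely so the example runs without a GPU.

```python
import functools
import os
import unittest

# Assumption: gfx target names standing in for Navi parts in this example.
NAVI_ARCHS = ("gfx1100", "gfx1101")


def current_gfx_arch():
    """Hypothetical probe: reads the arch from an env var instead of the GPU."""
    return os.environ.get("EXAMPLE_GFX_ARCH", "")


def is_navi_arch():
    """NAVI check expressed as a function, evaluated each time it is called."""
    return current_gfx_arch().startswith(NAVI_ARCHS)


def skip_if_arch(archs, reason="test skipped on this GPU architecture"):
    """Skip the decorated test when the current arch matches one of `archs`."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            # The check runs at call time, not at import time.
            if current_gfx_arch().startswith(tuple(archs)):
                raise unittest.SkipTest(reason)
            return fn(*args, **kwargs)
        return wrapper
    return decorator
```

Evaluating the check inside a function, rather than in a module-level constant, avoids querying the device while the test file is being imported.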

[Navi] [Inductor] Unskip Navi inductor UTs (#1514)

Relates to https://ontrack-internal.amd.com/browse/SWDEV-461590

Bad import in test_torchinductor and skip torchvision related UT (#1374)

skip test_inductor_freezing failing UTs (#1375)

Skip test_mm_triton_kernel_benchmark (#1376)

* Running triton kernel on ROCM only has one GB/s metric reported

* Update test_kernel_benchmark.py

skip vmapvjpvjp_linalg_householder_product_cuda_float32 (#1420)

skipIfRocm needs msg parameter
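
The commit title above only says that a `msg` parameter is needed; the change itself is not shown. As a hedged illustration, one common way to let a skip decorator take an optional keyword-only message looks like this. `skip_if_rocm` and `running_on_rocm` here are hypothetical stand-ins and not PyTorch's implementation.

```python
import functools
import os
import unittest


def running_on_rocm():
    """Hypothetical platform probe, driven by an env var for illustration."""
    return os.environ.get("EXAMPLE_ON_ROCM") == "1"


def skip_if_rocm(func=None, *, msg="test doesn't currently work on the ROCm stack"):
    """Usable both as @skip_if_rocm and as @skip_if_rocm(msg="...")."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            if running_on_rocm():
                raise unittest.SkipTest(msg)
            return fn(*args, **kwargs)
        return wrapper

    if func is not None:  # bare form: @skip_if_rocm
        return decorator(func)
    return decorator  # called form: @skip_if_rocm(msg="...")
```

With this shape the message has to be passed by keyword. Writing `@skip_if_rocm("reason")` would bind the string to `func` and fail when the test is called, which is one way a missing `msg=` can surface as a test error.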

[NO CP] Updated changes to skip few UTs

Imported skipIfRocm in certain test suites (#1577)

Fixes SWDEV-472397

Added functions imports (#1521)

Fixes inductor.test_torchinductor_dynamic_shapes::TestInductorDynamicCUDA::test_item_unbacked_stride_nobreak_cuda
jithunnair-amd pushed a commit that referenced this pull request Oct 11, 2024
========================================

jithunnair-amd pushed a commit that referenced this pull request Oct 11, 2024
=================================================

Enable test_public_api_surface (#1601)

Fixes SWDEV-462410.

Enable this unit test since PyTorch issue pytorch#104012 has been closed. This unit test runs fine on MI100/MI300 and upstream.

(cherry picked from commit 0001d4ab5070635cfecc146ee299bbb9fa45ca67)

[rocm6.3_internal_testing] Fixed error string assertion in test_invalid_devices (#1607)

Fixes pytorch#8974

(cherry picked from commit a688e0a)
jithunnair-amd pushed a commit that referenced this pull request Oct 11, 2024
=================================================

jithunnair-amd pushed a commit that referenced this pull request Nov 19, 2024
=================================================

pruthvistony added a commit that referenced this pull request Dec 2, 2024
=================================================

pruthvistony added a commit that referenced this pull request Dec 21, 2024
=================================================

dnikolaev-amd pushed a commit that referenced this pull request Apr 17, 2025
=================================================

(cherry picked from commit b966e44)
dnikolaev-amd pushed a commit that referenced this pull request Apr 24, 2025
=================================================

[rocm6.4_internal_testing] Skip non_standard_bool_values tests (#1880)

Fixes SWDEV-509757

(cherry picked from commit 80b4c41)

[rocm6.4_internal_testing] [NAVI32] Skipped sdpa_2 test in test_aot_inductor for Navi32 (#1882)

The test fails with the assertion error "Tensors are not close".

After testing, I can confirm that this issue is caused by eager-mode
execution specific to Navi32 during the test_sdpa_2 run. I cross-checked
Navi31, Navi32, and MI300: the AOTInductor results are exactly the same
on all three architectures, and only eager mode fails, on Navi32, with a
1.5% difference in tensor values from the GPU run. I assume this happens
because of fp16-32-16 conversions in eager mode or because some
Navi32-specific if-statements are missing.

Simple reproducer to check the values for cpu/gpu/eager/aoti runs.

[gfx1101_test_sdpa_2_issue_reproducer.txt](https://github.com/user-attachments/files/18676367/gfx1101_test_sdpa_2_issue_reproducer.txt)
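
To illustrate why a 1.5% deviation can trip a "Tensors are not close" check, here is a small pure-Python sketch of the usual elementwise closeness rule, `|actual - expected| <= atol + rtol * |expected|`. It is not the attached reproducer. The values and the tolerances in the usage note are made up for illustration, and `is_close` and `max_relative_diff` are hypothetical helpers.

```python
def is_close(actual, expected, rtol, atol):
    """Elementwise check: |actual - expected| <= atol + rtol * |expected|."""
    return all(
        abs(a - e) <= atol + rtol * abs(e)
        for a, e in zip(actual, expected)
    )


def max_relative_diff(actual, expected):
    """Largest |actual - expected| / |expected| over nonzero expected values."""
    return max(abs(a - e) / abs(e) for a, e in zip(actual, expected) if e != 0)


# Made-up values: `shifted` is `reference` scaled by 1.015 (a 1.5% shift).
reference = [0.50, -1.25, 2.00]
shifted = [x * 1.015 for x in reference]
```

With these numbers, `is_close(shifted, reference, rtol=1e-3, atol=1e-5)` is false while `is_close(shifted, reference, rtol=2e-2, atol=1e-5)` is true, so whether a 1.5% gap fails depends entirely on the tolerances the test uses.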

(cherry picked from commit 896c789)

Fixed rocm skip import issue (#1949)

skip_if_rocm does not exist in
torch/testing/_internal/common_distributed.py. Use skipIfRocm from
torch/testing/_internal/common_utils.py instead.
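
An import error like this one can be checked before touching the test files by asking which module actually defines a given helper. The sketch below is a generic, hypothetical utility (`find_attr` is not part of PyTorch); it only uses `importlib` from the standard library.

```python
import importlib


def find_attr(name, module_names):
    """Return the first module in `module_names` that defines `name`, else None."""
    for module_name in module_names:
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            continue  # module not installed in this environment
        if hasattr(module, name):
            return module_name
    return None


# Example call for the situation described above; the result depends on the
# installed PyTorch version, so no output is shown:
# find_attr("skip_if_rocm", [
#     "torch.testing._internal.common_distributed",
#     "torch.testing._internal.common_utils",
# ])
```

Because `hasattr` also sees names a module re-exports, this reports where a name is importable from, which is the question that matters for fixing a failing `from ... import ...` line.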

(cherry picked from commit cfb673e)

Skip certain unit tests on NAVI (#1950)

This PR skips certain unit tests on NAVI only.
- Fixes SWDEV-509011 - test_sac_ilp.py::TestSACILP::test_sac_ilp_case1
- Fixes SWDEV-509311 - test_max_autotune.py::TestMaxAutotune::test_non_contiguous_input_addmm
- Fixes SWDEV-510738 - test_fsdp_sharded_grad_scaler.py::TestShardedGradScalerParityWithDDP::test_sharded_grad_scaler_found_inf

(cherry picked from commit e86291a)