[ROCm] Fix unit tests on CI #11191
Enable test_optim unit tests …
Merge from upstream
…RAND_PR While there, add the remaining changes requested in upstream PR pytorch#10266
Reported by: bddqqp
Merge from upstream
Refactor unit test skip statements to use @skipIfRocm annotation
Merge from upstream
Replace hcRNG with rocRAND.
fixed merge conflicts.
Fix typo.
…on due to recently observed hang
Skip KLDivLoss_cuda tests due to hang
Merge from upstream
Merge from upstream
Merge from upstream
ezyang is landing this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
I'm really confused why these tests fail with
Any ideas?
@ssnl Unclear. We are working on the unit test pass rate at the moment. A few things could interfere here: a) we are aware of a compiler bug (fixed in #11198) that can cause hangs and crashes, and b) we are also aware of a few tests (min/max in particular) that succeed on our nodes but fail on the CI (we are looking into this).
Summary: Disables two of the unit tests in test_cuda that got introduced after test_cuda was enabled that fail on ROCm.
Pull Request resolved: pytorch#11191
Differential Revision: D9628702
Pulled By: ezyang
fbshipit-source-id: 4c298c728f42bb43d39b57967aa3e44385980265
Disables two unit tests in test_cuda that were introduced after test_cuda was enabled and that fail on ROCm.
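For readers unfamiliar with the skip-annotation pattern mentioned in the commit list (@skipIfRocm), here is a minimal sketch of how a decorator of that kind can disable individual tests on one platform. It is an illustration only, not the code from this PR: the names `skip_if_rocm`, `rocm_enabled`, `ExampleTest`, `run_example`, and the environment variable `EXAMPLE_TEST_WITH_ROCM` are all hypothetical, and the real test suite may detect ROCm differently.

```python
import os
import unittest
from functools import wraps


def rocm_enabled():
    # Hypothetical switch for this sketch: an environment variable stands in
    # for whatever mechanism the real test suite uses to detect ROCm.
    return os.getenv("EXAMPLE_TEST_WITH_ROCM", "0") == "1"


def skip_if_rocm(fn):
    """Skip the decorated test when the (hypothetical) ROCm flag is set."""
    @wraps(fn)
    def wrapper(*args, **kwargs):
        if rocm_enabled():
            # unittest records a raised SkipTest as a skip, not a failure.
            raise unittest.SkipTest("test does not currently pass on ROCm")
        return fn(*args, **kwargs)
    return wrapper


class ExampleTest(unittest.TestCase):
    @skip_if_rocm
    def test_known_rocm_failure(self):
        # Stand-in for a test that fails or hangs on ROCm.
        self.assertEqual(1 + 1, 2)

    def test_runs_everywhere(self):
        self.assertTrue(True)


def run_example():
    # Run the example tests and return the collected result object.
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ExampleTest)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

With the flag unset, both tests run. With `EXAMPLE_TEST_WITH_ROCM=1`, the decorated test is reported as skipped while the other still runs, which keeps the known failure visible in the test report instead of deleting the test.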