forked from pytorch/pytorch
[ROCm] revert cat operator performance work-around #987
Merged
revert d5ca53c (pytorch#46097). The changes only affect ROCm. Reverts a work-around for a compiler performance issue that is no longer needed.

`python -m pt.cat_test --tag_filter all --device cuda`

```
OLD Forward Execution Time (us) : 48.833
NEW Forward Execution Time (us) : 8.318
OLD Forward Execution Time (us) : 54.508
NEW Forward Execution Time (us) : 23.824
OLD Forward Execution Time (us) : 52.117
NEW Forward Execution Time (us) : 14.942
OLD Forward Execution Time (us) : 98.790
NEW Forward Execution Time (us) : 74.334
OLD Forward Execution Time (us) : 102.063
NEW Forward Execution Time (us) : 76.008
OLD Forward Execution Time (us) : 167.786
NEW Forward Execution Time (us) : 123.679
OLD Forward Execution Time (us) : 98.320
NEW Forward Execution Time (us) : 67.436
OLD Forward Execution Time (us) : 91.484
NEW Forward Execution Time (us) : 59.230
OLD Forward Execution Time (us) : 109.569
NEW Forward Execution Time (us) : 76.557
OLD Forward Execution Time (us) : 106.603
NEW Forward Execution Time (us) : 87.635
OLD Forward Execution Time (us) : 106.693
NEW Forward Execution Time (us) : 88.902
OLD Forward Execution Time (us) : 110.881
NEW Forward Execution Time (us) : 94.361
OLD Forward Execution Time (us) : 122.925
NEW Forward Execution Time (us) : 123.046
OLD Forward Execution Time (us) : 272.442
NEW Forward Execution Time (us) : 271.932
OLD Forward Execution Time (us) : 457.329
NEW Forward Execution Time (us) : 456.767
OLD Forward Execution Time (us) : 117.688
NEW Forward Execution Time (us) : 87.133
OLD Forward Execution Time (us) : 873.764
NEW Forward Execution Time (us) : 865.075
OLD Forward Execution Time (us) : 1746.831
NEW Forward Execution Time (us) : 1730.252
OLD Forward Execution Time (us) : 2619.303
NEW Forward Execution Time (us) : 2598.717
OLD Forward Execution Time (us) : 52.063
NEW Forward Execution Time (us) : 7.904
OLD Forward Execution Time (us) : 52.275
NEW Forward Execution Time (us) : 8.118
OLD Forward Execution Time (us) : 51.896
NEW Forward Execution Time (us) : 7.938
OLD Forward Execution Time (us) : 51.745
NEW Forward Execution Time (us) : 7.922
OLD Forward Execution Time (us) : 52.575
NEW Forward Execution Time (us) : 13.299
OLD Forward Execution Time (us) : 52.090
NEW Forward Execution Time (us) : 8.015
```

Pull Request resolved: pytorch#74129
Approved by: https://github.com/ngimel
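
The OLD/NEW timings in the commit message are easier to compare as per-case speedups. The following is a minimal sketch of that arithmetic, assuming the benchmark output alternates one OLD line and one NEW line per case, as it does in the message. The names `parse_pairs` and `speedups` are hypothetical helpers written for this illustration and are not part of the PyTorch benchmark suite; `SAMPLE` holds four of the pairs reported in the message.

```python
import re

# Four OLD/NEW pairs copied from the commit message (a subset, for brevity).
SAMPLE = """\
OLD Forward Execution Time (us) : 48.833
NEW Forward Execution Time (us) : 8.318
OLD Forward Execution Time (us) : 54.508
NEW Forward Execution Time (us) : 23.824
OLD Forward Execution Time (us) : 52.117
NEW Forward Execution Time (us) : 14.942
OLD Forward Execution Time (us) : 122.925
NEW Forward Execution Time (us) : 123.046
"""

# Matches one result line and captures the OLD/NEW tag and the time in microseconds.
LINE_RE = re.compile(
    r"^(OLD|NEW) Forward Execution Time \(us\) : ([0-9.]+)[ \t]*$",
    re.MULTILINE,
)


def parse_pairs(text):
    """Return a list of (old_us, new_us) tuples from alternating OLD/NEW lines."""
    matches = LINE_RE.findall(text)
    pairs = []
    for i in range(0, len(matches) - 1, 2):
        (tag_old, old_us), (tag_new, new_us) = matches[i], matches[i + 1]
        if (tag_old, tag_new) != ("OLD", "NEW"):
            raise ValueError(f"expected an OLD line followed by a NEW line at index {i}")
        pairs.append((float(old_us), float(new_us)))
    return pairs


def speedups(pairs):
    """Speedup of NEW over OLD for each pair; values above 1.0 mean NEW is faster."""
    return [old_us / new_us for old_us, new_us in pairs]


if __name__ == "__main__":
    pairs = parse_pairs(SAMPLE)
    for (old_us, new_us), ratio in zip(pairs, speedups(pairs)):
        print(f"{old_us:9.3f} us -> {new_us:9.3f} us  ({ratio:.2f}x)")
```

On these four pairs the ratio ranges from just under 1.0 (122.925 us vs. 123.046 us, effectively unchanged) to roughly 5.9 (48.833 us vs. 8.318 us).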
jithunnair-amd pushed a commit to jithunnair-amd/pytorch that referenced this pull request on Sep 20, 2022
jithunnair-amd pushed a commit that referenced this pull request on Sep 28, 2022