Add more NVFuser microbenchmarks #801


Closed

Conversation

davidberard98 (Contributor) commented on Mar 16, 2022

Stack from ghstack:

Waiting on pytorch/pytorch#73627 to land, because some of these don't pass without it.

Differential Revision: D35732497

davidberard98 added a commit that referenced this pull request on Mar 16, 2022
ghstack-source-id: 67cd76b
Pull Request resolved: #801

davidberard98 added a commit that referenced this pull request on Mar 16, 2022
ghstack-source-id: 00f798f
Pull Request resolved: #801

davidberard98 marked this pull request as draft on March 16, 2022 at 00:23

davidberard98 added a commit that referenced this pull request on Mar 16, 2022
ghstack-source-id: 2e5074d
Pull Request resolved: #801

davidberard98 added a commit that referenced this pull request on Apr 5, 2022
ghstack-source-id: a3028c6
Pull Request resolved: #801
eellison (Contributor) left a comment:

Is there anything blocking this?

davidberard98 (Contributor, Author) replied:
@eellison they are still erroring; see pytorch/pytorch#75282.

davidberard98 added a commit that referenced this pull request on Apr 15, 2022
ghstack-source-id: 13d2a15
Pull Request resolved: #801
davidberard98 marked this pull request as ready for review on April 18, 2022 at 20:11
davidberard98 changed the title from "[WIP] Add more NVFuser microbenchmarks" to "Add more NVFuser microbenchmarks" on Apr 18, 2022
davidberard98 (Contributor, Author) commented:
@davidberard98 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

davidberard98 added a commit to pytorch/pytorch that referenced this pull request on Apr 20, 2022

[NVFuser] always fallback if fusion fails

1) Remember when fusions fail; on subsequent runs, always take the fallback.
2) During the first fallback, cache the Code object.

On autogen-69 from the NVFuser microbenchmarks (pytorch/benchmark#801), this improved performance as follows:
* Original (always attempt fusion): 25ms
* Always take fallback after first failure: 0.79ms
* Always take fallback + cache Code object: 0.62ms
* Eager: 0.58ms
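For readers skimming the timeline, below is a minimal Python sketch of the two-step strategy this commit message describes. It is illustrative only: the actual change lives in PyTorch's JIT (in C++), and `FusionRunner`, `compile_fusion`, and `compile_fallback` are hypothetical names, not real PyTorch APIs.

```python
# Illustrative sketch of the fallback strategy described above.
# Not the real implementation: compile_fusion / compile_fallback are
# hypothetical stand-ins for the JIT's fusion and interpreter paths.

class FusionRunner:
    def __init__(self, compile_fusion, compile_fallback):
        self.compile_fusion = compile_fusion      # may raise if fusion fails
        self.compile_fallback = compile_fallback  # always-working fallback path
        self.failed_groups = set()   # (1) fusion groups known to fail
        self.fallback_cache = {}     # (2) group id -> cached fallback "Code"

    def run(self, group_id, inputs):
        # (1) If this group failed before, skip the fusion attempt entirely.
        if group_id not in self.failed_groups:
            try:
                return self.compile_fusion(group_id)(inputs)
            except RuntimeError:
                self.failed_groups.add(group_id)  # remember the failure
        # (2) Compile the fallback once and cache it, so later runs reuse
        # the cached object instead of recompiling.
        if group_id not in self.fallback_cache:
            self.fallback_cache[group_id] = self.compile_fallback(group_id)
        return self.fallback_cache[group_id](inputs)


def fusion_always_fails(group_id):
    raise RuntimeError("fusion failed")

runner = FusionRunner(fusion_always_fails, lambda gid: (lambda xs: sum(xs)))
print(runner.run("autogen-69", [1, 2, 3]))  # fusion fails once, falls back: 6
print(runner.run("autogen-69", [4, 5]))     # fusion skipped, cached fallback: 9
```

This mirrors why the benchmark numbers above improve in two steps: skipping the doomed fusion attempt removes most of the 25ms, and caching the fallback Code object removes the remaining recompilation cost.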
davidberard98 added a commit to pytorch/pytorch that referenced this pull request on Apr 20, 2022
[NVFuser] always fallback if fusion fails

davidberard98 added a commit to pytorch/pytorch that referenced this pull request on Apr 20, 2022
[NVFuser] always fallback if fusion fails
ghstack-source-id: 60c31f7
Pull Request resolved: #75983
davidberard98 added a commit to pytorch/pytorch that referenced this pull request on Apr 21, 2022
[NVFuser] always fallback if fusion fails

davidberard98 added a commit to pytorch/pytorch that referenced this pull request on Apr 21, 2022
[NVFuser] always fallback if fusion fails

davidberard98 added a commit to pytorch/pytorch that referenced this pull request on Apr 21, 2022
[NVFuser] always fallback if fusion fails
ghstack-source-id: 59be971
Pull Request resolved: #75983
facebook-github-bot deleted the gh/davidberard98/2/head branch on April 23, 2022 at 14:15
pytorchmergebot pushed a commit to pytorch/pytorch that referenced this pull request on Apr 25, 2022
[NVFuser] always fallback if fusion fails
Pull Request resolved: #75983
Approved by: https://github.com/jjsjann123
davidberard98 added a commit to davidberard98/pytorch that referenced this pull request on Apr 28, 2022

Retry of pytorch#75983. The change is to handle cases where attr::cache_id is not set; this can happen if compilation fails.

Original message:

1) Remember when fusions fail; on subsequent runs, always take the fallback.
2) During the first fallback, cache the Code object.

On autogen-69 from the NVFuser microbenchmarks (pytorch/benchmark#801), this improved performance as follows:
* Original (always attempt fusion): 25ms
* Always take fallback after first failure: 0.79ms
* Always take fallback + cache Code object: 0.62ms
* Eager: 0.58ms
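As a hypothetical sketch of the guard this retry adds (names are stand-ins; the real code operates on torch::jit nodes in C++): the fallback path must tolerate a node that never got a cache_id, instead of assuming the attribute exists.

```python
# Hypothetical sketch of the retry's fix: handle a missing cache_id.
# If compilation failed before attr::cache_id was assigned, there is no
# cache key, so compile the fallback directly rather than crash.
# `node` and its attributes are stand-ins for the real torch::jit types.
from types import SimpleNamespace

fallback_cache = {}

def run_fallback(node, inputs, compile_fallback):
    cache_id = getattr(node, "cache_id", None)  # may be unset
    if cache_id is None:
        # Compilation failed early; no cache entry to consult or create.
        return compile_fallback(node)(inputs)
    if cache_id not in fallback_cache:
        fallback_cache[cache_id] = compile_fallback(node)
    return fallback_cache[cache_id](inputs)

node_without_id = SimpleNamespace()  # models a node whose compilation failed
print(run_fallback(node_without_id, [1, 2], lambda n: (lambda xs: sum(xs))))  # 3
```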
pytorchmergebot pushed a commit to pytorch/pytorch that referenced this pull request on Apr 28, 2022
Retry of #75983.
Pull Request resolved: #76505
Approved by: https://github.com/eellison
facebook-github-bot pushed a commit to pytorch/pytorch that referenced this pull request on Apr 30, 2022
Summary: Retry of #75983.
Pull Request resolved: #76505
Approved by: https://github.com/eellison
Test Plan: contbuild & OSS CI, see https://hud.pytorch.org/commit/pytorch/pytorch/e52dc9888bd7e30e467bd7ae729791885ec43f58
Reviewed By: osalpekar
Differential Revision: D36042346
Pulled By: davidberard98
fbshipit-source-id: 7f34a0ae65f9583b8390383400fd91f69c635fc8
jjsjann123 pushed a commit to jjsjann123/nvfuser that referenced this pull request on Oct 29, 2022
[NVFuser] always fallback if fusion fails
Pull Request resolved: pytorch/pytorch#75983
Approved by: https://github.com/jjsjann123

jjsjann123 pushed a commit to jjsjann123/nvfuser that referenced this pull request on Oct 29, 2022
Retry of #75983.
Pull Request resolved: pytorch/pytorch#76505
Approved by: https://github.com/eellison
jjsjann123 pushed a commit to jjsjann123/nvfuser that referenced this pull request on Nov 10, 2022
[NVFuser] always fallback if fusion fails
Pull Request resolved: pytorch/pytorch#75983
Approved by: https://github.com/jjsjann123

jjsjann123 pushed a commit to jjsjann123/nvfuser that referenced this pull request on Nov 10, 2022
Retry of #75983.
Pull Request resolved: pytorch/pytorch#76505
Approved by: https://github.com/eellison