Reduce unnecessary cuda sync in anchor_utils.py
#5515
💊 CI failures summary and remediations
As of commit 582eba6 (more details on the Dr. CI page): 💚 Looks good so far! There are no failures yet. 💚
This comment was automatically generated by Dr. CI. Please report bugs/suggestions to the (internal) Dr. CI Users group.
anchor_utils.py
I'm only listed in blame due to #4384.
I'm OK switching to the empty().fill_() idiom given the analysis. It's something we use in other places in TorchVision anyway:
vision/torchvision/ops/drop_block.py
Lines 41 to 42 in 5568744
noise = torch.empty((N, C, H - block_size + 1, W - block_size + 1), dtype=input.dtype, device=input.device)
noise.bernoulli_(gamma)
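For readers unfamiliar with the idiom, here is a minimal sketch of the two ways to create a scalar tensor on a device. It is an illustration only, not the PR's diff: the helper names `scalar_via_tensor` and `scalar_via_fill` and the value `800 // 50` are hypothetical, and whether the `torch.tensor(...)` form actually forces a host-device synchronization may depend on the PyTorch version and the device in use.

```python
import torch


def scalar_via_tensor(value: int, device: torch.device) -> torch.Tensor:
    # Hypothetical helper: build the scalar from a Python value.
    # On a CUDA device this involves copying host data to the device.
    return torch.tensor(value, dtype=torch.int64, device=device)


def scalar_via_fill(value: int, device: torch.device) -> torch.Tensor:
    # Hypothetical helper: allocate an uninitialized 0-dim tensor on the
    # device, then write the value in place with fill_().
    return torch.empty((), dtype=torch.int64, device=device).fill_(value)


# Fall back to CPU so the sketch also runs on machines without a GPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

a = scalar_via_tensor(800 // 50, device)
b = scalar_via_fill(800 // 50, device)

# Both forms yield the same 0-dim int64 tensor.
print(a.shape, b.shape, torch.equal(a, b))
```

The two helpers are interchangeable in terms of the resulting value, shape, and dtype; the only intended difference is how the value reaches the device.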
Hey @datumbox! You merged this PR, but no labels were added. The list of valid labels is available at https://github.com/pytorch/vision/blob/main/.github/process_commit.py
Summary:
Co-authored-by: Vasilis Vryniotis <[email protected]>
Reviewed By: vmoens
Differential Revision: D34879001
fbshipit-source-id: 5830dec79b5f80fa20b55862c84906a04898aa80
This PR reduces unnecessary CUDA synchronization in the Mask R-CNN model code, with the aim of improving its performance.
We evaluated the performance impact of this PR with the vision_maskrcnn model in TorchBench.
Before the patch:
We observe 3 CUDA sync events in anchor_utils.py:
After the patch:
Although there is no obvious change in the runtime, we observe only 1 CUDA sync event from anchor_utils.py:
So we still believe it is a good improvement.
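The description does not say how the sync events were counted. One possible way to spot synchronizing calls, which is not necessarily the method used here, is PyTorch's sync debug mode. The sketch below assumes a PyTorch version that provides `torch.cuda.set_sync_debug_mode` and that its warnings surface through Python's `warnings` module; the helper `run_with_sync_warnings` is hypothetical.

```python
import warnings

import torch


def run_with_sync_warnings(fn):
    """Run fn() and return (result, list of warning messages).

    Hypothetical helper: on a machine with CUDA, the sync debug mode is set
    to "warn" while fn runs, so synchronizing CUDA calls emit warnings.
    Without CUDA, fn simply runs and no warnings are collected.
    """
    if not torch.cuda.is_available():
        return fn(), []

    previous = torch.cuda.get_sync_debug_mode()
    torch.cuda.set_sync_debug_mode("warn")
    try:
        with warnings.catch_warnings(record=True) as caught:
            warnings.simplefilter("always")
            result = fn()
    finally:
        # Restore whatever mode was active before.
        torch.cuda.set_sync_debug_mode(previous)
    return result, [str(w.message) for w in caught]


device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

result, messages = run_with_sync_warnings(
    lambda: torch.empty((), dtype=torch.int64, device=device).fill_(16)
)
print(result, len(messages))
```

Wrapping the code path under test (for example, an anchor generation call) in such a helper before and after a change gives a rough count of synchronizing operations to compare.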