[references/classification] Adding gradient clipping #4824

Merged
merged 5 commits into from
Nov 2, 2021

@yiwen-song yiwen-song commented Nov 2, 2021

Gradient clipping can be useful in training: it is a technique for preventing exploding gradients in very deep networks.
PyTorch provides a related API, torch.nn.utils.clip_grad_norm_, which can be used here.
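
As a rough illustration of what norm-based clipping does (this is not the code from this PR), the sketch below rescales a set of gradients so that their combined L2 norm does not exceed a limit. The helper name `clip_by_global_norm` and the toy gradient values are made up for the example; in a PyTorch training loop the equivalent step would be a call to `torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm)` between `loss.backward()` and `optimizer.step()`.

```python
import math


def clip_by_global_norm(grads, max_norm, eps=1e-6):
    """Scale gradient vectors so their combined L2 norm is at most max_norm.

    `grads` is a list of lists of floats (one inner list per parameter).
    Returns (clipped_grads, total_norm_before_clipping).
    This is a hypothetical pure-Python helper for illustration only.
    """
    # Global L2 norm over all gradient entries of all parameters.
    total_norm = math.sqrt(sum(g * g for vec in grads for g in vec))
    # Cap the coefficient at 1.0 so small gradients are never scaled up.
    scale = min(1.0, max_norm / (total_norm + eps))
    clipped = [[g * scale for g in vec] for vec in grads]
    return clipped, total_norm


# Toy gradients for two "parameters"; global norm = sqrt(9 + 16 + 144) = 13.
grads = [[3.0, 4.0], [12.0]]
clipped, norm_before = clip_by_global_norm(grads, max_norm=1.0)
norm_after = math.sqrt(sum(g * g for vec in clipped for g in vec))

# Gradients already below the limit are left unchanged.
small, _ = clip_by_global_norm([[0.1, 0.2]], max_norm=1.0)
```

All gradients are scaled by the same factor, so the direction of the update is preserved and only its magnitude is limited.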

By applying gradient clipping during training, we significantly improved the training accuracy of our models. For example:

| | Job ID | Model | Epochs | Nodes | Batch Size Per GPU | Global Batch Size | Acc@1 | Acc@5 |
|---|---|---|---|---|---|---|---|---|
| before gradient clipping | 2420 | vit_b_16 | 300 | 2 | 256 | 4096 | 68.148 | 87.648 |
| after gradient clipping | 2512 | vit_b_16 | 300 | 2 | 256 | 4096 | 71.876 | 89.784 |

cc @kazhang @datumbox

facebook-github-bot commented Nov 2, 2021

💊 CI failures summary and remediations

As of commit 330c403 (more details on the Dr. CI page):


💚 💚 Looks good so far! There are no failures yet. 💚 💚

@yiwen-song yiwen-song requested a review from datumbox November 2, 2021 00:21

@kazhang kazhang left a comment

LGTM, thanks!

@datumbox datumbox left a comment

LGTM, thanks!

@yiwen-song yiwen-song merged commit 0817f7f into pytorch:main Nov 2, 2021

github-actions bot commented Nov 2, 2021

Hey @sallysyw!

You merged this PR, but no labels were added. The list of valid labels is available at https://github.com/pytorch/vision/blob/main/.github/process_commit.py

facebook-github-bot pushed a commit that referenced this pull request Nov 8, 2021
Summary:
* [references] Adding gradient clipping

* ufmt formatting

* remove apex code

* resolve naming issue

Reviewed By: kazhang

Differential Revision: D32216659

fbshipit-source-id: 9c5ffb102fa5fd9861ae5ba0c44052920c34ebaf
cyyever pushed a commit to cyyever/vision that referenced this pull request Nov 16, 2021
* [references] Adding gradient clipping

* ufmt formatting

* remove apex code

* resolve naming issue
@yiwen-song yiwen-song linked an issue Dec 10, 2021 that may be closed by this pull request
Successfully merging this pull request may close these issues.

Adding Vision Transformer to torchvision/models