
extend DistributedSampler to support group_size #1512


Merged

Conversation

stephenyan1231
Contributor

Summary

For video model evaluation, we sample N clips from a video and average the clip predictions to get a video-level prediction.
Assume we sample 2 clips per video. The test dataset, which has 4 videos {A, B, C, D}, is illustrated below.

[A_0, A_1, B_0, B_1, C_0, C_1, D_0, D_1]

Assume we have 2 GPUs. The existing DistributedSampler distributes clips from the same video to different GPUs, which makes it difficult to average clip predictions.

GPU 0: 

       [A_0, B_0, C_0, D_0]

GPU 1: 

       [A_1, B_1, C_1, D_1]
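For reference, this interleaved split is what rank-strided slicing produces (each rank takes every num_replicas-th index, as in torch.utils.data.distributed.DistributedSampler). A minimal sketch of that behavior, not the actual sampler code:

```python
# Sketch of the default (group_size=1) sharding: each rank takes every
# num_replicas-th clip, so the two clips of a video land on different GPUs.
clips = ["A_0", "A_1", "B_0", "B_1", "C_0", "C_1", "D_0", "D_1"]
num_replicas = 2  # number of GPUs

for rank in range(num_replicas):
    print(f"GPU {rank}: {clips[rank::num_replicas]}")
# GPU 0: ['A_0', 'B_0', 'C_0', 'D_0']
# GPU 1: ['A_1', 'B_1', 'C_1', 'D_1']
```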

We extend DistributedSampler with an optional argument group_size. With group_size=2, the clips are sharded as below.

GPU 0: 

        [A_0, A_1, B_0, B_1]

GPU 1: 

        [C_0, C_1, D_0, D_1]

This facilitates the averaging of clip predictions.
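A minimal sketch of the group-aware idea (illustrative only; the merged clip_sampler.py code may differ in details): chunk the index list into groups of group_size consecutive clips and hand whole groups to each rank, so all clips of a video stay on one GPU.

```python
# Illustrative sketch: shard whole groups of `group_size` consecutive clips.
clips = ["A_0", "A_1", "B_0", "B_1", "C_0", "C_1", "D_0", "D_1"]
num_replicas = 2  # number of GPUs
group_size = 2    # clips sampled per video

# Chunk into groups of consecutive clips: [[A_0, A_1], [B_0, B_1], ...]
groups = [clips[i:i + group_size] for i in range(0, len(clips), group_size)]
per_rank = len(groups) // num_replicas

for rank in range(num_replicas):
    shard = [c for g in groups[rank * per_rank:(rank + 1) * per_rank] for c in g]
    print(f"GPU {rank}: {shard}")
# GPU 0: ['A_0', 'A_1', 'B_0', 'B_1']
# GPU 1: ['C_0', 'C_1', 'D_0', 'D_1']
```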

Unit test

python test/test_datasets_samplers.py
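For context, a hedged usage sketch (the import path and group_size keyword come from this PR's diff against torchvision/datasets/samplers/clip_sampler.py; check the merged code for the exact signature):

```python
from torchvision.datasets.samplers import DistributedSampler

# Stand-in for a clip-level dataset: 4 videos x 2 clips, laid out contiguously.
dataset = ["A_0", "A_1", "B_0", "B_1", "C_0", "C_1", "D_0", "D_1"]

# group_size keeps the 2 clips of each video on the same rank; rank and
# num_replicas are passed explicitly here so no process group is needed.
sampler = DistributedSampler(dataset, num_replicas=2, rank=0,
                             shuffle=False, group_size=2)
print(list(sampler))  # clip indices assigned to rank 0, whole videos together
```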


codecov-io commented Oct 22, 2019

Codecov Report

Merging #1512 into master will increase coverage by 0.31%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1512      +/-   ##
==========================================
+ Coverage   64.34%   64.66%   +0.31%     
==========================================
  Files          83       83              
  Lines        6454     6461       +7     
  Branches      992      992              
==========================================
+ Hits         4153     4178      +25     
+ Misses       2006     1984      -22     
- Partials      295      299       +4
| Impacted Files | Coverage Δ |
| --- | --- |
| torchvision/datasets/samplers/clip_sampler.py | 79.54% <100%> (+23.98%) ⬆️ |


Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data


@fmassa fmassa left a comment


LGTM, thanks a lot for the PR Zhicheng!

@fmassa fmassa merged commit 355e9d2 into pytorch:master Oct 22, 2019
@fmassa fmassa mentioned this pull request Oct 31, 2019
fmassa pushed a commit that referenced this pull request Oct 31, 2019
* extend DistributedSampler to support group_size

* Fix lint