There can be some variance in the model evaluation results due to several factors (see #4559; the timeline there can be a bit confusing to follow because I was initially relying on incorrect assumptions).
We addressed this in #4609 for the classification reference. We should do the same for the rest of the references (detection, segmentation, similarity, video_classification), roughly as sketched below the list:
- Remove the cudnn auto-benchmarking when `test-only` is True.
- Set `shuffle=False` for the test dataloader.
- Add a `--use-deterministic-algorithms` flag to the scripts.
- Add a warning when the number of processed samples in the validation is different from `len(dataset)` (this one might not be relevant for the detection scripts).
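
A minimal sketch of how these changes could look in a reference script's `main()`, assuming a simplified signature; the dataloader settings and the sample counting are illustrative placeholders, not the actual reference code:

```python
import argparse
import warnings

import torch
from torch.utils.data import DataLoader


def get_args_parser():
    parser = argparse.ArgumentParser(description="Reference evaluation sketch")
    parser.add_argument("--test-only", action="store_true", help="only run evaluation")
    parser.add_argument(
        "--use-deterministic-algorithms",
        action="store_true",
        help="ask PyTorch to use deterministic algorithms where available",
    )
    return parser


def main(args, model, dataset_test):
    # Only enable cudnn auto-tuning while training; it introduces
    # non-determinism that is not useful for a single evaluation pass.
    if not args.test_only:
        torch.backends.cudnn.benchmark = True

    # Opt into deterministic algorithms when requested.
    if args.use_deterministic_algorithms:
        torch.use_deterministic_algorithms(True)

    # Keep the evaluation order fixed.
    data_loader_test = DataLoader(dataset_test, batch_size=1, shuffle=False, num_workers=4)

    num_processed_samples = 0
    model.eval()
    with torch.inference_mode():
        for images, targets in data_loader_test:
            # forward pass and metric updates would go here
            num_processed_samples += len(images)

    # In a distributed run, num_processed_samples would first need to be
    # summed across processes before this check.
    if num_processed_samples != len(dataset_test):
        warnings.warn(
            f"Only {num_processed_samples} of the {len(dataset_test)} samples were "
            "processed; the validation metric may not be comparable across runs."
        )
```

Verifying the change would then amount to running each script twice with `--test-only --use-deterministic-algorithms` on the same GPU and checking that the reported metrics match.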
Tackling this issue requires access to at least 1 GPU to make sure the new evaluation scores are similar to, and more stable than, the previous ones.
cc @datumbox