ci: Limit scope of unittest to one python version #5479


Open
wants to merge 2 commits into main

Conversation

seemethere
Member

Limits the scope of unit testing to one Python version for both macOS and
Windows. These workflows are particularly expensive and long-running, so
executing them on every PR / every push is wasteful, given that the
difference in signal between Python versions is probably negligible.
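The change can be pictured as a filter over the CI build matrix. torchvision's CircleCI workflow list is generated by a Python script, so the sketch below is in that spirit; the names (`ALL_PYTHON_VERSIONS`, `unittest_jobs`, `UNITTEST_PYTHON_VERSION`) are illustrative, not the repository's actual API.

```python
# Hypothetical sketch of limiting per-PR unit-test jobs to a single Python
# version while the rest of the matrix stays available for other job types.
ALL_PYTHON_VERSIONS = ["3.7", "3.8", "3.9", "3.10"]
UNITTEST_PYTHON_VERSION = "3.8"  # the one version kept for per-PR unit tests

def unittest_jobs(os_name):
    """Yield one unit-test job per (os, python) pair, restricted to one version."""
    for python_version in ALL_PYTHON_VERSIONS:
        if python_version != UNITTEST_PYTHON_VERSION:
            continue  # drop the expensive extra matrix entries
        yield {
            "name": f"unittest_{os_name}_py{python_version}",
            "python_version": python_version,
        }

# Two jobs total: one macOS and one Windows unit-test job on Python 3.8.
jobs = [job for os_name in ("macos", "windows") for job in unittest_jobs(os_name)]
```

With four Python versions in the matrix, this cuts the macOS/Windows unit-test jobs from eight to two per push.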

Similar to pytorch/audio#2256

Context: pytorch/audio#2254

Signed-off-by: Eli Uriegas <[email protected]>

@facebook-github-bot

facebook-github-bot commented Feb 24, 2022

💊 CI failures summary and remediations

As of commit bc12c36 (more details on the Dr. CI page):


  • 1/1 failures introduced in this PR

1 failure not recognized by patterns:

Job: CircleCI cmake_macos_cpu
Failing step:

    curl -o conda.sh https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-x86_64.sh
    sh conda.sh -b
    source $HOME/miniconda3/bin/activate
    conda install -yq conda-build cmake
    packaging/build_cmake.sh

This comment was automatically generated by Dr. CI.

Please report bugs/suggestions to the (internal) Dr. CI Users group.


Contributor

@datumbox left a comment


@seemethere Thanks for the PR.

Though I agree that cost saving is very important, the proposed changes seem very restrictive. For example, we just lost the Windows GPU unit tests; was this accidental? Moreover, we lost the ability to check for issues caused by dependencies for specific OS and Python version combinations. Such issues are extremely common at the moment (see, for example, this one from yesterday: #5451).

I know that your team has on its roadmap a project to standardize and harden the CI infra across domains. I'm certain that this will help us reduce the breakages. Shall we review the unit-test jobs once this is completed? Happy to chat offline and summarize here later if that's easier. Thanks!

@seemethere
Copy link
Member Author

> @seemethere Thanks for the PR.
>
> Though I agree that cost saving is very important, the proposed changes seem very restrictive. For example, we just lost the Windows GPU unit tests; was this accidental? Moreover, we lost the ability to check for issues caused by dependencies for specific OS and Python version combinations. Such issues are extremely common at the moment (see, for example, this one from yesterday: #5451).
>
> I know that your team has on its roadmap a project to standardize and harden the CI infra across domains. I'm certain that this will help us reduce the breakages. Shall we review the unit-test jobs once this is completed? Happy to chat offline and summarize here later if that's easier. Thanks!

Losing the Windows GPU unit tests was unintentional; I have no idea how this PR actually led to that.

As for identifying regressions between OSes and specific Python versions, we've found that most of those can be covered by the binary tests and don't need to run in jobs that can take 2+ hours (or at least those were the numbers I observed for torchaudio). Likewise, for PyTorch core we don't typically run unit tests on all Python versions, precisely because we rarely find a difference in signal between them.

I'll see about re-adding the Windows GPU tests.

@datumbox
Contributor

> As for identifying regressions between OSes and specific Python versions, we've found that most of those can be covered by the binary tests

I think the keyword here is "most". The binary tests will indeed capture dependency issues related to conflicts. Unfortunately, they won't capture changes in behaviour such as the ones we often observe in vision. Perhaps the solution is to restructure the tests and run, across the different OS/Python/CUDA configurations, only those tests that can be affected. This would reduce the cost without reducing the quality of the checks we run.
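One way to picture such configuration-aware selection is tagging each test with the configurations it actually exercises and filtering per job. This is a hypothetical sketch, not torchvision's actual test layout; `select_tests` and the tag scheme are made-up names for illustration.

```python
# Hypothetical sketch: per CI configuration, select only the tests whose
# declared tags intersect the active configuration.
def select_tests(tests, config):
    """Return names of tests whose tags overlap the active configuration.

    tests:  iterable of (name, tags) pairs, e.g. ("test_x", {"os:windows"})
    config: set of tags describing the current job, e.g. {"os:windows", "py:3.8"}
    Tests with no tags are configuration-independent and always run.
    """
    return [name for name, tags in tests if not tags or tags & config]

tests = [
    ("test_pure_python_transform", set()),         # runs everywhere
    ("test_cuda_nms_kernel", {"cu:11.3"}),         # only on CUDA jobs
    ("test_windows_dll_loading", {"os:windows"}),  # only on Windows jobs
]
print(select_tests(tests, {"os:windows", "py:3.8"}))
# → ['test_pure_python_transform', 'test_windows_dll_loading']
```

In practice pytest markers plus `skipif` conditions on OS/CUDA would play the role of the tags here; the point is that the full matrix stays, while each cell runs only the tests that cell can affect.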

We are happy to consider adding this as a project on our roadmap once the standardization of the infra is done across domains. Doing this prematurely, before fixing our CI and infra code, would expose us to unnecessary risk.

@bigfootjon
Member

Hey @datumbox,

pytorch/vision is now the number one CircleCI user across all of Meta, beating out pytorch/pytorch by some margin. Is it really necessary for all of these test matrices to run on every single PR? This is an expensive configuration, and I agree with @seemethere that it is wasteful.

> We are happy to consider adding this as a project on our roadmap once the standardization of the infra is done across domains. Doing this prematurely, before fixing our CI and infra code, would expose us to unnecessary risk.

As I understand this PR, the tests will still run on main, so the test coverage is still there, right?

Also, the standardization is not going to happen until H2 2022 at the earliest. That is a lot of wasted resources between now and then if this PR isn't merged.

@seemethere seemethere changed the title ci Limit scope of unittest to one python version ci: Limit scope of unittest to one python version Mar 9, 2022
@datumbox
Contributor

datumbox commented Mar 9, 2022

@bigfootjon Thanks for the input.

As you can see at #5574 (merged less than an hour ago), we are actively working to improve the performance of our CI and reduce the duration of our tests. I don't know how fair the comparison between TorchVision and PyTorch core is, given that core has moved a large number of its tests to GitHub Actions and is still FBcode-first, whereas TorchVision is GitHub-first and uses CircleCI almost exclusively.

I'm a bit surprised that the value of the matrix tests is questioned, given that @seemethere's team has helped us time and time again to resolve weird bugs that pop up on a single platform/CUDA combination. The binary jobs often don't catch all the issues caused by dependencies, so running these tests allows us to catch issues in time.

Concerning the delay in the standardization work, my understanding was that you were looking into doing this in 2022H1. It's a pity if it's delayed by that much, as our combined efforts could indeed help reduce the costs. In any case, as part of TorchVision's 2022H1 roadmap, we plan to improve our model-testing infrastructure, and this will allow us to reduce the costs. TorchVision has over 130 pretrained model variants (some of which are pretty big and thus require beefy machines), which contributes to the increased durations. After this work is concluded, we will be able to selectively run, across the full matrix, only those tests that can be affected by the configuration, which will hopefully bring down the cost.
