Device selection with pytest. #1713

Merged
5 commits merged into tensorflow:master on May 1, 2020

Conversation

Member

@gabrieldemarmiesse commented Apr 21, 2020

Fixes part of #1682

Note that I left out the idea of using multiple GPUs for now, as it's not necessary; one is more than enough.

This focuses on enabling multiprocessing with the GPU available.
This also has another benefit: it no longer matters whether your system has a GPU, since the same tests will get executed on the same devices.

We can now use many workers, even when only one GPU is available to run the tests.
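
For illustration, here is a minimal conftest.py sketch of the idea (the fixture name and the 1 GB memory cap are assumptions for this sketch, not the exact code merged in this PR): each pytest-xdist worker is a separate process, so each one carves a small logical GPU out of the single physical device, and the workers can share that GPU without exhausting its memory.

```python
# conftest.py -- minimal sketch, not the exact fixture merged in this PR.
import pytest
import tensorflow as tf


@pytest.fixture(scope="session", autouse=True)
def _small_logical_gpu_per_worker():
    """Cap GPU memory per test process so pytest-xdist workers can share one GPU."""
    gpus = tf.config.list_physical_devices("GPU")
    if gpus:
        # Each pytest-xdist worker is its own process, so this gives every
        # worker a small logical GPU carved out of the single physical one.
        # Must run before TensorFlow initializes the GPU; the 1024 MB limit
        # is an illustrative value.
        tf.config.set_logical_device_configuration(
            gpus[0],
            [tf.config.LogicalDeviceConfiguration(memory_limit=1024)],
        )
    yield
```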

It's compatible with calling tf.device("GPU:0") manually, but I believe it's cleaner to use this built-in solution, so we should replace the leftovers later on.

Also on the TODO list: instead of using one virtual device per process, we could create several, and then do something like @pytest.mark.with_device(["mirror_strategy"]).
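
As a rough, hypothetical sketch of how such a marker could drive test parametrization (the with_device marker, the device argument, and the hook wiring are assumptions, not an API added by this PR; the mirror_strategy case would need a tf.distribute scope rather than a plain tf.device, so it is omitted here):

```python
# Hypothetical sketch of the marker idea described above.
import pytest
import tensorflow as tf


def pytest_generate_tests(metafunc):
    # Run a test once per device name listed in its with_device marker.
    marker = metafunc.definition.get_closest_marker("with_device")
    if marker and "device" in metafunc.fixturenames:
        metafunc.parametrize("device", marker.args[0])


@pytest.mark.with_device(["cpu", "gpu"])
def test_dense_layer(device):
    # Map the marker value to a TensorFlow device string and run under it.
    with tf.device("/cpu:0" if device == "cpu" else "/gpu:0"):
        out = tf.keras.layers.Dense(4)(tf.ones((2, 3)))
    assert out.shape == (2, 4)
```

A custom marker like this would also need to be registered in the pytest configuration to avoid unknown-marker warnings.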

@bot-of-gabrieldemarmiesse

@AakashKumarNain

You are the owner of some files modified in this pull request.
Would you kindly review the changes whenever you have the time?
Thank you very much.

Member

@seanpmorgan left a comment

Looks very good, thanks for continuing to build out our pytest suite! I tested locally and got the expected results. Small doc change requested.

I also had trouble finding where the op logging was coming from. For example, at the end of a test run it'll output:

transform/strided_slice_2/stack_1: (Const): /job:localhost/replica:0/task:0/device:GPU:0
transform/strided_slice_2/stack_2: (Const): /job:localhost/replica:0/task:0/device:GPU:0
transform/assert_rank/rank: (Const): /job:localhost/replica:0/task:0/device:GPU:0
transform/assert_rank/Shape: (Const): /job:localhost/replica:0/task:0/device:GPU
...

This seems great for debugging, but it is a bit unwieldy to look through if you're testing something other than device placement.
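
(For reference, per-op placement lines like those are what TensorFlow prints when device placement logging is switched on, so a likely culprit is a stray call to that toggle; a minimal sketch:)

```python
import tensorflow as tf

# When enabled, TensorFlow prints an "<op>: (<OpType>): /job:.../device:GPU:0"
# line for every op it places, which matches the output quoted above.
tf.debugging.set_log_device_placement(True)

# The default (False) keeps test output quiet.
tf.debugging.set_log_device_placement(False)
```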

@gabrieldemarmiesse
Member Author

Yeah, I totally forgot to remove the logging of the devices. It's done now.

Member

@seanpmorgan left a comment

LGTM thanks!

@seanpmorgan merged commit 36fe9fa into tensorflow:master on May 1, 2020
ashutosh1919 pushed a commit to ashutosh1919/addons that referenced this pull request Jul 12, 2020
* The device to execute each test is set in stone.
* Update CONTRIBUTING.md
* Removed device placement logging and added comment.
jrruijli pushed a commit to jrruijli/addons that referenced this pull request Dec 23, 2020
* The device to execute each test is set in stone.
* Update CONTRIBUTING.md
* Removed device placement logging and added comment.
4 participants