
enhance partition utilities #1191


Merged: 3 commits into Project-MONAI:master on Nov 5, 2020

Conversation

@wyli (Contributor) commented on Nov 4, 2020

enhances the partition utilities, part of #816

Status

Ready

Types of changes

  • Non-breaking change (fix or new feature that would not break existing functionality).
  • Quick tests passed locally by running ./runtests.sh --quick.
  • In-line docstrings updated.
  • Documentation updated, tested make html command in the docs/ folder.

@wyli requested review from ericspod and Nic-Ma on November 4, 2020 22:23
@wyli force-pushed the enhance-partition-utils branch from 86688f0 to e386bba on November 4, 2020 22:24
@wyli (Contributor, Author) commented on Nov 4, 2020

Addressed some corner cases; not sure whether we should use np.random.RandomState instead of a seed for the shuffle.
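For context, a minimal sketch of the two seeding styles being weighed here, using plain NumPy rather than the actual MONAI code (illustrative only):

```python
import numpy as np

data = list(range(10))

# style 1: seed the global NumPy RNG, then shuffle in place
np.random.seed(0)
shuffled_a = list(data)
np.random.shuffle(shuffled_a)

# style 2: a dedicated RandomState instance, leaving the global RNG alone
rs = np.random.RandomState(0)
shuffled_b = list(data)
rs.shuffle(shuffled_b)

assert shuffled_a == shuffled_b  # same seed, same permutation
```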

@wyli force-pushed the enhance-partition-utils branch from e386bba to 998daf6 on November 4, 2020 23:11
@Nic-Ma (Contributor) commented on Nov 4, 2020

> Addressed some corner cases; not sure whether we should use np.random.RandomState instead of a seed for the shuffle.

Hi @wyli,

The current shuffle logic works fine across multiple processes; I am not sure whether np.random.RandomState would also work well.
We are using this API to partition the dataset for CacheDataset and SmartCacheDataset in distributed training in Clara.

Thanks.
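A rough sketch of the distributed pattern described above, assuming the partition_dataset utility discussed in this PR and an already-initialised torch.distributed process group (the file names and seed here are illustrative, not taken from the PR):

```python
import torch.distributed as dist
from monai.data import partition_dataset

# assumes dist.init_process_group(...) has already been called on every rank
data = [f"image_{i}.nii.gz" for i in range(100)]  # illustrative data list
rank = dist.get_rank()
world_size = dist.get_world_size()

# every rank passes the same seed, so the global shuffle is identical
# everywhere; each rank then keeps only its own, non-overlapping slice
partitions = partition_dataset(data, num_partitions=world_size, shuffle=True, seed=1234)
local_data = partitions[rank]
# local_data would then feed a CacheDataset / SmartCacheDataset on this rank
```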

@wyli (Contributor, Author) commented on Nov 5, 2020

> The current shuffle logic works fine across multiple processes; I am not sure whether np.random.RandomState would also work well.

OK, I was not aware of that. Since each process uses a different partition, I guess it's fine to share the RandomState instance? It would be great to create a few test cases so that I can understand the context. In any case, we shouldn't change the global seed here.
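To illustrate the global-seed concern, a small NumPy-only check showing that a dedicated RandomState leaves the global RNG untouched (illustrative, not part of this PR):

```python
import numpy as np

before = np.random.get_state()

# shuffling with a local RandomState does not consume or reset
# the global NumPy RNG that np.random.seed controls
rs = np.random.RandomState(0)
items = list(range(5))
rs.shuffle(items)

after = np.random.get_state()
assert before[0] == after[0] and (before[1] == after[1]).all() and before[2] == after[2]
```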

@Nic-Ma (Contributor) commented on Nov 5, 2020

Hi @wyli,

The multiple-process usage is something like this: https://github.com/Project-MONAI/tutorials/blob/master/acceleration/distributed_training/unet_training_smartcache.py#L176
I am not sure whether all the processes can get exactly the same shuffle result with np.random.RandomState.
Could you please help double-check? If so, everything else looks good to me and we can merge your PR immediately.
Then I can create a PR to update the distributed training examples.

Thanks.
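One way to double-check this concern is to simulate two ranks in a single process: independent RandomState instances built from the same seed produce identical permutations (a sketch, not code from the PR or the tutorial):

```python
import numpy as np

def shuffled_indices(seed, n=100):
    # each simulated "rank" builds its own RandomState from the shared seed
    idx = np.arange(n)
    np.random.RandomState(seed).shuffle(idx)
    return idx

rank0 = shuffled_indices(1234)
rank1 = shuffled_indices(1234)
assert (rank0 == rank1).all()  # identical permutation on every rank
```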

@wyli (Contributor, Author) commented on Nov 5, 2020

> I am not sure whether all the processes can get exactly the same shuffle result with np.random.RandomState. Could you please help double-check?

I don't see any issue.

@wyli merged commit 62a2164 into Project-MONAI:master on Nov 5, 2020
@wyli deleted the enhance-partition-utils branch on April 12, 2021 14:21