Description
Hello, I was wondering whether it would be possible to have a small code example where the same network is cloned onto different GPUs, with all clones sharing the same parameters.
For instance, I would like a setup where different subprocesses can train the model separately (say 8 subprocesses, each responsible for training a clone on one GPU). The updates could then be accumulated into a common network, and all GPU clones could periodically synchronize their parameters with those of the common network, or something along those lines (roughly like the sketch below).
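
To make the question concrete, here is a rough sketch of the kind of thing I have in mind. It assumes PyTorch with `torch.multiprocessing` and a shared-memory CPU copy of the model via `share_memory()`; the `nn.Linear` model, the random data, and the `sync_every` interval are just placeholders, not a proposed implementation.

```python
# Minimal sketch (PyTorch assumed): one shared CPU model, one GPU clone per worker,
# each worker trains locally and periodically syncs with the shared model.
import torch
import torch.multiprocessing as mp
import torch.nn as nn


def worker(rank, shared_model, sync_every=10, steps=100):
    device = torch.device(f"cuda:{rank}")
    # Local clone on this worker's GPU, initialized from the shared parameters.
    local_model = nn.Linear(32, 1).to(device)
    local_model.load_state_dict(shared_model.state_dict())
    optimizer = torch.optim.SGD(local_model.parameters(), lr=0.01)

    for step in range(steps):
        # Placeholder batch; real data loading would go here.
        x = torch.randn(16, 32, device=device)
        y = torch.randn(16, 1, device=device)

        optimizer.zero_grad()
        loss = nn.functional.mse_loss(local_model(x), y)
        loss.backward()
        optimizer.step()

        if (step + 1) % sync_every == 0:
            with torch.no_grad():
                # Push this worker's parameters into the common (shared-memory, CPU)
                # network. A real setup might accumulate deltas or average instead
                # of overwriting.
                for shared_p, local_p in zip(shared_model.parameters(),
                                             local_model.parameters()):
                    shared_p.copy_(local_p.detach().cpu())
            # Pull the (possibly updated) common parameters back into the GPU clone.
            local_model.load_state_dict(shared_model.state_dict())


if __name__ == "__main__":
    mp.set_start_method("spawn")
    shared_model = nn.Linear(32, 1)
    shared_model.share_memory()  # parameters live in shared memory, visible to all workers

    processes = []
    for rank in range(torch.cuda.device_count()):
        p = mp.Process(target=worker, args=(rank, shared_model))
        p.start()
        processes.append(p)
    for p in processes:
        p.join()
```

Is something along these lines the recommended approach, or is there a better supported way to keep the clones' parameters in sync?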