the current plan is to use a 'seed checkpoint' to initialize the
pipeline parallel model chunks after moving them from the meta device
to cuda/empty.
non-persistent buffers are incompatible with this approach, as they are
missing from the checkpoint and thus require manual init.
an alternative is to manually run the initializer for just the
non-persistent buffers after loading a seed checkpoint, but the change
in this commit is nearly equivalent and needs fewer code changes.
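
a minimal sketch of the problem described above, not the code from this change. `Toy`, `make_table`, `load_from_seed` and the `manual_init` flag are hypothetical names, cpu stands in for cuda, and the `-1` fill only simulates the uninitialized memory left behind by `to_empty`.

```python
import torch
import torch.nn as nn


def make_table() -> torch.Tensor:
    # Hypothetical stand-in for a precomputed buffer.
    return torch.arange(4, dtype=torch.float32)


class Toy(nn.Module):
    def __init__(self, persistent: bool):
        super().__init__()
        self.linear = nn.Linear(4, 4)
        # persistent=False keeps the buffer out of state_dict().
        self.register_buffer("table", make_table(), persistent=persistent)


def load_from_seed(persistent: bool, manual_init: bool = False) -> torch.Tensor:
    # "Seed checkpoint": the state dict of a normally initialized model.
    seed = Toy(persistent).state_dict()

    # Build the model on the meta device, then allocate real but
    # uninitialized storage (cpu here as a stand-in for cuda).
    with torch.device("meta"):
        model = Toy(persistent)
    model.to_empty(device="cpu")
    model.table.fill_(-1.0)  # simulate garbage left by to_empty

    # Loading restores only what the checkpoint contains.
    model.load_state_dict(seed)

    if manual_init:
        # Alternative: re-run the initializer for just this buffer.
        model.table.copy_(make_table())
    return model.table.clone()
```

with `persistent=True` the buffer is restored by `load_state_dict`; with `persistent=False` it keeps its garbage value unless `manual_init=True` re-runs the initializer afterwards.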
ghstack-source-id: b482284
Pull Request resolved: #201