Support complicated use cases with TiedLayerSpec #7208
Extend the builtin `getattr` to a recursive version `PipelineModule._recursive_getattr` for nested tied weights, e.g., "linear.weight". Meanwhile, sort tie_keys in `PipelineModule._index_tied_modules` to avoid hanging. Signed-off-by: Mingjie Li <[email protected]>
Thank you @limjcst for the contribution! This is a significant improvement.
nv-accelerate-v100 failed, raising "invalid command 'bdist_wheel'". However, this job succeeded in another run. Note that the failed job used "cached wheel-0.46.1-py3-none-any.whl.metadata".
Thanks @limjcst - I saw this failure on another PR and will take a look and merge the fixes into your PR when ready.
@limjcst - it looks like the
Nevertheless, upgrading to
@agronholm - yes, we have a PR for one here which we will prioritize merging as we know this is needed.
I want to reuse a composed module in the pipeline. For example, consider a `MyModule` that has a member `linear`, which is also a module. `MyModule.linear.weight` should be synchronized among related ranks. As a result, I add `linear.weight` to `TiedLayerSpec.tied_weight_attr`.

BTW, I generate the whole `tied_weight_attr` programmatically.

However, the builtin `getattr` used by `PipelineModule` fails to find a nested attribute like `linear.weight`.

Hence, this PR first extends the builtin `getattr` to a recursive version, `PipelineModule._recursive_getattr`, which accesses each attribute segment one by one.

Meanwhile, the order of tied weights matters in synchronization. This PR proposes sorting `tie_keys` in `PipelineModule._index_tied_modules` to avoid hanging.
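
To illustrate the two changes, here is a minimal sketch in plain Python. It is not the code in this PR: `recursive_getattr`, the `SimpleNamespace` stand-ins for modules, and the two tie registries are hypothetical names made up for the example. It only shows why the builtin `getattr` rejects a dotted name, and how sorting keys gives every rank the same iteration order. That a mismatched order is what causes the hang is an assumption here, not something verified against DeepSpeed.

```python
from functools import reduce
from types import SimpleNamespace


def recursive_getattr(obj, dotted_name):
    """Resolve a dotted path such as "linear.weight" one segment at a time."""
    return reduce(getattr, dotted_name.split("."), obj)


# Stand-in for a composed module (plain objects, not torch modules).
my_module = SimpleNamespace(linear=SimpleNamespace(weight=[1.0, 2.0]))

# The builtin getattr treats the whole string as one attribute name and fails.
try:
    getattr(my_module, "linear.weight")
    builtin_found = True
except AttributeError:
    builtin_found = False

# The recursive version walks "linear" first, then "weight".
nested_weight = recursive_getattr(my_module, "linear.weight")

# Hypothetical tie registries that were filled in a different order on two ranks.
ties_on_rank0 = {"embed": [0, 3], "head": [1, 3]}
ties_on_rank1 = {"head": [1, 3], "embed": [0, 3]}

# Sorting the keys yields the same sequence on both ranks, regardless of insertion order.
order_rank0 = sorted(ties_on_rank0)
order_rank1 = sorted(ties_on_rank1)
```

Splitting on `.` keeps single-segment names such as `weight` working unchanged, so existing `tied_weight_attr` values would not need to be rewritten.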