
simplify meta_init (rope embeddings) #110

@lessw2020

Description


Currently we cannot use the simpler `with torch.device('meta')` pattern together with a lambda for FSDP's `param_init_fn`. Instead, we have a pre- and a post-handler to move the buffers and modules. The pattern we would like to support is sketched below.
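A minimal sketch of that simpler pattern (the `ToyBlock` module and its `freqs` buffer are illustrative stand-ins, not the actual model code):

```python
import torch
import torch.nn as nn
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

class ToyBlock(nn.Module):
    """Stand-in for a transformer block carrying a buffer,
    mimicking the rope embeddings situation."""
    def __init__(self, dim: int = 64):
        super().__init__()
        self.proj = nn.Linear(dim, dim)
        self.register_buffer("freqs", torch.arange(dim, dtype=torch.float32))

# 1) Build the model on the meta device -- no real storage is allocated.
with torch.device("meta"):
    model = ToyBlock()

# 2) Let FSDP materialize each submodule. to_empty() allocates real
#    storage but leaves values uninitialized, so each module still needs
#    to re-initialize its own params/buffers afterwards.
fsdp_model = FSDP(
    model,
    param_init_fn=lambda m: m.to_empty(device="cuda", recurse=False),
)
```

(This assumes a distributed process group is already initialized, which FSDP requires.)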
Depending on the variation you will hit a couple of different errors, but the one currently blocking the simple meta init is:

NotImplementedError: Cannot copy out of meta tensor; no data! Please use torch.nn.Module.to_empty() instead of torch.nn.Module.to() when moving module from meta to a different device.

This looks like we need to implement the equivalent of `reset_parameters()` for the RoPE embeddings class.
Filing this issue so we can land the working meta init first, then come back and simplify it by resolving this.
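A minimal sketch of what that could look like, assuming a llama-style RoPE module with a precomputed `freqs_cis` buffer (the class and method names here are illustrative, not necessarily the exact code in the repo):

```python
import torch
import torch.nn as nn

class RotaryEmbedding(nn.Module):
    def __init__(self, dim: int, max_seq_len: int, theta: float = 10000.0):
        super().__init__()
        self.dim = dim
        self.max_seq_len = max_seq_len
        self.theta = theta
        # Precomputed complex rotations; recomputable, so non-persistent.
        self.register_buffer(
            "freqs_cis", self._precompute_freqs_cis(), persistent=False
        )

    def _precompute_freqs_cis(self) -> torch.Tensor:
        # Standard llama-style precomputation of the rotary angles
        # as unit-magnitude complex numbers.
        freqs = 1.0 / (
            self.theta ** (torch.arange(0, self.dim, 2).float() / self.dim)
        )
        t = torch.arange(self.max_seq_len, dtype=torch.float32)
        angles = torch.outer(t, freqs)
        return torch.polar(torch.ones_like(angles), angles)

    def reset_parameters(self) -> None:
        # Called after to_empty() has allocated real (uninitialized)
        # storage: recompute the buffer in place rather than trying to
        # copy data out of a meta tensor, which is what raises the
        # NotImplementedError above.
        with torch.no_grad():
            self.freqs_cis.copy_(self._precompute_freqs_cis())
```

With a `reset_parameters()` like this, the meta-device init path can materialize the module with `to_empty()` and then re-fill the buffer, removing the need for the pre/post handlers.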
