Torchao opt resuming from ckpt requires `weights_only=False`

In torchtune, I can't resume from a checkpoint when using a torchao optimizer:

```
  File "/data/users/felipemello/torchtune/torchtune/training/checkpointing/_utils.py", line 249, in safe_torch_load
    state_dict = torch.load(
                 ^^^^^^^^^^^
  File "/home/felipemello/.conda/envs/torchtune/lib/python3.11/site-packages/torch/serialization.py", line 1486, in load
    raise pickle.UnpicklingError(_get_wo_message(str(e))) from None
_pickle.UnpicklingError: Weights only load failed. This file can still be loaded, to do so you have two options, do those steps only if you trust the source of the checkpoint. 
        (1) In PyTorch 2.6, we changed the default value of the `weights_only` argument in `torch.load` from `False` to `True`. Re-running `torch.load` with `weights_only` set to `False` will likely succeed, but it can result in arbitrary code execution. Do it only if you got the file from a trusted source.
        (2) Alternatively, to load with `weights_only=True` please check the recommended steps in the following error message.
        WeightsUnpickler error: Unsupported global: GLOBAL torchao.prototype.low_bit_optim.subclass_8bit.OptimState8bit was not an allowed global by default. Please use `torch.serialization.add_safe_globals([OptimState8bit])` or the `torch.serialization.safe_globals([OptimState8bit])` context manager to allowlist this global if you trust this class/function.
```

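To reproduce, download the model, train with the 8-bit optimizer, then try to resume from the `epoch_0` checkpoint: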
```
tune download meta-llama/Llama-3.2-1B-Instruct --output-dir /tmp/Llama-3.2-1B-Instruct --ignore-patterns "original/consolidated.00.pth"
```

```
tune run full_finetune_single_device --config llama3_2/1B_full_single_device epochs=2 max_steps_per_epoch=20 optimizer=torchao.prototype.low_bit_optim.AdamW8bit
```

```
tune run full_finetune_single_device --config llama3_2/1B_full_single_device epochs=2 max_steps_per_epoch=20 optimizer=torchao.prototype.low_bit_optim.AdamW8bit resume_from_checkpoint=True checkpointer.checkpoint_files=["epoch_0/model-00001-of-00001.safetensors"] 
```

Torchao opt resuming from ckpt requires `weights_only=False` #1885

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Torchao opt resuming from ckpt requires weights_only=False #1885

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions

Torchao opt resuming from ckpt requires `weights_only=False` #1885