[WIP] Apply SuperBlock to Llama #1047
base: main
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/1047
Note: Links to docs will display an error until the docs builds have been completed. This comment was automatically generated by Dr. CI and updates every 15 minutes.
```diff
@@ -120,7 +120,7 @@ def mlp_only(mod, name):

 def superblock_only(mod, name):
-    return isinstance(mod, SupermaskLinear) and "mlp" in name
+    return isinstance(mod, SupermaskLinear)  # and "mlp" in name
```
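For context, a predicate like `superblock_only(mod, name)` is typically consumed by iterating over a model's named modules and acting only on the matches. The sketch below is illustrative, not code from the PR: the classes and module names are stand-ins, and it shows how dropping the `"mlp" in name` check widens the match from MLP layers to every `SupermaskLinear`.

```python
# Hypothetical sketch: how a (module, name) filter predicate is consumed.
# SupermaskLinear/Other are stand-ins for the real torchao classes.

class SupermaskLinear:
    pass

class Other:
    pass

def superblock_only(mod, name):
    # After the change in this diff: match every SupermaskLinear,
    # not only those whose name contains "mlp".
    return isinstance(mod, SupermaskLinear)  # and "mlp" in name

# Stand-in for iterating model.named_modules()
modules = {
    "mlp.fc1": SupermaskLinear(),
    "attn.qkv": SupermaskLinear(),
    "norm": Other(),
}

matched = [name for name, mod in modules.items() if superblock_only(mod, name)]
# With the old predicate only "mlp.fc1" would match; now both
# SupermaskLinear layers do.
```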
@mostafaelhoushi Should this be changed to `SupermaskReplacementClass`?
hmmm... the `SupermaskReplacementClass` constructor requires a lot of arguments like `linear_sparsity`, `linear_sp_tilesize`, etc. How will we pass them here?
I think I need to do some more refactoring:
- the ViT benchmark code assumes that there is a model checkpoint trained with SuperBlock, and hence has `SupermaskLinear` layers and parameters
- the GPT-Fast code I wrote did a hack in which it converted `Linear` layers to `SupermaskLinear` layers, then applied BSR sparsification.
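The GPT-Fast-style hack described above can be sketched as a module-swap pass: walk the model, replace each `Linear` with a shape-matched `SupermaskLinear`, then hand the result to the BSR sparsifier. Everything below is a hypothetical stand-in (plain classes instead of `torch.nn` modules, a dict instead of a real model) meant only to show the shape of the conversion, not the PR's actual implementation.

```python
# Stand-ins for nn.Linear and torchao's SupermaskLinear.
class Linear:
    def __init__(self, in_features, out_features):
        self.in_features = in_features
        self.out_features = out_features

class SupermaskLinear(Linear):
    @classmethod
    def from_linear(cls, lin):
        # Carry the original layer's shape into the supermask variant.
        return cls(lin.in_features, lin.out_features)

def swap_linears(modules):
    # Replace every plain Linear with a SupermaskLinear; leave other
    # module types untouched. BSR sparsification would run afterwards.
    return {name: SupermaskLinear.from_linear(m) if type(m) is Linear else m
            for name, m in modules.items()}

model = {"mlp.fc1": Linear(1024, 4096), "mlp.fc2": Linear(4096, 1024)}
model = swap_linears(model)
```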
Ah, I think I read your changes wrong; I was assuming you created a `SupermaskReplacementClass` to combine `SupermaskLinear` and `SupermaskConv`, but you're still using those under the hood. I think this should be fine actually.
Force-pushed from c9738ac to 252402b
Force-pushed from 5c59c94 to 7f7bd85
should be `torchao/prototype/sparsity`? #1013
I think this is because this PR forked from an old commit that had superblock in `torchao/sparsity/prototype`. When finalizing the PR we can rebase on top of main and change the path of the directory.
Still work in progress.
To run: