Export TorchTune llama3_2_vision in ET #5911


Merged

merged 30 commits into main from jz/tt-llama-2 on Nov 14, 2024
Conversation

@jackzhxng (Contributor) commented Oct 5, 2024

Summary

Add llama3_2_vision's text decoder as a TorchTune-exportable model.

KV cache (currently commented out) and quantization are not yet supported. (A hedged export sketch follows the PR chain below.)

Test plan

Tested with the next PR in this chain, #6610, following the instructions in its description.

PR chain:
- [Add kwarg example inputs to eager model base](#5765)
- [Llama2 model cleanup](#5859)
- [Accept model type parameter in export_llama](#5910)
- **YOU ARE HERE ~>** [Export TorchTune llama3_2_vision in ET](#5911)
- [Runner changes for TorchTune Llama3.2 vision text decoder](#6610)
- [Add et version of TorchTune MHA for swapping with custom op](#5912)
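
For orientation, here is a hedged sketch of what an export invocation for this decoder might look like, modeled on the Stories110M commands in the test plans further down. The `--model` flag spelling and the `llama3_2_vision` model key are assumptions inferred from the titles of #5910 and this PR, not commands confirmed anywhere on this page; `-kv` is omitted because KV cache is not yet supported here:

```
# Sketch only: the --model flag and the llama3_2_vision key are assumed from the
# titles of #5910 and #5911; they are not confirmed by this page.
python -m examples.models.llama.export_llama \
  --model llama3_2_vision \
  -c <checkpoint.pth> \
  -p <params.json> \
  -d fp32
```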

pytorch-bot bot commented Oct 5, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/5911

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEV

There is 1 currently active SEV. If your PR is affected, please view it below:

✅ No Failures

As of commit 9777e23 with merge base 4e83f24:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot added the "CLA Signed" label (this label is managed by the Facebook bot; authors need to sign the CLA before a PR can be reviewed) Oct 5, 2024
@jackzhxng changed the base branch from main to jz/tt-llama October 5, 2024 03:40
@jackzhxng marked this pull request as draft October 5, 2024 03:41
facebook-github-bot pushed a commit that referenced this pull request Oct 9, 2024
Summary:
For situations where the forward has non-positional arguments, such as https://github.com/pytorch/torchtune/blob/3c450ef5f1fbe8237f899e942fd5222491a47ca7/torchtune/modules/transformer.py#L519

PR chain:
- **YOU ARE HERE ~>** [Add kwarg example inputs to eager model base](#5765)
- [Llama2 model cleanup](#5859)
- [Accept model type parameter in export_llama](#5910)
- [Export TorchTune llama3_2_vision in ET](#5911)
- [Add et version of TorchTune MHA for swapping with custom op](#5912)

Pull Request resolved: #5765

Test Plan:
Exported Stories110M model.
```
wget "https://huggingface.co/karpathy/tinyllamas/resolve/main/stories110M.pt"
echo '{"dim": 768, "multiple_of": 32, "n_heads": 12, "n_layers": 12, "norm_eps": 1e-05, "vocab_size": 32000}' > params.json
python -m examples.models.llama2.export_llama -c stories110M.pt -p params.json -X -kv
```

Reviewed By: tarun292

Differential Revision: D64027696

Pulled By: dvorjackz

fbshipit-source-id: 15ecfb458c6194159140d4c601e5443a2e524fdc
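
For readers unfamiliar with why keyword example inputs matter for export: `torch.export` takes positional example inputs as a tuple and keyword example inputs as a separate dict, so a forward like the linked TorchTune `TransformerDecoder.forward`, which accepts non-positional arguments, needs the latter plumbed through. A minimal, self-contained sketch using only the public `torch.export` API (the toy module below is illustrative, not code from this PR):

```python
from typing import Optional

import torch
from torch import nn


class ToyDecoder(nn.Module):
    """Illustrative stand-in for a forward with a keyword-only argument."""

    def forward(
        self, tokens: torch.Tensor, *, input_pos: Optional[torch.Tensor] = None
    ) -> torch.Tensor:
        out = tokens.float()
        if input_pos is not None:  # export traces the branch taken by the examples
            out = out + input_pos.float()
        return out


# Positional example inputs go in a tuple; keyword example inputs in a dict.
example_args = (torch.randint(0, 100, (1, 8)),)
example_kwargs = {"input_pos": torch.arange(8).unsqueeze(0)}

# torch.export.export accepts the kwargs dict alongside the args tuple.
exported = torch.export.export(ToyDecoder(), example_args, kwargs=example_kwargs)
print(exported)
```
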
facebook-github-bot pushed a commit that referenced this pull request Oct 14, 2024
Summary:
- Removes redundant steps in the Llama2 export
- Factors out checkpointing to be shared with future Llama models (namely 3.2 multimodal)
- Comments and orders code more clearly

PR chain:
- [Add kwarg example inputs to eager model base](#5765)
- **YOU ARE HERE ~>** [Llama2 model cleanup](#5859)
- [Accept model type parameter in export_llama](#5910)
- [Export TorchTune llama3_2_vision in ET](#5911)
- [Add et version of TorchTune MHA for swapping with custom op](#5912)


Test Plan:
Ensure export + eval is similar before and after for Stories 110M:
```
python -m examples.models.llama2.eval_llama -c <checkpoint.pth> -p <params.json> -t <tokenizer.model/bin> -d fp32 --max_seq_len 2048 --limit 1000
```


Before:
```
wikitext: {'word_perplexity,none': 14464.645927166595, 'word_perplexity_stderr,none': 'N/A', 'byte_perplexity,none': 5.99788806086652, 'byte_perplexity_stderr,none': 'N/A', 'bits_per_byte,none': 2.5844545973083983, 'bits_per_byte_stderr,none': 'N/A', 'alias': 'wikitext'}
```

After:
```
wikitext: {'word_perplexity,none': 14464.299192404438, 'word_perplexity_stderr,none': 'N/A', 'byte_perplexity,none': 5.997861173678705, 'byte_perplexity_stderr,none': 'N/A', 'bits_per_byte,none': 2.584448130015399, 'bits_per_byte_stderr,none': 'N/A', 'alias': 'wikitext'}
```

Reviewed By: dbort

Differential Revision: D64145852

Pulled By: dvorjackz
@jackzhxng added the "release notes: examples" label (changes to any of our example LLM integrations, such as Llama3 and Llava) Nov 1, 2024
facebook-github-bot pushed a commit that referenced this pull request Nov 13, 2024
Summary:
Specify model to export in the CLI.


Test Plan:
Exported the Stories110M model.
```
python -m examples.models.llama.export_llama -c stories110M/stories110M.pt -p stories110M/params.json -X -kv
```

PR chain:
- [Add kwarg example inputs to eager model base](#5765)
- [Llama2 model cleanup](#5859)
- **YOU ARE HERE ~>** [Accept model type parameter in export_llama](#5910)
- [Export TorchTune llama3_2_vision in ET](#5911)
- [Runner changes for TorchTune Llama3.2 vision text decoder](#6610)
- [Add et version of TorchTune MHA for swapping with custom op](#5912)

Reviewed By: helunwencser

Differential Revision: D65612837

Pulled By: dvorjackz
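
The test plan above exercises the default model type; below is a hedged illustration of passing the selector explicitly. The flag spelling and accepted values are assumptions, since the commit message does not show them:

```
# Assumed flag spelling and value; the test plan above relies on the default.
python -m examples.models.llama.export_llama --model llama2 -c stories110M/stories110M.pt -p stories110M/params.json -X -kv
```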
Base automatically changed from jz/tt-llama-rebased to main November 13, 2024 03:50
@jackzhxng marked this pull request as ready for review November 13, 2024 15:46

```
self.model_ = prune_output_vocab(self.model_, output_prune_map)

# if self.use_kv_cache:
```
Contributor commented:

Remove?

Contributor Author replied:

Will be uncommented in #6643

```
@@ -951,7 +954,9 @@ def _load_llama_model(
     use_kv_cache,
     use_sdpa_with_kv_cache,
     enable_dynamic_shape,
-    model.params,
+    model.max_seq_len,
```
Contributor commented:

Previously, all the models were getting it from model.params. Is it guaranteed that all models will have these available on them as attributes directly? Hopefully CI catches it if they don't.

Contributor Author replied:

Yeah, hopefully CI catches it, but it should be okay, since only two model.py files use this export_llama_lib at the moment and both have it defined.

@larryliu0820 (Contributor) commented:

Also, can you add it to gather_test_models.py? Once added, it automatically adds two jobs (macOS and Linux) to test it.

@facebook-github-bot

@dvorjackz has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@jackzhxng merged commit 27f31cd into main Nov 14, 2024
66 of 68 checks passed
@jackzhxng deleted the jz/tt-llama-2 branch November 14, 2024 22:01