Conversation

@CharlieFRuan commented Aug 23, 2024

@CharlieFRuan marked this pull request as ready for review August 23, 2024 15:55
@CharlieFRuan merged commit 055f568 into mlc-ai:main Aug 23, 2024
CharlieFRuan added a commit to mlc-ai/web-llm that referenced this pull request Aug 23, 2024
This PR adds the newly released Phi-3.5-mini, introducing the following
`model_id`s to our prebuilt model list:
- `Phi-3.5-mini-instruct-q4f16_1-MLC` (4k KVCache)
- `Phi-3.5-mini-instruct-q4f32_1-MLC` (4k KVCache)
- `Phi-3.5-mini-instruct-q4f16_1-MLC-1k` (1k KVCache)
- `Phi-3.5-mini-instruct-q4f32_1-MLC-1k` (1k KVCache)

See mlc-ai/binary-mlc-llm-libs#136 for the commits of TVM and MLC-LLM
this is compiled with.

Note that Phi-3.5-mini supports up to 128K context (unlike Phi-3-mini,
which only has 4k), thanks to RoPE scaling, which MLC-LLM supports. You can
take advantage of this in WebLLM by increasing
`ModelRecord.overrides.context_window_size` or by specifying it in
`ChatOptions` when loading a model, as long as there is enough VRAM; see the
sketch below.
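
As a rough sketch of the `ChatOptions` route (the model id and the 8192 value
are just examples; a larger window needs correspondingly more VRAM):

```ts
import { CreateMLCEngine } from "@mlc-ai/web-llm";

async function main() {
  // Override the default 4k context window via ChatOptions (third argument).
  // 8192 is an illustrative value, not a recommendation.
  const engine = await CreateMLCEngine(
    "Phi-3.5-mini-instruct-q4f16_1-MLC",
    { initProgressCallback: (p) => console.log(p.text) }, // engine config
    { context_window_size: 8192 },                        // ChatOptions override
  );

  // Standard OpenAI-style chat completion call against the loaded model.
  const reply = await engine.chat.completions.create({
    messages: [{ role: "user", content: "Summarize this long document: ..." }],
  });
  console.log(reply.choices[0].message.content);
}

main();
```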
jingyi-zhao-01 pushed a commit to jingyi-zhao-01/web-llm that referenced this pull request Dec 8, 2024