Add phi 3 chat template #6857
Merged: ggerganov merged 2 commits into ggml-org:master from tristandruyen:add-phi3-chat-template on Apr 24, 2024
@tristandruyen This template uses a space character `' '` after each special token, but in your cpp version you used a new line `"\n"`. Can you check which is correct?
According to the chat format documented in the Hugging Face READMEs, there should be a `\n` after `<|user|>` and `<|end|>`:
https://huggingface.co/microsoft/Phi-3-mini-128k-instruct/blob/main/README.md#chat-format
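For reference, the layout documented there is roughly the following (illustrative placeholder question, not copied verbatim from the README):

```
<|user|>
How do I cook pasta?<|end|>
<|assistant|>
```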
While they do not explicitly state that there should be a `\n` after `<|assistant|>`, the model nearly always generates a `\n` as its first token when it is left out...
It's best to look at the code directly rather than the README; human copy-paste can sometimes introduce mistakes.
In the same repo that you linked, you can see the jinja template code here:
https://huggingface.co/microsoft/Phi-3-mini-128k-instruct/blob/main/tokenizer_config.json#L119
This file has `'<|user|>' + '\n'`, which confirms that there should be a new line after special tokens. However, that does not explain why your jinja version does not have it. Remember, jinja templates are deterministic, meaning your cpp code needs to follow exactly what is coded inside the template. That means we can fix this PR in one of two ways: (1) correct the jinja template to have `\n`, or (2) correct the cpp implementation to output `' '` instead of a new line.
It seems there are even more versions of the template. I guess I had an outdated tokenizer_config.json when making my GGUFs, because I just took the jinja template from the GGUF metadata of my recent phi3 quants, which should just be a copy of what was in the tokenizer_config.json, shouldn't it? I think I found the reason: the Hugging Face GUI for displaying GGUF metadata seems to replace `\n` with whitespace for some odd reason.
I'll update the template with the correct one.
Edit: there are indeed 2 versions, but they both use a new line:
- `'<|' + message['role'] + '|>' + '\n'`: https://huggingface.co/microsoft/Phi-3-small-128k-instruct/blob/main/tokenizer_config.json#L10
- `<|user|>`: https://huggingface.co/microsoft/Phi-3-mini-128k-instruct/blob/main/tokenizer_config.json#L119

I think it's best to somehow scan all templates in the existing phi-3 family, then make a fix PR. Without careful research, I'm afraid there will be more confusion in the future.
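As a minimal, self-contained sketch of that `\n`-after-special-token convention (the `chat_msg` struct and `format_phi3` helper below are illustrative stand-ins, not the actual llama.cpp types or functions):

```cpp
#include <iostream>
#include <sstream>
#include <string>
#include <vector>

// Stand-in for a chat message (role + content).
struct chat_msg {
    std::string role;     // "system", "user" or "assistant"
    std::string content;
};

// Formats a conversation following the '\n' convention discussed above:
// each turn becomes "<|role|>\n<content><|end|>\n", and an open
// "<|assistant|>\n" is appended when the model should answer next.
static std::string format_phi3(const std::vector<chat_msg> & chat, bool add_assistant) {
    std::ostringstream ss;
    for (const auto & msg : chat) {
        ss << "<|" << msg.role << "|>\n" << msg.content << "<|end|>\n";
    }
    if (add_assistant) {
        ss << "<|assistant|>\n";
    }
    return ss.str();
}

int main() {
    const std::vector<chat_msg> chat = {
        {"system", "You are a helpful assistant."},
        {"user",   "Hello!"},
    };
    std::cout << format_phi3(chat, /*add_assistant=*/true);
    return 0;
}
```

Run as-is, this prints each turn as `<|role|>`, a newline, the content, then `<|end|>` and a newline, ending with an open `<|assistant|>` line for the model to continue.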
I'd agree that there is some bad design in the jinja templates (in general). Since `'\n'` is wrapped inside a JSON string, it should be written as `'\\n'`. But for some reason jinja doesn't care about this mistake (that's why it gets converted to a space in the GGUF viewer on Hugging Face). Unfortunately, this is not something we can fix on our side.
I updated the PR :)
@tristandruyen Please have a look at the complete list of phi-3 templates: https://gist.github.com/ngxson/38d8d89fb5d856187bcc70e78b1cc4f5
Thanks for the list.
It seems to me like these are 4 different ways to get basically the same result. There are some differences, but AFAIK they should not matter here (e.g. `tokenizer.ggml.add_bos_token`).
So the current C++ code seems to do the correct thing, in my understanding.
I think adding all variants as tests would be good; the current matching via `<|assistant|>` and `<|end|>` should work for all of them.
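As a tiny self-contained illustration of that matching heuristic (the template strings below are paraphrased stand-ins, not the exact jinja from the gist or the tokenizer_config.json files):

```cpp
#include <cassert>
#include <string>
#include <vector>

int main() {
    // Paraphrased stand-ins for the phi-3 template variants; the exact
    // jinja strings live in the linked tokenizer_config.json files.
    const std::vector<std::string> phi3_variants = {
        // role-generic variant (Phi-3-small style)
        "{% for m in messages %}{{'<|' + m['role'] + '|>' + '\\n' + m['content'] + '<|end|>' + '\\n'}}{% endfor %}"
        "{% if add_generation_prompt %}{{'<|assistant|>' + '\\n'}}{% endif %}",
        // per-role variant (Phi-3-mini style)
        "{{'<|user|>' + '\\n' + message['content'] + '<|end|>' + '\\n' + '<|assistant|>' + '\\n'}}",
    };

    // The matching discussed above: a template is treated as phi-3
    // when it contains both the <|assistant|> and <|end|> markers.
    for (const auto & tmpl : phi3_variants) {
        assert(tmpl.find("<|assistant|>") != std::string::npos);
        assert(tmpl.find("<|end|>")       != std::string::npos);
    }
    return 0;
}
```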