Implement non-greedy tokenizer that tries to maximize token lengths #242
Conversation
Although I haven't examined the code, I've tested it on several prompts and can already conclude that this patch allows Llama to write in French.
@@ -846,6 +846,7 @@ int main(int argc, char ** argv) {
     std::vector<float> logits;

     // tokenize the prompt
+    params.prompt.insert(0, 1, ' ');
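As an aside, here is a minimal C++ sketch of what "non-greedy, maximize token lengths" can mean: try the longest vocabulary match at each position and backtrack to shorter matches only if the rest of the input can't be tokenized. This is an illustration under assumptions, not the actual patch; the names tokenize_longest, vocab, and max_token_len are made-up stand-ins, not llama.cpp symbols.

#include <algorithm>
#include <string>
#include <unordered_map>
#include <vector>

// Try the longest vocabulary entry first at each position; backtrack to
// shorter ones only if the remainder of the text cannot be tokenized.
static bool tokenize_longest(const std::string & text, size_t pos,
                             const std::unordered_map<std::string, int> & vocab,
                             size_t max_token_len,
                             std::vector<int> & out) {
    if (pos == text.size()) {
        return true; // consumed the whole input
    }
    for (size_t len = std::min(max_token_len, text.size() - pos); len > 0; --len) {
        auto it = vocab.find(text.substr(pos, len));
        if (it == vocab.end()) {
            continue; // no vocab entry of this length at this position
        }
        out.push_back(it->second);
        if (tokenize_longest(text, pos + len, vocab, max_token_len, out)) {
            return true;
        }
        out.pop_back(); // backtrack: this choice left an untokenizable tail
    }
    return false;
}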
Is the space meant to be a separate token? I noticed that it often gets fused with the first user-provided token.
It should be fused with the first token! This is how the original Python llama code parses it.
I can dig out more details if you want.
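To illustrate the fusion (a toy example, not taken from the patch): with a SentencePiece-style vocabulary, where word-initial pieces are stored with a leading space, the inserted space lets the first prompt word match one of those entries and come out as a single token. This reuses the hypothetical tokenize_longest sketch above with a made-up vocabulary.

// Toy usage of the sketch above; the vocabulary values are invented.
std::unordered_map<std::string, int> vocab = {
    {" Hello", 1}, {" world", 2}, {"He", 3}, {"llo", 4}, {" ", 5},
};
std::string prompt = "Hello world";
prompt.insert(0, 1, ' '); // the line added in this patch

std::vector<int> ids;
tokenize_longest(prompt, 0, vocab, /*max_token_len=*/6, ids);
// ids == {1, 2}: the leading space fuses with "Hello" into the single
// token " Hello". Without the inserted space, "Hello" would fall back
// to the pieces "He" + "llo" (ids {3, 4, 2}).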
Merge it if the results look OK.
I won't be able to have a detailed look in the next few days.
…gml-org#242)

* Implement non-greedy tokenizer that tries to maximize token lengths
* Insert single space in front of the prompt - this is to match original llama tokenizer behavior

---------

Co-authored-by: Jakub Horak <[email protected]>
No description provided.