Fix v1/chat/completions Gibberish API Responses #41
Conversation
@keldenl any hints where one could find an unfiltered vicuna grazing? asking for a friend ...
hug.. some.. faces?
Thanks for the contribution! I'll try to address this in a more general way with #17 by allowing you to load multiple models and set defaults based on the specific model.
Also, I haven't tested out the vicuna model yet, but it looks very promising; I've found using alpaca for chat is less than ideal.
Vicuña has given me some good results. I've tweaked chat-ui (a ChatGPT clone using the OpenAI API) and been able to run the FastAPI server against it! The chat is pretty good, other than the slower generation due to the lack of a chat mode :/
@keldenl awesome, yeah now that the mac install bugs ...
lmk if i can help in parallel in any way 😀
Related to this - currently the completion prompt returns gibberish if the system prompt "You are a helpful assistant." is not set. It would be great if this could be omitted, similar to the actual OpenAI API. |
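For illustration, the kind of call this refers to might look like the following: a minimal sketch assuming a local llama-cpp-python server on port 8000 and the openai 0.x Python client (the model name and key are placeholders and are ignored by the local server):

```python
import openai

# Point the openai 0.x client at the local llama-cpp-python server.
openai.api_base = "http://localhost:8000/v1"
openai.api_key = "sk-dummy"  # placeholder; not validated locally

resp = openai.ChatCompletion.create(
    model="vicuna-13b",  # placeholder model name
    messages=[
        # No "system" message: ideally this should still yield sensible
        # output, as it does with the real OpenAI API.
        {"role": "user", "content": "What is the capital of France?"},
    ],
)
print(resp["choices"][0]["message"]["content"])
```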
Update? |
I think the issue is you now need to specify the chat_format correctly ... it won't guess anymore.
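For example, a minimal sketch with a recent llama-cpp-python, passing chat_format explicitly (the model path is illustrative):

```python
from llama_cpp import Llama

# Select the prompt template explicitly; the library no longer guesses.
llm = Llama(
    model_path="./models/vicuna-13b.Q4_K_M.gguf",  # illustrative path
    chat_format="vicuna",
)

out = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Name the planets in the solar system."},
    ],
)
print(out["choices"][0]["message"]["content"])
```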
Force-pushed from 8c93cf8 to cc0fe43
@earonesty correct, this is all handled correctly now by the chat format and chat handler APIs. |
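For anyone who needs a template the library doesn't ship, here is a sketch of registering a custom chat format, assuming the register_chat_format decorator and ChatFormatterResponse from llama_cpp.llama_chat_format (the "my-vicuna" name and the template itself are hypothetical):

```python
from llama_cpp.llama_chat_format import ChatFormatterResponse, register_chat_format

@register_chat_format("my-vicuna")
def format_my_vicuna(messages, **kwargs) -> ChatFormatterResponse:
    # Flatten OpenAI-style chat messages into one Vicuna-style prompt.
    prompt = ""
    for m in messages:
        if m["role"] == "system":
            prompt += m["content"] + "\n\n"
        elif m["role"] == "user":
            prompt += "USER: " + m["content"] + "\n"
        elif m["role"] == "assistant":
            prompt += "ASSISTANT: " + m["content"] + "\n"
    prompt += "ASSISTANT:"  # cue the model to answer as the assistant
    return ChatFormatterResponse(prompt=prompt, stop="USER:")
```

A model constructed with Llama(model_path=..., chat_format="my-vicuna") would then route chat completions through this formatter.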
The chat completion API in the FastAPI server wasn't doing a very consistent job of completing chat. The results were consistently gibberish (like `\nA\n/imagine prompt: User is asking about`, or just references to the system message in general), so I went ahead and tweaked the prompt (it was also weirdly formatted, which probably confused the text generation even more). Here it is before and after with the default example (running vicuna-13B unfiltered):
Before: prompt and results (screenshots omitted)
After: prompt and results (screenshots omitted)
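As a rough, hypothetical illustration of the kind of prompt change involved (the exact strings in this PR differ):

```python
# Illustrative only: a loosely structured "before" prompt that mixes
# scaffolding with the conversation, inviting the model to continue the
# scaffolding instead of chatting.
before_prompt = (
    "### Instructions: You are a helpful assistant.\n"
    "### Inputs: user: What is the capital of France?\n"
    "### Response:"
)

# Illustrative only: an "after" prompt with one consistent turn format
# and a trailing assistant cue for the model to complete.
after_prompt = (
    "You are a helpful assistant.\n"
    "### Human: What is the capital of France?\n"
    "### Assistant:"
)
```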
I also followed the general guidance on default parameters for chatting from https://www.reddit.com/r/LocalLLaMA/comments/11o6o3f/how_to_install_llama_8bit_and_4bit/ to help with the results.
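As a sketch of what chat-oriented defaults in that spirit might look like (the values are illustrative, not necessarily the ones merged here):

```python
from llama_cpp import Llama

llm = Llama(model_path="./models/vicuna-13b.Q4_K_M.gguf")  # illustrative path

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Hi there!"}],
    temperature=0.7,     # some variety without rambling
    top_p=0.9,           # nucleus sampling
    top_k=40,            # cap the candidate token pool
    repeat_penalty=1.1,  # discourage repetition loops
    max_tokens=256,
)
print(out["choices"][0]["message"]["content"])
```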
Also added some macOS-specific entries to .gitignore, which helps with contributing.
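For reference, the usual macOS-specific ignore entry looks like this (the exact lines added in the PR may differ):

```
# macOS Finder metadata
.DS_Store
```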