Fix v1/chat/completions Gibberish API Responses #41


Closed
wants to merge 3 commits

Conversation


@keldenl keldenl commented Apr 7, 2023

The chat completion API in the FastAPI server wasn't doing a very consistent job of completing chats. It regularly generated gibberish (like `\nA\n/imagine prompt: User is asking about`, or just references to the system message in general), so I went ahead and tweaked the prompt (it was also weirdly formatted, which probably confused the text generation even more).

Here it is before and after with the default example (running vicuna-13B unfiltered):

Before
Prompt

 

### Instructions:Complete the following chat conversation between the user and the assistant. System messages should be strictly followed as additional instructions.

### Inputs:system None: You are a helpful assistant.
user None: What is the capital of France?

### Response:
assistant: 

Results

{
  "id": "chatcmpl-8d9ce5a6-841d-4568-acbe-67ea9640954a",
  "object": "chat.completion",
  "created": 1680854923,
  "model": "../llama.cpp/models/vicuna/13B/ggml-vicuna-unfiltered-13b-4bit.bin",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "​\nA\n/imagine prompt: User is asking about the capital of France, Assistant should provide a clear and concise answer, perhaps mentioning some interesting facts about the city or its history. The response should be friendly and helpful, using positive language and encouraging further questions. It should also include some basic information about Paris, such as its location in the north of France, its famous landmarks or cultural attractions, or its population and history.\n\n"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 70,
    "completion_tokens": 98,
    "total_tokens": 168
  }
}

After
Prompt

### Instructions:
Complete the following chat conversation between the user and the assistant. System messages should be strictly followed as additional instructions.

system None: You are a helpful assistant.
user None: What is the capital of France?

### Response:
assistant:

Results

{
  "id": "chatcmpl-35a2850c-e9cd-445b-ad63-046cb98cb107",
  "object": "chat.completion",
  "created": 1680854743,
  "model": "../llama.cpp/models/vicuna/13B/ggml-vicuna-unfiltered-13b-4bit.bin",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": " The capital of France is Paris.\n"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 61,
    "completion_tokens": 12,
    "total_tokens": 73
  }
}
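The reformatted prompt above can be sketched as a small template function. This is a minimal illustration of the template shown in the "After" example, not the PR's actual code; `format_chat_prompt` is a hypothetical helper name.

```python
def format_chat_prompt(messages):
    """Build a completion prompt from OpenAI-style chat messages.

    Mirrors the reformatted template: instructions header, one line per
    message ("role name: content"), then a "### Response:" stub for the
    assistant to continue from.
    """
    instructions = (
        "### Instructions:\n"
        "Complete the following chat conversation between the user and the "
        "assistant. System messages should be strictly followed as "
        "additional instructions.\n"
    )
    # Messages without a "name" field render as e.g. "user None: ...",
    # matching the prompts shown above.
    chat = "\n".join(
        f"{m['role']} {m.get('name')}: {m['content']}" for m in messages
    )
    return f"{instructions}\n{chat}\n\n### Response:\nassistant:"


prompt = format_chat_prompt([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is the capital of France?"},
])
print(prompt)
```

Keeping the header on its own line and leaving no stray text between the instructions and the chat transcript is the core of the fix: the model sees a clean instruction/input/response structure instead of everything mashed onto one line.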

I also followed the general guidance on default sampling parameters for chatting in https://www.reddit.com/r/LocalLLaMA/comments/11o6o3f/how_to_install_llama_8bit_and_4bit/ to further improve results.

I also added some macOS-specific entries to .gitignore, which helps with contributing.

@keldenl keldenl changed the title from Improve chat completion api to Fix v1/chat/completions Gibberish API Responses (over several revisions) Apr 7, 2023

jmtatsch commented Apr 7, 2023

@keldenl any hints where one could find an unfiltered vicuna grazing? asking for a friend ...


keldenl commented Apr 7, 2023

> @keldenl any hints where one could find an unfiltered vicuna grazing? asking for a friend ...

hug.. some.. faces?


abetlen commented Apr 8, 2023

Thanks for the contribution! I'll try to address this in a more general way with #17 by allowing you to load multiple models and set defaults based on the specific model.


abetlen commented Apr 8, 2023

Also, I haven't tested out the vicuna model yet, but it looks very promising; I've found using alpaca for chat is less than ideal.


keldenl commented Apr 8, 2023

Vicuna has given me some good results. I've tweaked chat-ui (a ChatGPT clone using the OpenAI API) and been able to run the FastAPI server against it! The chat is pretty good other than the slower generation due to the lack of a chat mode :/


abetlen commented Apr 8, 2023

@keldenl awesome! Yeah, now that the Mac install bugs are fixed, improving chat speed is definitely next on my list.


keldenl commented Apr 8, 2023

lmk if i can help in parallel in any way 😀


Niek commented Apr 12, 2023

Related to this: currently the completion prompt returns gibberish if the system prompt "You are a helpful assistant." is not set. It would be great if the system prompt could be omitted entirely, like in the actual OpenAI API.
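The suggested behavior could be sketched as only emitting a system line when the caller actually supplies one, rather than assuming a fixed default. This is a hypothetical illustration (`render_system` is not code from this repo):

```python
def render_system(messages):
    """Return the system portion of the prompt, or "" if no system
    message was supplied (mirroring the OpenAI API, where the system
    role is optional)."""
    system = [m["content"] for m in messages if m["role"] == "system"]
    if not system:
        return ""  # omit the section entirely instead of injecting a default
    return "\n".join(system) + "\n"


with_system = render_system([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hi"},
])
without_system = render_system([{"role": "user", "content": "Hi"}])
```

The key design choice is that an absent system message produces no text at all, so the model never sees a dangling empty instruction section.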

@gjmulder

Update?

@gjmulder gjmulder added the bug Something isn't working label May 23, 2023
@earonesty

I think the issue is you now need to specify the chat_format correctly ... it won't guess anymore.
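In newer llama-cpp-python versions the prompt template is selected by an explicit chat format rather than guessed. A sketch of what starting the server with an explicit format might look like (flag and value shown here are assumptions based on the server's settings; check the project docs for the exact names, and the model path is a placeholder):

```shell
# Hypothetical invocation: pass the chat format explicitly so the server
# uses the matching prompt template instead of a generic one.
python3 -m llama_cpp.server \
  --model ./models/vicuna-13b.gguf \
  --chat_format vicuna
```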

@abetlen abetlen force-pushed the main branch 2 times, most recently from 8c93cf8 to cc0fe43 Compare November 14, 2023 20:24

abetlen commented Nov 21, 2023

@earonesty correct, this is all handled correctly now by the chat format and chat handler APIs.

@abetlen abetlen closed this Nov 21, 2023
Labels
bug Something isn't working

6 participants