
One chat prompt/template should be customizable at runtime - the prompt I need atm #816

pervrosen opened this issue Oct 12, 2023 · 7 comments
Labels: documentation (Improvements or additions to documentation)

Comments

@pervrosen

Is your feature request related to a problem? Please describe.
Since LLMs and their prompt formats are evolving so quickly, we need a way to specify a prompt at runtime, not only in code, e.g. mistral -> mistral+orca -> mistral-zephyr within a week.

Describe the solution you'd like
An endpoint to override a prompt, or add a runtime prompt, for the model I am currently evaluating.

Describe alternatives you've considered
Submitting a pull request for each and every LLM I encounter, OR polling the chat templating feature documented at huggingface.co/docs/transformers/main/en/chat_templating for the model in question, but that introduces one more dependency.
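
For reference, the transformers route mentioned above looks roughly like this (the model id is only an example; this is the extra dependency in question):

```python
# Pulls the chat template shipped with the model itself via transformers.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.1")
prompt = tokenizer.apply_chat_template(
    [{"role": "user", "content": "Hello!"}],
    tokenize=False,
    add_generation_prompt=True,
)
```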

Additional context
An example: I run everything in a Docker/k8s setup. I call an endpoint to specify the prompt, if it is not already specified in code. That populates a runtime version of

```python
from typing import Any, List

# Helper imports match llama_cpp.llama_chat_format, where this snippet originates.
from llama_cpp import llama_types
from llama_cpp.llama_chat_format import (
    ChatFormatterResponse,
    _format_no_colon_single,
    _get_system_message,
    _map_roles,
    register_chat_format,
)


@register_chat_format("mistral")
def format_mistral(
    messages: List[llama_types.ChatCompletionRequestMessage],
    **kwargs: Any,
) -> ChatFormatterResponse:
    _roles = dict(user="[INST] ", assistant="[/INST]")
    _sep = " "
    system_template = """{system_message}"""
    system_message = _get_system_message(messages)
    system_message = system_template.format(system_message=system_message)
    _messages = _map_roles(messages, _roles)
    _messages.append((_roles["assistant"], None))
    _prompt = _format_no_colon_single(system_message, _messages, _sep)
    return ChatFormatterResponse(prompt=_prompt)
```

that I can use for calling the model.
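
A rough sketch of what such an endpoint could look like (the route, payload fields, and handler below are invented for illustration; the llama_cpp helpers are the same ones used by the snippet above):

```python
# Hypothetical sketch only: the endpoint path and payload shape are made up.
# It registers a new chat format at runtime with the same machinery the
# built-in formats use.
from fastapi import FastAPI
from pydantic import BaseModel

from llama_cpp.llama_chat_format import (
    ChatFormatterResponse,
    _format_no_colon_single,
    _get_system_message,
    _map_roles,
    register_chat_format,
)

app = FastAPI()


class ChatFormatSpec(BaseModel):
    name: str              # e.g. "mistral-zephyr"
    user_prefix: str       # e.g. "[INST] "
    assistant_prefix: str  # e.g. "[/INST]"
    separator: str = " "


@app.post("/v1/internal/chat-format")
def set_chat_format(spec: ChatFormatSpec):
    # Registering under spec.name makes the format selectable afterwards,
    # exactly as if it had been defined in code.
    @register_chat_format(spec.name)
    def _runtime_format(messages, **kwargs):
        _roles = dict(user=spec.user_prefix, assistant=spec.assistant_prefix)
        system_message = _get_system_message(messages)
        _messages = _map_roles(messages, _roles)
        _messages.append((_roles["assistant"], None))
        prompt = _format_no_colon_single(system_message, _messages, spec.separator)
        return ChatFormatterResponse(prompt=prompt)

    return {"registered": spec.name}
```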

@tranhoangnguyen03

tranhoangnguyen03 commented Oct 15, 2023

What if multiple users are using the server? Does it mean everyone gets to modify the same shared runtime prompt template, or does the server have to keep a different prompt template for each user?
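
If it is the latter, the server would need something along these lines (illustrative sketch, not existing code):

```python
# Illustrative only: a per-caller template registry with a server-wide default.
from typing import Dict

DEFAULT_CHAT_FORMAT = "llama-2"
_user_chat_formats: Dict[str, str] = {}  # api_key -> chat format name


def chat_format_for(api_key: str) -> str:
    # Fall back to the shared default when the caller set nothing.
    return _user_chat_formats.get(api_key, DEFAULT_CHAT_FORMAT)


def set_chat_format_for(api_key: str, name: str) -> None:
    _user_chat_formats[api_key] = name
```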

@pervrosen
Author

If that is a problem, perhaps one could limit usage with a feature flag. That would make sense, especially since I suspect that by the time a model hits production a static version would have been implemented. Do you agree?

@tranhoangnguyen03

Looks like you can build upon or contribute to this similar effort:
#809

@tranhoangnguyen03

@pervrosen do you know what the behavior of --chat_format is on the /v1/completions endpoint right now? It defaults to "llama-2". Does that mean the llama-2 prompt format is automatically applied to every prompt sent to the endpoint?

@pervrosen
Author

If you search for chat_format in the repo you will find that some 8-10 models have prompt templates implemented. Hope the image I attached shows what I mean. Regards

[Screenshot: the chat_format implementations found in the repo]
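
For reference, one of those template names is picked once at server startup via that flag (the model path here is just an example):

```
python3 -m llama_cpp.server --model ./models/mistral-7b-instruct.Q4_K_M.gguf --chat_format mistral
```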

@earonesty
Contributor

earonesty commented Nov 3, 2023

Feels like the "chat_format" parameter could just be a Jinja template; then just execute it. The user can specify whatever.

For example, this is "mistral openorca":

"{% if not add_generation_prompt is defined %}{% set add_generation_prompt = false %}{% endif %}{% for message in messages %}{{'<\|im_start\|>' + message['role'] + '\n' + message['content'] + '<\|im_end\|>' + '\n'}}{% endfor %}{% if add_generation_prompt %}{{ '<\|im_start\|>assistant\n' }}{% endif %}",

If everyone is cool with that I will add it (I will detect a Jinja template as the chat format param, and execute it ... while still supporting the Python registered formats as is).

Then we can add that to GGUF files in the metadata (absorb it during convert.py), and be done with having to think about templates!
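
A minimal sketch of that detection idea (the heuristic and function names are illustrative, not the eventual implementation):

```python
# Illustrative sketch: treat a chat_format value containing Jinja delimiters
# as a template and render it; otherwise fall back to the registered formats.
import jinja2


def looks_like_jinja(chat_format: str) -> bool:
    # Registered format names ("llama-2", "mistral", ...) never contain these.
    return "{%" in chat_format or "{{" in chat_format


def render_chat_template(chat_format: str, messages, add_generation_prompt=True) -> str:
    template = jinja2.Template(chat_format)
    return template.render(
        messages=messages, add_generation_prompt=add_generation_prompt
    )
```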

@pervrosen
Author

@earonesty I think this is exactly what they are discussing in #809 (link in an earlier comment). Perhaps you can contribute on the choice of architectural pattern that seems to be under debate there.

@abetlen added the documentation (Improvements or additions to documentation) label on Dec 22, 2023