[Misc] Logits processor plugins #4769
Conversation
I added some documentation about this feature :)
This looks cool - a distribution mechanism for logit processors. When #4775 gets merged, this PR would need to be updated to support the more generic interface.
I am very much in favor of this approach. A few months ago I tried to get a similar concept into huggingface-tgi:
I like this idea. And I agree with @mmoskal that it would be important to support the more involved API being worked on in #4775. I wonder, though, how one would implement support for OpenAI API tool use if guided decoding were to be provided by such a plugin. The code in the OpenAI server depends on the guided decoding backend and will need to know how to transform the OpenAI-API-conformant parameters into valid guided decoding parameters (c.f. #4656). Supporting the OpenAI API as thoroughly as possible is very valuable and should not be sacrificed for software-architectural reasons. So we can either define guided decoding as a core vLLM feature that is out of scope for logit-processor plugins, or we can think about making the frontend part necessary to "correctly" use the plugins pluggable as well. The latter would be a challenging endeavor.
Thank you for the feedback, everyone. Regarding @br3no's response: it's a good point. I believe that as a first step it makes sense to keep the guided decoding code as core vLLM logic, all the more so as it's already implemented this way. I will think about how it could be implemented as plugins while still allowing tool calling, but I believe this pull request is valuable either way :)
@DarkLight1337 @simon-mo |
Just to mention here that to properly support stateful logits processors, we are proposing to change the API to take logits processor factories rather than logits processor instances, see #5329.
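For illustration, the factory-versus-instance distinction could look roughly like the sketch below. This is an assumption about the proposal's shape, not the actual #5329 API; names are hypothetical, and plain Python floats stand in for torch tensors to keep it self-contained.

```python
# Hedged sketch: a stateful logits processor created per request by a
# factory, so each request gets fresh state. Signatures are hypothetical.
from typing import Callable, List

# Assumed callable shape: (generated token ids, next-token logits) -> logits.
LogitsProcessor = Callable[[List[int], List[float]], List[float]]


def make_ban_repeats_processor() -> LogitsProcessor:
    """Factory returning a fresh stateful processor for each request."""
    seen: set = set()  # per-request state; a shared instance would leak this

    def processor(token_ids: List[int], logits: List[float]) -> List[float]:
        # Remember every token generated so far and suppress repeats.
        seen.update(token_ids)
        return [float("-inf") if i in seen else logit
                for i, logit in enumerate(logits)]

    return processor
```

Passing the factory rather than an instance means two concurrent requests cannot accidentally share the `seen` state.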
This pull request has been automatically marked as stale because it has not had any activity within 90 days. It will be automatically closed if no further activity occurs within 30 days. Leave a comment if you feel this pull request should remain open. Thank you!
This pull request has merge conflicts that must be resolved before it can be merged.
Closing as stale. If you plan to continue this work, feel free to re-open. |
This pull request adds support for logits processor plugins.
This makes implementing custom logits processors very easy and eliminates the need to change vLLM itself to add one.
For example, with this merge request all of the guided decoding features could be implemented as a Python package installed in the same virtualenv as vLLM, without touching vLLM's source code.
Example code for a logits processor plugin that, given a token id, multiplies its logit by 100:
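(The original example code did not survive in this copy; the following is a minimal sketch of what such a plugin might look like. The plugin-dictionary shape and the `parameters_model` mechanism are assumptions based on the description, and plain Python floats stand in for a torch tensor to keep the sketch self-contained.)

```python
# Hedged sketch of a logits processor plugin. vLLM logits processors are
# callables taking the generated token ids and the next-token logits; the
# surrounding plugin structure here is assumed, not the PR's actual API.
from dataclasses import dataclass
from typing import List


@dataclass
class BoostTokenParams:
    # Stand-in for the plugin's `parameters_model`, used to validate and
    # parse the request body (the PR may well use a pydantic model here).
    token_id: int
    factor: float = 100.0


class BoostTokenLogitsProcessor:
    """Multiplies the logit of a given token id by `factor` (100 by default)."""

    def __init__(self, params: BoostTokenParams):
        self.params = params

    def __call__(self, token_ids: List[int], logits: List[float]) -> List[float]:
        logits = list(logits)  # avoid mutating the caller's logits
        logits[self.params.token_id] *= self.params.factor
        return logits


# Hypothetical plugin dictionary: a name for requests to reference, the
# parameters model, and the processor class itself.
LOGITS_PROCESSOR_PLUGIN = {
    "name": "boost_token",
    "parameters_model": BoostTokenParams,
    "logits_processor": BoostTokenLogitsProcessor,
}
```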
And the `setup.py` file for the package should look something like this:

With this merge request vLLM will load all the plugins at startup, and each inference request can specify usage of custom logits processors using the `logits_processors` field in the request body. The `parameters_model` in the plugin dictionary is used to validate and parse the request body.

I will soon add to this pull request a page in the documentation explaining how to implement custom logits processors.
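(The `setup.py` example referenced above is also missing from this copy. One plausible shape, registering the plugin via a setuptools entry point, is sketched below; the entry-point group name `vllm.logits_processors` and the module/package names are assumptions, not a confirmed vLLM API.)

```python
# Hypothetical setup.py for the plugin package. The entry-point group is an
# assumed discovery mechanism that vLLM would scan at startup.
from setuptools import setup

setup(
    name="vllm-boost-token-plugin",
    version="0.1.0",
    py_modules=["boost_token_plugin"],
    entry_points={
        "vllm.logits_processors": [
            "boost_token = boost_token_plugin:LOGITS_PROCESSOR_PLUGIN",
        ],
    },
)
```

After `pip install`-ing such a package into the same virtualenv as vLLM, the plugin would be discoverable without any change to vLLM's source.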