This repository was archived by the owner on Jun 24, 2024. It is now read-only.

Add classifier-free guidance #377

Open
philpax opened this issue Jul 17, 2023 · 2 comments
Labels
issue:enhancement New feature or request

Comments

@philpax
Collaborator

philpax commented Jul 17, 2023

llama.cpp has recently developed support for CFG:

We should mirror this support. I'm not sure how well it will apply to the other models; I haven't investigated too deeply into this.

@philpax philpax added the issue:enhancement New feature or request label Jul 17, 2023
@Vermeille

Make sure to remove the smooth_factor and the final log_softmax so that this stays consistent with llama.cpp's and HF's implementations (ggml-org/llama.cpp#2280).

@KerfuffleV2
Contributor

I'm going to look at how to add this to llm-samplers. The sampler will need the CFG logits though, so llm will have to produce those itself. I guess they can be supplied as a sampler resource, similar to the RNG and last tokens. I'd like to figure out a more general way to handle resources, but in the worst case I can just add another type of resource to that trait.
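
To make the "guidance logits as a sampler resource" idea concrete, here is a rough sketch. Everything in it is hypothetical: `SamplerResources`, `GuidedResources`, `with_guidance_logits` and `guide_logits` are illustration-only names and not the actual llm-samplers API. The mixing step is also simplified (no normalization), since the point is only how a sampler could borrow the logits from the caller.

```rust
/// Hypothetical resource trait for illustration; NOT the real llm-samplers trait.
trait SamplerResources {
    /// Run `f` with the most recently generated tokens.
    fn with_last_tokens(&self, f: &mut dyn FnMut(&[u32]));

    /// Run `f` with the guidance logits, if the caller supplied any.
    /// Returns `false` when no guidance logits are available (the default).
    fn with_guidance_logits(&self, f: &mut dyn FnMut(&[f32])) -> bool {
        let _ = f;
        false
    }
}

/// Resources owned by the caller (the role llm would play in this sketch).
struct GuidedResources {
    last_tokens: Vec<u32>,
    guidance_logits: Vec<f32>,
}

impl SamplerResources for GuidedResources {
    fn with_last_tokens(&self, f: &mut dyn FnMut(&[u32])) {
        f(&self.last_tokens)
    }

    fn with_guidance_logits(&self, f: &mut dyn FnMut(&[f32])) -> bool {
        f(&self.guidance_logits);
        true
    }
}

/// A sampler step that pulls the guidance logits out of the resources and
/// mixes them into `logits` in place. Returns `false` if nothing was applied.
fn guide_logits(res: &dyn SamplerResources, logits: &mut [f32], scale: f32) -> bool {
    res.with_guidance_logits(&mut |guidance: &[f32]| {
        for (l, &g) in logits.iter_mut().zip(guidance) {
            // Simplified mix; normalization is omitted here for brevity.
            *l = g + scale * (*l - g);
        }
    })
}
```

With a design like this, a sampler that does not care about CFG never touches the guidance logits, and a resource provider that has none simply inherits the default method.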

llama.cpp CFG sampler for reference (doesn't look too complicated): https://github.com/ggerganov/llama.cpp/blob/b19edd54d51cef5e3616c18b1d0d8626895b2cba/llama.cpp#L2709-L2740
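
As a minimal sketch of the mixing step in Rust, assuming the variant suggested earlier in this thread (no smooth_factor and no final log_softmax): both sets of logits are normalized with log-softmax, then the result is pushed away from the guidance distribution by `scale`. The names `log_softmax` and `apply_cfg` are mine, not from llm or llama.cpp, and this is my reading of the linked sampler rather than a line-by-line port.

```rust
/// Numerically stable log-softmax over a slice of logits.
fn log_softmax(logits: &[f32]) -> Vec<f32> {
    let max = logits.iter().copied().fold(f32::NEG_INFINITY, f32::max);
    let log_sum_exp = logits.iter().map(|&l| (l - max).exp()).sum::<f32>().ln();
    logits.iter().map(|&l| l - max - log_sum_exp).collect()
}

/// Mix main-context logits with guidance-context logits.
///
/// `scale == 1.0` reproduces the (normalized) main logits; larger values
/// move the result further away from the guidance distribution.
fn apply_cfg(logits: &[f32], guidance_logits: &[f32], scale: f32) -> Vec<f32> {
    assert_eq!(logits.len(), guidance_logits.len(), "vocab size mismatch");
    let cond = log_softmax(logits);
    let uncond = log_softmax(guidance_logits);
    cond.iter()
        .zip(&uncond)
        .map(|(&c, &u)| u + scale * (c - u))
        .collect()
}
```

The output is then handed to whatever samplers run next (top-k, temperature, and so on) in place of the raw logits.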

On the llm side, it looks like you have to maintain a guidance context and run the model for both contexts every token, so using CFG makes evaluation about twice as slow (also, I think you need two K/V caches). Main relevant sections from llama.cpp's main example: https://github.com/ggerganov/llama.cpp/blob/b19edd54d51cef5e3616c18b1d0d8626895b2cba/examples/main/main.cpp#L208-L215 and https://github.com/ggerganov/llama.cpp/blob/b19edd54d51cef5e3616c18b1d0d8626895b2cba/examples/main/main.cpp#L484-L523
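
A rough sketch of that bookkeeping, with a lot of assumptions: `Context` is a hypothetical stand-in for an inference session (its token history plays the role of the K/V cache), `eval` is a placeholder for a model forward pass, and `mix` is whatever CFG combination ends up being used. None of these are llm APIs. As far as I can tell from the linked main.cpp sections, the sampled token is fed to both contexts, which is what the loop below does.

```rust
/// Hypothetical stand-in for an inference session. In a real implementation
/// each context would own its own K/V cache; here the token history plays that role.
struct Context {
    tokens: Vec<u32>,
}

/// Greedy generation with a guidance context.
///
/// `eval` stands in for a forward pass returning next-token logits, and `mix`
/// combines (main logits, guidance logits) into guided logits.
/// Returns the number of model evaluations performed.
fn generate_with_guidance(
    main: &mut Context,
    guidance: &mut Context,
    steps: usize,
    mut eval: impl FnMut(&Context) -> Vec<f32>,
    mix: impl Fn(&[f32], &[f32]) -> Vec<f32>,
) -> usize {
    let mut evals = 0;
    for _ in 0..steps {
        // Two forward passes per generated token: this is where the ~2x cost comes from.
        let main_logits = eval(&*main);
        let guidance_logits = eval(&*guidance);
        evals += 2;

        let guided = mix(&main_logits, &guidance_logits);

        // Greedy pick, just to keep the sketch deterministic.
        let next = guided
            .iter()
            .enumerate()
            .max_by(|a, b| a.1.total_cmp(b.1))
            .map(|(i, _)| i as u32)
            .expect("model returned no logits");

        // The same sampled token extends both contexts.
        main.tokens.push(next);
        guidance.tokens.push(next);
    }
    evals
}
```

The two contexts start from different prompts (the real prompt versus the guidance prompt) but grow in lockstep afterwards, so their lengths differ by a constant offset.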
