Closed
Labels: good first issue (Good for newcomers), performance (Speed related topics)
Description
Add an example implementing the "Prompt Lookup Decoding" technique:
https://github.com/apoorvumang/prompt-lookup-decoding
This should be a great exercise for people looking to become familiar with llama.cpp's KV cache management and batched decoding API. Looking for contributions.
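For anyone picking this up, here is a rough sketch of the drafting half of the technique as I understand it: take the last few generated tokens, look for the same n-gram earlier in the sequence, and propose the tokens that followed it as a draft. It is plain Python over token ids rather than llama.cpp code. The function name `find_draft_tokens`, the parameters `max_ngram` and `num_draft`, and the choice to prefer the most recent match are all assumptions made for illustration, not taken from the linked repository.

```python
from typing import List, Sequence


def find_draft_tokens(tokens: Sequence[int],
                      max_ngram: int = 3,
                      num_draft: int = 10) -> List[int]:
    """Propose draft tokens by matching the current suffix against earlier text.

    Hypothetical helper: names and defaults are illustrative assumptions.
    """
    n_tokens = len(tokens)
    # Try longer n-grams first: a longer match is a stronger hint.
    for ngram in range(min(max_ngram, n_tokens - 1), 0, -1):
        suffix = list(tokens[n_tokens - ngram:])
        # Scan earlier start positions, most recent first.
        # The last start index is excluded so the suffix never matches itself.
        for start in range(n_tokens - ngram - 1, -1, -1):
            if list(tokens[start:start + ngram]) == suffix:
                # The tokens that followed the earlier occurrence become the draft.
                return list(tokens[start + ngram:start + ngram + num_draft])
    # No earlier occurrence of any suffix n-gram: nothing to draft.
    return []


if __name__ == "__main__":
    history = [1, 2, 3, 4, 5, 1, 2, 3]
    print(find_draft_tokens(history, max_ngram=3, num_draft=2))
```

An empty result simply means the next step falls back to ordinary one-token decoding.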
The following examples can be used as starting points:
speculative
lookahead
batched
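The other half is verifying a draft against the model, which is where the batched decoding and KV cache handling come in. The sketch below shows only the greedy acceptance rule, assuming greedy sampling: keep draft tokens while they agree with the model, and replace the first mismatch with the model's own token. `accept_draft` and the `next_token` callback are hypothetical stand-ins, and the model is queried one position at a time for readability. A real implementation would presumably score all draft positions in a single batch and then drop the cache entries for rejected tokens.

```python
from typing import Callable, List, Sequence


def accept_draft(context: Sequence[int],
                 draft: Sequence[int],
                 next_token: Callable[[Sequence[int]], int]) -> List[int]:
    """Greedy verification of a draft (hypothetical helper, illustrative only).

    Returns the tokens to append to the context; always at least one token.
    """
    accepted: List[int] = []
    for tok in draft:
        # What the model itself would emit at this position.
        target = next_token(list(context) + accepted)
        if tok != target:
            # First mismatch: keep the model's token and discard the rest.
            accepted.append(target)
            return accepted
        accepted.append(tok)
    # Every draft token matched; the final position yields one extra token.
    accepted.append(next_token(list(context) + accepted))
    return accepted


if __name__ == "__main__":
    # Toy stand-in for a model: always predicts last token + 1 (mod 10).
    toy_model = lambda seq: (seq[-1] + 1) % 10
    print(accept_draft([1, 2, 3], [4, 5, 9], toy_model))
```

With this rule the output is identical to plain greedy decoding; the draft only changes how many tokens are confirmed per step.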