How to disable prompt cache? #1608

mscheong01 · 2024-07-19T03:45:58Z

Hi, I'm running some offline inference benchmarks using llama-cpp-python, and the prompt cache that was implemented here (#158) is getting in the way of measuring prompt evaluation time. Is there an option to disable it?

mscheong01 · 2024-07-19T04:54:35Z

I was able to do this by calling model.reset()

mscheong01 closed this as completed Jul 19, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

How to disable prompt cache? #1608

How to disable prompt cache? #1608

mscheong01 commented Jul 19, 2024

mscheong01 commented Jul 19, 2024

How to disable prompt cache? #1608

How to disable prompt cache? #1608

Comments

mscheong01 commented Jul 19, 2024

mscheong01 commented Jul 19, 2024