The default values for these parameters are set to 10000 and 1:

https://github.com/abetlen/llama-cpp-python/blob/a72efc77de29e5a0b551c2321e57ab68f79264cc/llama_cpp/llama.py#L232-L233

To use the model's own default values, they should be set to zero instead. This is the relevant code in llama.cpp:

https://github.com/ggerganov/llama.cpp/blob/2777a84be429401a2b7d33c2b6a4ada1f0776f1b/llama.cpp#L6699-L6701

Setting an incorrect value may result in poor generation quality in models that use a different value for these parameters, such as CodeLlama.
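
To illustrate the behavior described above, here is a minimal Python sketch of the fallback logic, assuming the linked llama.cpp lines treat a value of zero as "use the value stored in the model". The helper name `resolve_rope_param` and the CodeLlama base frequency of 1e6 are assumptions for illustration and are not taken from either code base.

```python
def resolve_rope_param(requested: float, trained: float) -> float:
    """Return the model's trained value when the caller passes 0,
    otherwise return the caller's value unchanged.

    Hypothetical helper: a sketch of the zero-means-model-default
    rule, not the actual llama.cpp implementation.
    """
    return trained if requested == 0.0 else requested


# Assumed value for illustration: CodeLlama is commonly reported to be
# trained with a RoPE base frequency of 1e6 instead of 10000.
CODELLAMA_TRAINED_BASE = 1_000_000.0

# With the current wrapper default (10000), the model's value is overridden.
overridden = resolve_rope_param(10000.0, CODELLAMA_TRAINED_BASE)

# With a default of 0, the value trained into the model is used.
from_model = resolve_rope_param(0.0, CODELLAMA_TRAINED_BASE)

print(f"wrapper default 10000 -> effective base {overridden}")
print(f"wrapper default 0     -> effective base {from_model}")
```

Until the defaults change, a workaround on the Python side would presumably be to pass zero explicitly when constructing the model, for example `Llama(model_path=..., rope_freq_base=0.0, rope_freq_scale=0.0)`, assuming those are the keyword names at the linked lines.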