Releases · l3utterfly/llama.cpp
b6029
b5891
llama : add jinja template for rwkv-world (#14665)

* llama : add jinja template for rwkv-world
* Update convert_hf_to_gguf.py

Signed-off-by: Molly Sophia <[email protected]>
Co-authored-by: Sigbjørn Skjæret <[email protected]>
b5871
readme : add hot PRs (#14636)

* readme : add hot PRs
* cont
* readme : update title
* readme : hot PRs links
* cont
b5835
vulkan: increase LOAD_VEC_A to 8 (IQ1/IQ2) or 4 (IQ3) (#14485)

Commit taken from remyoudompheng's PR https://github.com/ggml-org/llama.cpp/pull/12260

Co-authored-by: Rémy Oudompheng <[email protected]>
b5581
opencl: add `backend_synchronize` (#13939)

* This is not needed for normal use, where the result is read with `tensor_get`, but it lets the perf mode of `test-backend-ops` measure performance properly.
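For context, a backend's synchronize callback blocks until all work queued on the device has finished, which is what lets a timing harness measure actual kernel execution rather than just enqueue time. Below is a minimal sketch of such a callback for an OpenCL backend, assuming a hypothetical context struct that holds the command queue; it is an illustration, not the code from this PR.

```cpp
#include <CL/cl.h>

// Hypothetical context struct holding the backend's OpenCL command queue.
struct opencl_backend_context {
    cl_command_queue queue;
};

// Sketch of a synchronize callback: block until every command enqueued on the
// device has completed, so a perf harness (e.g. test-backend-ops in perf mode)
// times real execution instead of returning right after submission.
static void opencl_backend_synchronize(opencl_backend_context * ctx) {
    clFinish(ctx->queue); // clFinish blocks until the queue has drained
}
```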
b5416
CANN: Support MOE Model MUL_MAT_ID (#13042)

Signed-off-by: noemotiovon <[email protected]>
b5158
Disable CI cross-compile builds (#13022)
b5061
musa: fix compilation warnings in mp_22/31 (#12780)

Signed-off-by: Xiaodong Ye <[email protected]>
b4959
convert: fix Mistral3/Gemma3 model hparams init (#12571)

* Fix Mistral3/Gemma3 model hparams init
* set positional args correctly
* use existing hparams if passed
b4913
SYCL: using graphs is configurable by environment variable and compil…
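As a general illustration of the environment-variable side of such a toggle (this entry does not spell out the actual variable name or compile option, so the name below is an assumption), the check usually reads the variable once and falls back to a default:

```cpp
#include <cstdlib>
#include <cstring>

// Illustrative run-time toggle; GGML_SYCL_DISABLE_GRAPH is an assumed name,
// not necessarily the variable introduced by this change.
static bool sycl_graphs_enabled() {
    const char * v = std::getenv("GGML_SYCL_DISABLE_GRAPH");
    // Graphs stay enabled unless the variable is set to something other than "0".
    return v == nullptr || std::strcmp(v, "0") == 0;
}
```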