Name and Version
version: 4960 (fd7855f)
built with cc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0 for x86_64-linux-gnu
Operating systems
Linux
Which llama.cpp modules do you know to be affected?
llama-server
Command line
llama-server -m qwen2-0_5b-instruct-q4_k_m.gguf -c 8192 -b 512 -np 1 --lora Qwen2-0.5B-Instruct-ru-lora.gguf
Problem description & steps to reproduce
The server crashes with a segmentation fault when loading a LoRA adapter on the latest master, running on CPU (AVX2).
Model: qwen2-0_5b-instruct-q4_k_m.gguf
Lora: Qwen2-0.5B-Instruct-ru-lora.gguf
I bisected the issue to commit 3d82dbc; reverting that commit fixes the problem.
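From the trace below, the fault occurs at `tensor_traits->repack(tensor, data, size)` inside `ggml_backend_cpu_aarch64_buffer_set_tensor` while copying LoRA tensors into the `CPU_AARCH64` buffer. One plausible explanation (an assumption on my part, not confirmed from the trace alone) is that the per-type traits lookup yields a null/missing entry for a LoRA tensor type that the repacking path does not cover, so the call dereferences an invalid pointer. The names below (`tensor_traits`, `traits_by_type`, `set_tensor_guarded`) are hypothetical stand-ins, not the actual ggml symbols; this is a minimal sketch of the guarded pattern, not the upstream fix:

```cpp
#include <cstddef>
#include <cstring>
#include <map>
#include <string>

// Hypothetical stand-in for the per-type repacking traits that the
// CPU_AARCH64 buffer consults when storing tensor data.
struct tensor_traits {
    // Returns 0 on success, mirroring the `auto OK = ...->repack(...)` call.
    int repack(char * dst, const void * data, size_t size) {
        std::memcpy(dst, data, size); // real code would reorder the quant blocks
        return 0;
    }
};

// Only some tensor types have a repacking implementation; a LoRA tensor
// whose type is absent here models the suspected failure mode.
static std::map<std::string, tensor_traits> traits_by_type = {
    {"q4_K", tensor_traits{}},
    // "f32" intentionally absent: stands in for an uncovered LoRA tensor type
};

// Guarded set_tensor path: fall back to a plain copy instead of calling
// repack() through a missing (null) traits entry.
int set_tensor_guarded(const std::string & type, char * dst,
                       const void * data, size_t size) {
    auto it = traits_by_type.find(type);
    if (it == traits_by_type.end()) {
        std::memcpy(dst, data, size); // no repacking available for this type
        return 0;
    }
    return it->second.repack(dst, data, size);
}
```

If this hypothesis is right, the actual fix would more likely be to keep LoRA tensors out of the `CPU_AARCH64` repacking buffer entirely rather than to guard the call site, but that is for the maintainers to confirm.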
First Bad Commit
3d82dbc
Relevant log output
Stacktrace:
llama_adapter_lora_init_impl: loading lora adapter from '/home/benuix/.config/LLMUnity/models/Qwen2-0.5B-Instruct-ru-lora.gguf' ...
llama_adapter_lora_init_impl: CPU_Mapped LoRA buffer size = 14.67 MiB
llama_adapter_lora_init_impl: CPU_AARCH64 LoRA buffer size = 2.11 MiB
Thread 1 "llama-server" received signal SIGSEGV, Segmentation fault.
0x00007ffff7ecc32a in ggml_backend_cpu_aarch64_buffer_set_tensor (buffer=0x5555581abd90, tensor=0x5555564b2700, data=0x5555564d0150, offset=0, size=155648) at /home/benuix/codes/llama.cpp/ggml/src/ggml-cpu/ggml-cpu-aarch64.cpp:5632
5632 auto OK = tensor_traits->repack(tensor, data, size);
(gdb) where
#0 0x00007ffff7ecc32a in ggml_backend_cpu_aarch64_buffer_set_tensor (buffer=0x5555581abd90,
tensor=0x5555564b2700, data=0x5555564d0150, offset=0, size=155648)
at /home/benuix/codes/llama.cpp/ggml/src/ggml-cpu/ggml-cpu-aarch64.cpp:5632
#1 0x00007ffff772ac03 in ggml_backend_tensor_set (tensor=0x5555564b2700, data=0x5555564d0150,
offset=0, size=155648) at /home/benuix/codes/llama.cpp/ggml/src/ggml-backend.cpp:268
#2 0x00007ffff7b6dcc7 in operator() (__closure=0x7fffffff94b0, orig=0x555555b53a80, dev=0x5555564b2700)
at /home/benuix/codes/llama.cpp/src/llama-adapter.cpp:316
#3 0x00007ffff7b6efc0 in llama_adapter_lora_init_impl (model=...,
path_lora=0x555555adf860 "/home/benuix/.config/LLMUnity/models/Qwen2-0.5B-Instruct-ru-lora.gguf",
adapter=...) at /home/benuix/codes/llama.cpp/src/llama-adapter.cpp:321
#4 0x00007ffff7b6f619 in llama_adapter_lora_init (model=0x555555b13930,
path_lora=0x555555adf860 "/home/benuix/.config/LLMUnity/models/Qwen2-0.5B-Instruct-ru-lora.gguf")
at /home/benuix/codes/llama.cpp/src/llama-adapter.cpp:333
#5 0x000055555582ab0a in common_init_from_params (params=...)
at /home/benuix/codes/llama.cpp/common/common.cpp:993
#6 0x0000555555645333 in server_context::load_model (this=0x7fffffffc370, params=...)
at /home/benuix/codes/llama.cpp/examples/server/server.cpp:1849
#7 0x000055555560127d in main (argc=11, argv=0x7fffffffdb28)
at /home/benuix/codes/llama.cpp/examples/server/server.cpp:4488