Misc. bug: Server crash with use of lora on CPU #12587

@amakropoulos

Description

Name and Version

version: 4960 (fd7855f)
built with cc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0 for x86_64-linux-gnu

Operating systems

Linux

Which llama.cpp modules do you know to be affected?

llama-server

Command line

llama-server -m qwen2-0_5b-instruct-q4_k_m.gguf -c 8192 -b 512 -np 1 --lora Qwen2-0.5B-Instruct-ru-lora.gguf

Problem description & steps to reproduce

The server crashes with a segmentation fault when a LoRA adapter is used on the latest master, running on CPU (AVX2).

Model: qwen2-0_5b-instruct-q4_k_m.gguf
Lora: Qwen2-0.5B-Instruct-ru-lora.gguf

I bisected the issue to commit 3d82dbc.
Reverting that commit fixes the problem.

First Bad Commit

3d82dbc

Relevant log output

Stacktrace:

llama_adapter_lora_init_impl: loading lora adapter from '/home/benuix/.config/LLMUnity/models/Qwen2-0.5B-Instruct-ru-lora.gguf' ...
llama_adapter_lora_init_impl: CPU_Mapped LoRA buffer size =    14.67 MiB
llama_adapter_lora_init_impl: CPU_AARCH64 LoRA buffer size =     2.11 MiB

Thread 1 "llama-server" received signal SIGSEGV, Segmentation fault.
0x00007ffff7ecc32a in ggml_backend_cpu_aarch64_buffer_set_tensor (buffer=0x5555581abd90, tensor=0x5555564b2700, data=0x5555564d0150, offset=0, size=155648) at /home/benuix/codes/llama.cpp/ggml/src/ggml-cpu/ggml-cpu-aarch64.cpp:5632
5632	    auto OK            = tensor_traits->repack(tensor, data, size);
(gdb) where
#0  0x00007ffff7ecc32a in ggml_backend_cpu_aarch64_buffer_set_tensor (buffer=0x5555581abd90, 
    tensor=0x5555564b2700, data=0x5555564d0150, offset=0, size=155648)
    at /home/benuix/codes/llama.cpp/ggml/src/ggml-cpu/ggml-cpu-aarch64.cpp:5632
#1  0x00007ffff772ac03 in ggml_backend_tensor_set (tensor=0x5555564b2700, data=0x5555564d0150, 
    offset=0, size=155648) at /home/benuix/codes/llama.cpp/ggml/src/ggml-backend.cpp:268
#2  0x00007ffff7b6dcc7 in operator() (__closure=0x7fffffff94b0, orig=0x555555b53a80, dev=0x5555564b2700)
    at /home/benuix/codes/llama.cpp/src/llama-adapter.cpp:316
#3  0x00007ffff7b6efc0 in llama_adapter_lora_init_impl (model=..., 
    path_lora=0x555555adf860 "/home/benuix/.config/LLMUnity/models/Qwen2-0.5B-Instruct-ru-lora.gguf", 
    adapter=...) at /home/benuix/codes/llama.cpp/src/llama-adapter.cpp:321
#4  0x00007ffff7b6f619 in llama_adapter_lora_init (model=0x555555b13930, 
    path_lora=0x555555adf860 "/home/benuix/.config/LLMUnity/models/Qwen2-0.5B-Instruct-ru-lora.gguf")
    at /home/benuix/codes/llama.cpp/src/llama-adapter.cpp:333
#5  0x000055555582ab0a in common_init_from_params (params=...)
    at /home/benuix/codes/llama.cpp/common/common.cpp:993
#6  0x0000555555645333 in server_context::load_model (this=0x7fffffffc370, params=...)
    at /home/benuix/codes/llama.cpp/examples/server/server.cpp:1849
#7  0x000055555560127d in main (argc=11, argv=0x7fffffffdb28)
    at /home/benuix/codes/llama.cpp/examples/server/server.cpp:4488
