Bug: GGML_ASSERT at ggml/src/ggml-cuda.cu:1257 (to_fp32_cuda != nullptr) when running a BF16 GGUF on CUDA

Closed as not planned
Labels
bug-unconfirmed, medium severity (used to report medium severity bugs in llama.cpp, e.g. malfunctioning features but still usable)
Description
What happened?
./llama-server -ngl 99 -cb -c 65536 -np 32 -m models/Phi-3-mini-128k-instruct/ggml-model-bf16.gguf
...
GGML_ASSERT: ggml/src/ggml-cuda.cu:1257: to_fp32_cuda != nullptr
[New LWP 934430]
[New LWP 934432]
[New LWP 934433]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
0x00007fb1ba523c7f in __GI___wait4 (pid=934542, stat_loc=0x0, options=0, usage=0x0) at ../sysdeps/unix/sysv/linux/wait4.c:27
27 ../sysdeps/unix/sysv/linux/wait4.c: No such file or directory.
#0 0x00007fb1ba523c7f in __GI___wait4 (pid=934542, stat_loc=0x0, options=0, usage=0x0) at ../sysdeps/unix/sysv/linux/wait4.c:27
27 in ../sysdeps/unix/sysv/linux/wait4.c
#1 0x0000559119a6c7eb in ggml_print_backtrace ()
#2 0x000055911992c1b5 in ggml_cuda_op_mul_mat_cublas(ggml_backend_cuda_context&, ggml_tensor const*, ggml_tensor const*, ggml_tensor*, char const*, float const*, char const*, float*, long, long, long, long, CUstream_st*) ()
#3 0x000055911992e781 in ggml_cuda_op_mul_mat(ggml_backend_cuda_context&, ggml_tensor const*, ggml_tensor const*, ggml_tensor*, void (*)(ggml_backend_cuda_context&, ggml_tensor const*, ggml_tensor const*, ggml_tensor*, char const*, float const*, char const*, float*, long, long, long, long, CUstream_st*), void (*)(float const*, void*, long, long, long, long, ggml_type, CUstream_st*)) ()
#4 0x000055911992f7a5 in ggml_cuda_mul_mat(ggml_backend_cuda_context&, ggml_tensor const*, ggml_tensor const*, ggml_tensor*) ()
#5 0x0000559119933cff in ggml_backend_cuda_graph_compute(ggml_backend*, ggml_cgraph*) ()
#6 0x0000559119abb4bb in ggml_backend_sched_graph_compute_async ()
#7 0x0000559119b0d7b0 in llama_decode ()
#8 0x0000559119bcd039 in llama_init_from_gpt_params(gpt_params&) ()
#9 0x0000559119c78495 in server_context::load_model(gpt_params const&) ()
#10 0x0000559119913d7a in main ()
[Inferior 1 (process 934429) detached]
./start_phi.sh: line 1: 934429 Aborted
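
For context: the assert suggests the CUDA backend could not find a BF16 -> FP32 conversion kernel. From the backtrace, ggml_cuda_op_mul_mat_cublas looks up a dequantize-to-FP32 function for the weight tensor's type and asserts that the lookup succeeded; for this BF16 GGUF the lookup evidently returns nullptr. Below is a minimal standalone sketch of that dispatch pattern, not the actual llama.cpp source; the type enum, lookup function, and stub names are invented for illustration.

// Minimal sketch (invented names, NOT llama.cpp source): models a per-type
// lookup of a dequantize-to-FP32 kernel followed by an assert, which is the
// pattern the backtrace points at. A type with no registered converter
// (here, BF16) yields nullptr and aborts, matching the crash above.
#include <cassert>
#include <cstdio>

enum tensor_type { TYPE_F32, TYPE_F16, TYPE_BF16 };

typedef void (*to_fp32_fn)(const void * src, float * dst, int n);

static void f16_to_fp32_stub(const void * /*src*/, float * /*dst*/, int /*n*/) {
    // placeholder for a real conversion kernel
}

// Stand-in for the backend's converter lookup: returns nullptr for types
// with no registered converter, which BF16 appears to hit in this build.
static to_fp32_fn get_to_fp32(tensor_type t) {
    switch (t) {
        case TYPE_F16:  return f16_to_fp32_stub;
        case TYPE_BF16: return nullptr; // no BF16 -> FP32 kernel registered
        default:        return nullptr;
    }
}

int main() {
    to_fp32_fn to_fp32 = get_to_fp32(TYPE_BF16);
    assert(to_fp32 != nullptr); // aborts here, analogous to the GGML_ASSERT above
    printf("unreachable for BF16\n");
    return 0;
}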
Name and Version
./llama-server --version
version: 3265 (72272b8)
built with cc (Ubuntu 9.4.0-1ubuntu1~20.04.2) 9.4.0 for x86_64-linux-gnu
What operating system are you seeing the problem on?
Linux, Windows
Relevant log output
No response
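
If the missing BF16 converter is indeed the cause, two workarounds (untested here) seem plausible: convert the model to F16 first, e.g. ./llama-quantize models/Phi-3-mini-128k-instruct/ggml-model-bf16.gguf models/Phi-3-mini-128k-instruct/ggml-model-f16.gguf f16, or drop GPU offload with -ngl 0, since the CPU backend is supposed to handle BF16.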