starcoder -- not enough space in the context's memory pool #158

@bluecoconut

I'm getting errors with starcoder models whenever the prompt contains more than a handful of tokens. This happens with both the raw model (direct .bin) and the quantized model, regardless of version (before and after the Q4/Q5 changes).

Relevant error:

ggml_new_tensor_impl: not enough space in the context's memory pool (needed 412241472, available 411790368)

Example:

./build/bin/starcoder -m /workspaces/research/models/starcoder/starcoder-ggml.bin -p "def fibo( fibo fib fibo test waterfibo test waterfibo test waterfibo test waterfibo test waterfibo test waterfibo test waterfibo test waterfibo test " --top_k 0 --top_p 0.95 --temp 0.2 

This produces the following output, ending in the error and a segmentation fault:

main: seed = 1684223471
starcoder_model_load: loading model from '/workspaces/research/models/starcoder/starcoder-ggml.bin'
starcoder_model_load: n_vocab = 49152
starcoder_model_load: n_ctx   = 8192
starcoder_model_load: n_embd  = 6144
starcoder_model_load: n_head  = 48
starcoder_model_load: n_layer = 40
starcoder_model_load: ftype   = 1
starcoder_model_load: qntvr   = 0
starcoder_model_load: ggml ctx size = 51276.47 MB
starcoder_model_load: memory size = 15360.00 MB, n_mem = 327680
starcoder_model_load: model size  = 35916.23 MB
main: prompt: 'def fibo( fibo fib fibo test waterfibo test waterfibo test waterfibo test waterfibo test waterfibo test waterfibo test waterfibo test waterfibo test '
main: number of tokens in prompt = 51, first 8 tokens: 589 28176 97 26 28176 97 28176 28176 

def fibo( fibo fib fibo test waterfibo test waterfibo test waterfibo test waterfibo test waterfibo test waterfibo test waterfibo test waterfibggml_new_tensor_impl: not enough space in the context's memory pool (needed 412241472, available 411952576)
Segmentation fault (core dumped)
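
A quick sanity check on the sizes reported in this log may help when reading it. The sketch below recomputes `n_mem` and `memory size` from the printed hyperparameters. The formulas are assumptions inferred from the numbers themselves (one cache slot per layer and context position, keys plus values, 4 bytes per element), not taken from the starcoder example source.

```python
# Hyperparameters copied from the starcoder_model_load lines above.
n_ctx = 8192
n_embd = 6144
n_layer = 40

# Assumption: n_mem counts one slot per (layer, context position).
n_mem = n_layer * n_ctx

# Assumption: keys and values each hold n_mem * n_embd elements at
# 4 bytes per element. Inferred from the log, not read from the source.
BYTES_PER_ELEMENT = 4
memory_bytes = 2 * n_mem * n_embd * BYTES_PER_ELEMENT
memory_mb = memory_bytes / (1024 * 1024)

# Sizes printed by the loader, in MB.
model_mb = 35916.23
ctx_mb = 51276.47
overhead_mb = ctx_mb - (model_mb + memory_mb)

print(f"n_mem = {n_mem}")
print(f"memory size = {memory_mb:.2f} MB")
print(f"ctx size minus (model + memory) = {overhead_mb:.2f} MB")
```

Under these assumptions the computed `n_mem` and `memory size` match the logged values, and the `ggml ctx size` is the model size plus the memory size plus a small remainder. The pool that overflows is only about 393 MiB, so it appears to be a different, much smaller buffer than the one sized here.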

Here is the output of the same run with the quantized model:

vscode ➜ /workspaces/research/others/ggml (master) $ ./build/bin/starcoder -m /workspaces/research/models/starcoder/starcoder-ggml-q4_1.bin -p "def fibo( fibo fib fibo test waterfibo test waterfibo test waterfibo test waterfibo test waterfibo test waterfibo test waterfibo test waterfibo test " --top_k 0 --top_p 0.95 --temp 0.2
main: seed = 1684223600
starcoder_model_load: loading model from '/workspaces/research/models/starcoder/starcoder-ggml-q4_1.bin'
starcoder_model_load: n_vocab = 49152
starcoder_model_load: n_ctx   = 8192
starcoder_model_load: n_embd  = 6144
starcoder_model_load: n_head  = 48
starcoder_model_load: n_layer = 40
starcoder_model_load: ftype   = 1003
starcoder_model_load: qntvr   = 1
starcoder_model_load: ggml ctx size = 28956.47 MB
starcoder_model_load: memory size = 15360.00 MB, n_mem = 327680
starcoder_model_load: model size  = 13596.23 MB
main: prompt: 'def fibo( fibo fib fibo test waterfibo test waterfibo test waterfibo test waterfibo test waterfibo test waterfibo test waterfibo test waterfibo test '
main: number of tokens in prompt = 51, first 8 tokens: 589 28176 97 26 28176 97 28176 28176 

def fibo( fibo fib fibo test waterfibo test waterfibo test waterfibo test waterfibo test waterfibo test waterfibo test waterfibo test waterfibggml_new_tensor_impl: not enough space in the context's memory pool (needed 412241472, available 411790368)
Segmentation fault (core dumped)
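
To put the two failures side by side, here is a small sketch that computes how far short the pool was in each run. The byte counts are copied from the two `ggml_new_tensor_impl` messages; the run labels are just descriptive names chosen for this example.

```python
# Byte counts copied from the two ggml_new_tensor_impl messages above.
needed = 412241472
available = {
    "unquantized run": 411952576,
    "q4_1 run": 411790368,
}

# Shortfall per run, in bytes.
shortfalls = {label: needed - avail for label, avail in available.items()}

for label, shortfall in shortfalls.items():
    print(f"{label}: short by {shortfall} bytes ({shortfall / 1024:.1f} KiB)")

print(f"needed: {needed / (1024 * 1024):.2f} MiB")
```

Both runs request exactly the same number of bytes and miss by less than half a MiB. That suggests (but does not prove) that the failing allocation lives in a buffer whose size does not depend on whether the weights are quantized.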

The closest earlier report I could find is ggml-org/llama.cpp#29.

Perhaps that was fixed for the llama models, but the problem has returned for starcoder?

Based on: #146

I'm hoping @NouamaneTazi might be able to shed some light on why this is happening.
