Prerequisites
- I am running the latest code. Mention the version if possible as well.
- I carefully followed the README.md.
- I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
- I reviewed the Discussions, and have a new and useful enhancement to share.
Feature Description
I downloaded nvidia/Llama-3_1-Nemotron-51B-Instruct, but converting it with convert_hf_to_gguf.py fails with this error:
python3 convert_hf_to_gguf.py ~/Llama-3_1-Nemotron-51B-Instruct/ --outfile ~/Llama-3_1-Nemotron-51B-Instruct.f16.gguf --outtype f16
INFO:hf-to-gguf:Loading model: Llama-3_1-Nemotron-51B-Instruct
ERROR:hf-to-gguf:Model DeciLMForCausalLM is not supported
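For context, convert_hf_to_gguf.py appears to select a conversion handler based on the `architectures` field of the model's config.json, so the error above just means no handler is registered for DeciLMForCausalLM yet. A minimal standalone check (a hypothetical helper, not part of llama.cpp) to confirm what the converter sees:

```python
# Hypothetical standalone check, not part of llama.cpp: print the
# Hugging Face architecture name(s) that the converter dispatches on.
import json
from pathlib import Path

def report_architectures(model_dir: str) -> None:
    config_path = Path(model_dir).expanduser() / "config.json"
    config = json.loads(config_path.read_text())
    for arch in config.get("architectures", []):
        print(f"architecture: {arch}")  # expected: DeciLMForCausalLM

report_architectures("~/Llama-3_1-Nemotron-51B-Instruct")
```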
Motivation
Is the DeciLMForCausalLM architecture going to be supported soon? A Q4_0 quantization of this model should fit on a 3090/4090 by offloading a few layers to the CPU (see the rough estimate below), which would be a good use case for llama.cpp.
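As a rough sanity check on that claim (back-of-the-envelope arithmetic with assumed figures, not measurements):

```python
# Back-of-the-envelope Q4_0 size estimate. Assumes ~4.5 bits/weight on
# average to account for quantization scales; not a measured value.
params = 51e9          # ~51B parameters
bits_per_weight = 4.5
size_gib = params * bits_per_weight / 8 / 1024**3
print(f"~{size_gib:.1f} GiB")  # ~26.7 GiB, slightly over a 24 GiB 3090/4090,
                               # hence offloading a few layers to the CPU
```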
Possible Implementation
No response