Wish to convert flan-t5 model to GGUF format #3393

@niranjanakella

Prerequisites

Please answer the following questions for yourself before submitting an issue.

  • I am running the latest code. Development is very rapid so there are no tagged versions as of now.
  • I carefully followed the README.md.
  • I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
  • I reviewed the Discussions, and have a new bug or useful enhancement to share.

Expected Behavior

I was trying to convert the google/flan-t5-large model to GGUF format using this colab.

I am loading the model this way:

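import torch
from transformers import AutoModelForSeq2SeqLM

model_name = 'google/flan-t5-large'

# cache_dir is assumed to be defined earlier in the notebook
model = AutoModelForSeq2SeqLM.from_pretrained(
    model_name,
    trust_remote_code=True,
    torch_dtype=torch.float16,
    device_map='cpu',
    offload_folder='offload',
    cache_dir=cache_dir
)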

Current Behavior

I know that convert.py currently fails because this type of model isn't supported.
Current error:

Loading model file models/pytorch_model.bin
Traceback (most recent call last):
  File "/content/llama.cpp/convert.py", line 1208, in <module>
    main()
  File "/content/llama.cpp/convert.py", line 1157, in main
    params = Params.load(model_plus)
  File "/content/llama.cpp/convert.py", line 288, in load
    params = Params.loadHFTransformerJson(model_plus.model, hf_config_path)
  File "/content/llama.cpp/convert.py", line 203, in loadHFTransformerJson
    n_embd           = config["hidden_size"]
KeyError: 'hidden_size'
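
The traceback shows convert.py reading `hidden_size` from the model's config.json. T5-family configs name the model dimension `d_model` instead, which would explain the KeyError. Below is a minimal sketch for checking which expected keys a config lacks. The key list beyond `hidden_size`, the helper names, and the sample config are assumptions for illustration, not taken from convert.py.

```python
import json
from pathlib import Path

# Keys a LLaMA-style converter is assumed to read from config.json.
# Only "hidden_size" is confirmed by the traceback; the others are assumptions.
ASSUMED_REQUIRED_KEYS = ("hidden_size", "num_hidden_layers", "num_attention_heads")


def missing_config_keys(config, required=ASSUMED_REQUIRED_KEYS):
    """Return the required keys that are absent from a config dict, in order."""
    return [key for key in required if key not in config]


def load_config(model_dir):
    """Read config.json from a local model directory (hypothetical helper)."""
    return json.loads((Path(model_dir) / "config.json").read_text())


# Illustrative, abbreviated T5-style config; a real file contains more fields.
sample_t5_config = {
    "model_type": "t5",
    "d_model": 1024,
    "num_layers": 24,
    "num_heads": 16,
}

print(missing_config_keys(sample_t5_config))
```

To check a downloaded model, pass `load_config("models")` in place of the sample dict, where `models` is the directory holding config.json.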

Environment and Context

I am currently running all of this in a Google Colab notebook.

  • SDK version, e.g. for Linux:
Python 3.10.12
GNU Make 4.3
Built for x86_64-pc-linux-gnu
g++ (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0

I would appreciate help with this conversion. Can someone please suggest a method to convert this flan-t5 model to GGUF?
