Reading GGUF metadata with gguf-dump.py does not work for i-quants #5809

@countzero

Description

The gguf-dump.py script in the llama.cpp release b2297 is missing support for i-quants.

Steps to reproduce

  1. Create or download a GGUF file in any IQ* format (e.g., miqu-1-70b-Requant-b2131-iMat-c32_ch400-IQ1_S_v3.gguf)
  2. Copy the file to .\models\miqu-1-70b-sf.IQ1_S.gguf
  3. Execute the following command:
python .\gguf-py\scripts\gguf-dump.py --no-tensors .\models\miqu-1-70b-sf.IQ1_S.gguf
  4. See the error:
ValueError: 19 is not a valid GGMLQuantizationType
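
This points at the GGMLQuantizationType enum in gguf-py/gguf/constants.py, which at b2297 stops at the k-quants, while ggml.h already defines the i-quant type IDs (type 19 is IQ1_S there, matching the file above). A minimal sketch of what the extended enum could look like; the i-quant members and their values are assumptions based on enum ggml_type in ggml.h and should be verified against that header:

    from enum import IntEnum

    class GGMLQuantizationType(IntEnum):
        F32     = 0
        F16     = 1
        Q4_0    = 2
        Q4_1    = 3
        # 4 and 5 belonged to the removed Q4_2 / Q4_3 formats
        Q5_0    = 6
        Q5_1    = 7
        Q8_0    = 8
        Q8_1    = 9
        Q2_K    = 10
        Q3_K    = 11
        Q4_K    = 12
        Q5_K    = 13
        Q6_K    = 14
        Q8_K    = 15
        # i-quants missing at b2297 (assumed to mirror ggml.h;
        # later releases add further IQ types after these)
        IQ2_XXS = 16
        IQ2_XS  = 17
        IQ3_XXS = 18
        IQ1_S   = 19  # the type ID from the ValueError above
        IQ4_NL  = 20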

Expected behaviour

I expect the Python gguf-py library to support every quantization type that a GGUF file can contain, including the i-quants.

Working example for k-quants:

python .\gguf-py\scripts\gguf-dump.py --no-tensors .\models\miqu-1-70b-sf.Q5_K_M.gguf
* Loading: .\models\miqu-1-70b-sf.Q5_K_M.gguf
* File is LITTLE endian, script is running on a LITTLE endian host.

* Dumping 26 key/value pair(s)
      1: UINT32     |        1 | GGUF.version = 3
      2: UINT64     |        1 | GGUF.tensor_count = 723
      3: UINT64     |        1 | GGUF.kv_count = 23
      4: STRING     |        1 | general.architecture = 'llama'
      5: STRING     |        1 | general.name = 'R:\\AI\\LLM\\source'
      6: UINT32     |        1 | llama.context_length = 32764
      7: UINT32     |        1 | llama.embedding_length = 8192
      8: UINT32     |        1 | llama.block_count = 80
      9: UINT32     |        1 | llama.feed_forward_length = 28672
     10: UINT32     |        1 | llama.rope.dimension_count = 128
     11: UINT32     |        1 | llama.attention.head_count = 64
     12: UINT32     |        1 | llama.attention.head_count_kv = 8
     13: FLOAT32    |        1 | llama.attention.layer_norm_rms_epsilon = 9.999999747378752e-06
     14: FLOAT32    |        1 | llama.rope.freq_base = 1000000.0
     15: UINT32     |        1 | general.file_type = 17
     16: STRING     |        1 | tokenizer.ggml.model = 'llama'
     17: [STRING]   |    32000 | tokenizer.ggml.tokens
     18: [FLOAT32]  |    32000 | tokenizer.ggml.scores
     19: [INT32]    |    32000 | tokenizer.ggml.token_type
     20: UINT32     |        1 | tokenizer.ggml.bos_token_id = 1
     21: UINT32     |        1 | tokenizer.ggml.eos_token_id = 2
     22: UINT32     |        1 | tokenizer.ggml.padding_token_id = 0
     23: BOOL       |        1 | tokenizer.ggml.add_bos_token = True
     24: BOOL       |        1 | tokenizer.ggml.add_eos_token = False
     25: STRING     |        1 | tokenizer.chat_template = "{{ bos_token }}{% for message in messages %}{% if (message['"
     26: UINT32     |        1 | general.quantization_version = 2

Use-Case

I am extracting the metadata from a given GGUF model to automatically calculate optimal runtime arguments for the llama.cpp server in the following PowerShell script: https://github.com/countzero/windows_llama.cpp/blob/v1.12.0/examples/server.ps1#L104
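
Once the enum knows the i-quant IDs, the same extraction can be done directly in Python with gguf-py's GGUFReader, which the dump script uses internally. A small sketch; the "last part of a field holds the scalar value" detail follows how gguf-dump.py reads fields and is an assumption worth double-checking against gguf_reader.py:

    # read_runtime_hints.py -- sketch: pull a few scalar metadata fields
    # with gguf.GGUFReader to derive server arguments. Assumes the i-quant
    # enum fix sketched above is in place; otherwise GGUFReader raises the
    # same ValueError while parsing the tensor types of an IQ* file.
    from gguf import GGUFReader

    def read_scalar(reader: GGUFReader, key: str) -> int:
        # For scalar fields, the last part of the field holds the value
        # (the same access pattern gguf-dump.py uses).
        return int(reader.fields[key].parts[-1][0])

    reader = GGUFReader(r'.\models\miqu-1-70b-sf.IQ1_S.gguf')
    n_ctx = read_scalar(reader, 'llama.context_length')
    n_kv  = read_scalar(reader, 'llama.attention.head_count_kv')
    print(f'trained context: {n_ctx}, KV heads: {n_kv}')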

Question

@ggerganov Is there another way to dump only the metadata from a given GGUF model? Perhaps this could be an --inspect option of the gguf binary?
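
For what it's worth, the metadata block sits at the front of a GGUF file, before any tensor data, so it can be dumped without decoding quantization types at all. Below is a minimal standalone sketch that parses only the header and key/value section; the field layout and value-type codes follow the GGUF v2/v3 specification, and the script name is made up:

    # gguf_metadata_only.py -- sketch of a dependency-free metadata dump.
    # Parses the GGUF header and KV section per the GGUF v2/v3 spec and
    # stops before the tensor info, so unknown tensor types cannot break it.
    import struct
    import sys

    def read_str(f):
        (n,) = struct.unpack('<Q', f.read(8))
        return f.read(n).decode('utf-8', errors='replace')

    def read_value(f, vtype):
        # Scalar value types per the GGUF spec: code -> (struct format, size)
        simple = {
            0: ('<B', 1), 1: ('<b', 1),    # UINT8, INT8
            2: ('<H', 2), 3: ('<h', 2),    # UINT16, INT16
            4: ('<I', 4), 5: ('<i', 4),    # UINT32, INT32
            6: ('<f', 4), 7: ('<?', 1),    # FLOAT32, BOOL
            10: ('<Q', 8), 11: ('<q', 8),  # UINT64, INT64
            12: ('<d', 8),                 # FLOAT64
        }
        if vtype in simple:
            fmt, size = simple[vtype]
            return struct.unpack(fmt, f.read(size))[0]
        if vtype == 8:   # STRING
            return read_str(f)
        if vtype == 9:   # ARRAY: item type (uint32), count (uint64), items
            (itype,) = struct.unpack('<I', f.read(4))
            (count,) = struct.unpack('<Q', f.read(8))
            return [read_value(f, itype) for _ in range(count)]
        raise ValueError(f'unknown GGUF value type {vtype}')

    def dump_metadata(path):
        with open(path, 'rb') as f:
            if f.read(4) != b'GGUF':
                sys.exit('not a GGUF file')
            version, n_tensors, n_kv = struct.unpack('<IQQ', f.read(20))
            print(f'GGUF v{version}: {n_tensors} tensor(s), {n_kv} KV pair(s)')
            for _ in range(n_kv):
                key = read_str(f)
                (vtype,) = struct.unpack('<I', f.read(4))
                value = read_value(f, vtype)
                if isinstance(value, list) and len(value) > 8:
                    value = f'[{len(value)} items]'
                print(f'{key} = {value}')

    if __name__ == '__main__':
        dump_metadata(sys.argv[1])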
