gguf new quant type support (with demo) #12076

calcuis · 2025-08-05T20:20:09Z

not perfect but works; thanks @a-r-r-o-w @DN6

engine:
https://github.com/calcuis/gguf-connector/blob/main/src/gguf_connector/quant2c.py

inference example(s):
https://github.com/calcuis/gguf-connector/blob/main/src/gguf_connector/k6.py
https://github.com/calcuis/gguf-connector/blob/main/src/gguf_connector/k5.py

gguf file sample(s):
https://huggingface.co/calcuis/kontext-gguf/tree/main
https://huggingface.co/calcuis/krea-gguf/tree/main
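
Before loading a downloaded sample, it can help to confirm the file really is GGUF. The sketch below is not part of this PR; it only reads the fixed-size header, assuming the little-endian GGUF v2/v3 layout (4-byte magic, uint32 version, uint64 tensor count, uint64 metadata key/value count). The names `GGUFHeader` and `read_gguf_header` are hypothetical helpers made up for illustration.

```python
import io
import struct
from typing import BinaryIO, NamedTuple


class GGUFHeader(NamedTuple):
    version: int
    tensor_count: int
    metadata_kv_count: int


def read_gguf_header(stream: BinaryIO) -> GGUFHeader:
    """Parse the 24-byte GGUF header (assumes the v2/v3 little-endian layout)."""
    raw = stream.read(24)
    if len(raw) < 24 or raw[:4] != b"GGUF":
        raise ValueError("not a GGUF file (bad magic or truncated header)")
    # uint32 version, uint64 tensor count, uint64 metadata kv count
    version, tensor_count, kv_count = struct.unpack("<IQQ", raw[4:24])
    return GGUFHeader(version, tensor_count, kv_count)


if __name__ == "__main__":
    # In-memory stand-in for a real file; use open(path, "rb") for a download.
    fake = io.BytesIO(b"GGUF" + struct.pack("<IQQ", 3, 2, 1))
    print(read_gguf_header(fake))
```

For a real file, pass `open("flux1-krea-dev-iq4_nl.gguf", "rb")` instead of the in-memory buffer. This only checks the header; it does not say which quant types the tensors use.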

What does this PR do?

Before submitting

Did you read the contributor guideline?
Did you read our philosophy doc (important for complex PRs)?
Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.

calcuis/gguf-connector#3

Did you write any new necessary tests?

You can simply test it with the inference example(s) above or the code below:

import torch
from transformers import T5EncoderModel
from diffusers import FluxPipeline, GGUFQuantizationConfig, FluxTransformer2DModel

model_path = "https://huggingface.co/calcuis/krea-gguf/blob/main/flux1-krea-dev-iq4_nl.gguf"
transformer = FluxTransformer2DModel.from_single_file(
    model_path,
    quantization_config=GGUFQuantizationConfig(compute_dtype=torch.bfloat16),
    torch_dtype=torch.bfloat16,
    config="callgg/krea-decoder",
    subfolder="transformer"
)

# T5 text encoder loaded from a GGUF checkpoint
text_encoder = T5EncoderModel.from_pretrained(
    "chatpig/t5-v1_1-xxl-encoder-fp32-gguf",
    gguf_file="t5xxl-encoder-fp32-q2_k.gguf",
    torch_dtype=torch.bfloat16,
)

# Assemble the pipeline around the GGUF transformer and text encoder
pipe = FluxPipeline.from_pretrained(
    "callgg/krea-decoder",
    transformer=transformer,
    text_encoder_2=text_encoder,
    torch_dtype=torch.bfloat16,
)
pipe.enable_model_cpu_offload()  # or pipe.to("cuda") if the GPU has enough VRAM

prompt = "a pig holding a sign that says hello world"
image = pipe(
    prompt,
    height=1024,
    width=1024,
    guidance_scale=2.5,
).images[0]
image.save("output.png")
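
The sample checkpoint used above is an `iq4_nl` file. As background on what such a quant type involves, here is a minimal pure-Python sketch of decoding one IQ4_NL block. It is not the code in this PR or in `quant2c.py`; it assumes ggml's reference layout as I understand it (32 weights per block, one fp16 scale followed by 16 bytes of packed 4-bit indices into a fixed non-linear table), and the table values should be checked against ggml's sources before relying on them. `dequantize_iq4_nl_block` is a hypothetical name.

```python
import struct
from typing import List

# Non-linear 4-bit codebook for IQ4_NL (assumed to match ggml's reference table).
KVALUES_IQ4NL = (-127, -104, -83, -65, -49, -35, -22, -10,
                 1, 13, 25, 38, 53, 69, 89, 113)
QK4_NL = 32                      # weights per block
BLOCK_BYTES = 2 + QK4_NL // 2    # fp16 scale + 16 bytes of packed indices


def dequantize_iq4_nl_block(block: bytes) -> List[float]:
    """Decode one 18-byte IQ4_NL block into 32 floats."""
    if len(block) != BLOCK_BYTES:
        raise ValueError(f"expected {BLOCK_BYTES} bytes, got {len(block)}")
    (d,) = struct.unpack("<e", block[:2])  # per-block fp16 scale
    qs = block[2:]
    half = QK4_NL // 2
    out = [0.0] * QK4_NL
    for j, byte in enumerate(qs):
        # Low nibble fills the first half of the block, high nibble the second.
        out[j] = d * KVALUES_IQ4NL[byte & 0x0F]
        out[j + half] = d * KVALUES_IQ4NL[byte >> 4]
    return out
```

A real implementation would vectorize this over all blocks of a tensor (for example with torch ops on the GPU); the loop form here is only meant to make the lookup-table step easy to follow.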

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

calcuis added 3 commits August 3, 2025 01:02

Merge pull request #1 from calcuis/calcuis-patch-2

d64bc8c

gguf new quant type support (with demo) - Update utils.py

Merge branch 'huggingface:main' into main

22b171c
