
Collaboration/Sponsorship: Improving SDXL Inference Performance in stable-diffusion.cpp #772


Description

@JustMaier

We’re exploring stable-diffusion.cpp at Civitai to better serve SDXL requests, but we’ve found inference times still need some work. We’d love to help improve this and are open to sponsoring development or collaborating with contributors here.

@Green-Sky @wbruna @stduhpf - since you’ve done great work on this project, I’d love to hear if you’d be interested in discussing ways to optimize performance together.

We generate millions of images a day, but running raw ComfyUI is inefficient at that scale, and even most Python-based solutions have issues. Our aim is to maximize GPU utilization and reduce model-swap time by pre-loading weights into VRAM, so we can push throughput as high as possible.

In our initial tests, load time is already much better, and we can use a RAM disk to preload models to some degree, but inference times can be roughly double, which wipes out the gains from the improved load times.
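For anyone who wants to reproduce the comparison, here is a rough timing harness sketch. It just shells out to the `sd` CLI and measures wall-clock time; the flags (`-m`, `-p`, `--steps`, `-W`, `-H`) match the project's README, but the model path on a tmpfs RAM disk and the binary location are hypothetical placeholders, so adjust the command for your setup.

```cpp
// Rough wall-clock timing harness for the stable-diffusion.cpp CLI.
// The command string below is an assumption: flags follow the README,
// but the paths (./sd, /mnt/ramdisk/sdxl.safetensors) are placeholders.
#include <chrono>
#include <cstdlib>
#include <iostream>
#include <string>

int main() {
    // Hypothetical: SDXL weights staged on a tmpfs RAM disk to isolate
    // disk I/O from inference time. Swap in a regular path to compare.
    const std::string cmd =
        "./sd -m /mnt/ramdisk/sdxl.safetensors "
        "-p \"a photo of a cat\" --steps 20 -W 1024 -H 1024";

    const auto start = std::chrono::steady_clock::now();
    const int rc = std::system(cmd.c_str());
    const auto end = std::chrono::steady_clock::now();

    const double secs = std::chrono::duration<double>(end - start).count();
    std::cout << "exit=" << rc << " wall=" << secs << "s\n";
    return rc == 0 ? 0 : 1;
}
```

Running this once with the model on a RAM disk and once from regular storage would separate load-time gains from the per-step inference gap we're describing; the CLI's own log output breaks down load vs. sampling time in more detail.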

We're new to the project, don't have any C++ specialists on the team, and honestly don't have the bandwidth to tackle this ourselves, but we'd love to see it done and would be happy to chip in.
