
Collaboration/Sponsorship: Improving SDXL Inference Performance in stable-diffusion.cpp #772


Description

@JustMaier

We’re exploring stable-diffusion.cpp at Civitai to better serve SDXL requests, but we’ve found inference times still need some work. We’d love to help improve this and are open to sponsoring development or collaborating with contributors here.

@Green-Sky @wbruna @stduhpf - since you’ve done great work on this project, I’d love to hear if you’d be interested in discussing ways to optimize performance together.

We generate millions of images a day, but running raw ComfyUI is inefficient at that scale, and even most Python-based solutions have issues. Our aim is to maximize GPU utilization and reduce model-swap time by pre-loading weights into VRAM, so we can push throughput as high as possible.

In our initial tests, load time is already much better, and we can use a RAM disk to preload models to some degree, but inference times can be roughly double, which wipes out the gains from the improved load times.
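For anyone who wants to reproduce the comparison, here is a rough timing harness sketch. It just shells out to the `sd` CLI and measures wall-clock time; the flags (`-m`, `-p`, `--steps`, `-W`, `-H`) match the project's README, but the model path on a tmpfs RAM disk and the binary location are hypothetical placeholders, so adjust the command for your setup.

```cpp
// Rough wall-clock timing harness for the stable-diffusion.cpp CLI.
// The command string below is an assumption: flags follow the README,
// but the paths (./sd, /mnt/ramdisk/sdxl.safetensors) are placeholders.
#include <chrono>
#include <cstdlib>
#include <iostream>
#include <string>

int main() {
    // Hypothetical: SDXL weights staged on a tmpfs RAM disk to isolate
    // disk I/O from inference time. Swap in a regular path to compare.
    const std::string cmd =
        "./sd -m /mnt/ramdisk/sdxl.safetensors "
        "-p \"a photo of a cat\" --steps 20 -W 1024 -H 1024";

    const auto start = std::chrono::steady_clock::now();
    const int rc = std::system(cmd.c_str());
    const auto end = std::chrono::steady_clock::now();

    const double secs = std::chrono::duration<double>(end - start).count();
    std::cout << "exit=" << rc << " wall=" << secs << "s\n";
    return rc == 0 ? 0 : 1;
}
```

Running this once with the model on a RAM disk and once from regular storage would separate load-time gains from the per-step inference gap we're describing; the CLI's own log output breaks down load vs. sampling time in more detail.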

We're new to the project, don't have any C++ specialists on the team, and honestly don't have the bandwidth to tackle this ourselves, but we'd love to see it done and would be happy to chip in.
