### Describe the bug
A few Stable Diffusion 1.5 models use a prediction type of `v_prediction` rather than `epsilon`. In version 0.27.0, `StableDiffusionPipeline.from_single_file()` correctly detected and rendered images from such models. However, in version 0.30.0, these models are always treated as `epsilon`, even when the correct `prediction_type` and `original_config` arguments are set.

### Reproduction
You will need to download the original config file, EasyFluffV11.yaml, into the current directory for this to work. After running, the file `sushi.png` will show incorrect rendering.
```python
from diffusers import StableDiffusionPipeline
import torch

model_id = 'https://huggingface.co/zatochu/EasyFluff/blob/main/EasyFluffV11.safetensors'
yaml_path = './EasyFluffV11.yaml'

pipe = StableDiffusionPipeline.from_single_file(
    model_id,
    original_config=yaml_path,
    prediction_type='v_prediction',
    torch_dtype=torch.float16,
).to("cuda")

prompt = "banana sushi"
image = pipe(prompt, num_inference_steps=25).images[0]
image.save("sushi.png")
```
### Logs
```
Fetching 11 files: 100%|█████████████| 11/11 [00:00<00:00, 7330.37it/s]
Loading pipeline components...: 0%| | 0/6 [00:00<?, ?it/s]Some weights of the model checkpoint were not used when initializing CLIPTextModel:
['text_model.embeddings.position_ids']
Loading pipeline components...: 100%|█████████████| 6/6 [00:00<00:00, 26.26it/s]
You have disabled the safety checker for <class 'diffusers.pipelines.stable_diffusion.pipeline_stable_diffusion.StableDiffusionPipeline'> by passing `safety_checker=None`. Ensure that you abide to the conditions of the Stable Diffusion license and do not expose unfiltered results in services or applications open to the public. Both the diffusers team and Hugging Face strongly recommend to keep the safety filter enabled in all public facing circumstances, disabling it only for use-cases that involve analyzing network behavior or auditing its results. For more information, please have a look at https://github.com/huggingface/diffusers/pull/254 .
100%|█████████████| 25/25 [00:01<00:00, 16.72it/s]
```
### System Info
- 🤗 Diffusers version: 0.30.0
- Platform: Linux-5.15.0-113-generic-x86_64-with-glibc2.35
- Running on Google Colab?: No
- Python version: 3.10.12
- PyTorch version (GPU?): 2.2.2+cu121 (True)
- Flax version (CPU?/GPU?/TPU?): not installed (NA)
- Jax version: not installed
- JaxLib version: not installed
- Huggingface_hub version: 0.23.5
- Transformers version: 4.41.1
- Accelerate version: 0.31.0
- PEFT version: 0.11.1
- Bitsandbytes version: not installed
- Safetensors version: 0.4.3
- xFormers version: 0.0.25.post1
- Accelerator: NVIDIA GeForce RTX 4070, 12282 MiB
- Using GPU in script?:
- Using distributed or parallel set-up in script?:
### Who can help?
@yiyixuxu @asomoza
DN6 commented on Aug 14, 2024
@lstein Can you share the image outputs from v0.29.2 and v0.30.0?
DN6 commented on Aug 14, 2024
By any chance do you have `runwayml/stable-diffusion-v1-5` saved in your HF Cache directory?

lstein commented on Aug 14, 2024
My bad. The regression is present in 0.29.2 as well. The previous working version was 0.27.0. I have amended the bug report.
Here is the output from the script run with diffusers 0.27.0 vs 0.30.0. Also note the difference in image size. 0.27.0 apparently thinks this is an sd-2 model.
0.27.0


0.30.0
lstein commented on Aug 14, 2024
Indeed yes. I've seen that `from_single_file()` downloads it into the cache if it isn't there already. This seems to be the way it gets the per-component `config.json` files for the base model of the checkpoint file being loaded.

DN6 commented on Aug 16, 2024
Hi @lstein yes, we updated single file loading to rely on the model cache/configs to set up the pipelines. It enables us to support single file loading for a larger range of models. The `prediction_type` argument is deprecated and will be removed eventually, although we should show a warning here; I will open a PR for it.

I noticed that the scheduler in the repo you linked does contain a config that sets `v_prediction`. You can configure your pipeline in the following way to enable correct inference.

lstein commented on Aug 17, 2024
I'm a developer of InvokeAI, and am trying to support users who import arbitrary `.safetensors` models, so it will be difficult to find a general mechanism to identify the diffusers model with a config that matches what the safetensors file needs. Can you suggest how to do this?

DN6 commented on Aug 19, 2024
In most cases we can auto match to the appropriate config, provided that the `.safetensors` file is in the original format and not the diffusers format. If you check the keys of the single file checkpoint and the diffusers checkpoints, you will notice that the keys are different.

In this particular case you're setting the `prediction_type` argument anyway, since the YAML configs do not contain that information either. You could configure a scheduler beforehand with the prediction type and set it in the pipeline, e.g.:
`from_single_file` operates on the assumption that you are trying to load a checkpoint saved in the original format. We could update/add a util function in `diffusers.loaders.single_file_utils` that raises an error if we can't match to an appropriate config. The current behaviour is to default to SD 1.5, which can be confusing.

Do you happen to have a list of models that would need to support these arbitrary `.safetensors` files? Just so I understand your requirements a bit better?

yiyixuxu commented on Aug 19, 2024
the yaml file does specify the `v_prediction` though: https://huggingface.co/zatochu/EasyFluff/blob/main/EasyFluffV11.yaml#L5

should we consider adding a special check for this config when a yaml is passed? I think this is really an edge case where a fine-tuned checkpoint can have a different configuration from the base checkpoint
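For reference, original-format LDM YAML configs mark v-prediction checkpoints with a `parameterization: "v"` entry, which is what the linked line contains. A dependency-free sketch of how a loader could map that flag to a diffusers `prediction_type` — the function name is illustrative, not part of the diffusers API:

```python
def prediction_type_from_ldm_yaml(yaml_text: str) -> str:
    """Map an original LDM YAML config to a diffusers prediction_type.

    Original-format configs flag v-prediction checkpoints with a
    `parameterization: "v"` entry; anything else falls back to epsilon.
    """
    for line in yaml_text.splitlines():
        key, _, value = line.strip().partition(":")
        if key == "parameterization" and value.strip().strip("\"'") == "v":
            return "v_prediction"
    return "epsilon"


# Fragment resembling the linked EasyFluffV11.yaml (illustrative, not a copy).
example = """\
model:
  params:
    parameterization: "v"
    linear_start: 0.00085
"""
print(prediction_type_from_ldm_yaml(example))  # -> v_prediction
```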
DN6 commented on Aug 20, 2024
Ah my bad. Missed that. But even in earlier versions, we relied on the `prediction_type` argument to configure the scheduler. It wasn't set from the YAML.

diffusers/src/diffusers/loaders/single_file_utils.py
Line 1546 in b69fd99
In the current version, setting via `prediction_type` only works if `local_files_only=True`.
The reasoning was to encourage setting the prediction type via the Scheduler object and passing that object to the pipeline, like we do for `from_pretrained`. I think I missed this potential path during the refactor, so it is a breaking change. We can add additional checks for legacy kwargs and update the loading, but these kwargs are slated to be removed and this is a bit of an edge case. I would recommend following the same configuration process as `from_pretrained` when doing single file loading: configure the scheduler object beforehand, or use the `config` argument.

yiyixuxu commented on Aug 20, 2024
@lstein
can you let us know if the solution @DN6 proposed here works for you? #9171 (comment)
DN6 commented on Aug 20, 2024
PR to address the current issue: #9229
github-actions commented on Sep 14, 2024
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.
DN6 commented on Jan 16, 2025
Closing since #9229 was merged to fix the issue.