Fix: Support --train_text_encoder in train_dreambooth_sd3.py by handling None tokenizers/text_input_ids #12376
What does this PR do?

This PR fixes the `--train_text_encoder` flag in `train_dreambooth_sd3.py`, which previously failed during training because the `encode_prompt` and `_encode_prompt_with_t5` functions did not properly handle cases where `tokenizers` or `text_input_ids` are `None`.

When `--train_text_encoder` is enabled, the training pipeline pre-tokenizes the prompts and passes `text_input_ids_list` directly to avoid re-tokenizing at every step, but the original code assumed `tokenizers` was always available, causing crashes.
This PR:

✅ Makes `_encode_prompt_with_t5` and `encode_prompt` robust to a `None` tokenizer by using the pre-tokenized `text_input_ids` as a fallback (see the sketch above)
✅ Ensures compatibility between the training and inference code paths
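With that guard in place, the training loop can skip tokenization entirely. A rough sketch of the call site under these changes, assuming the usual three-encoder SD3 setup; the variable names (`tokens_one`, etc.) are hypothetical placeholders for the pre-tokenized batches:

```python
# Training path: pass None tokenizers plus pre-tokenized ids,
# so nothing is re-tokenized at every optimization step.
prompt_embeds, pooled_prompt_embeds = encode_prompt(
    text_encoders=[text_encoder_one, text_encoder_two, text_encoder_three],
    tokenizers=[None, None, None],
    prompt=args.instance_prompt,
    text_input_ids_list=[tokens_one, tokens_two, tokens_three],
)
```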
Fixes #8507