train_text_to_image.py: KeyError: 'shortest_edge' when resuming training from a local SD 1.5 model
Describe the bug
I am unable to resume training of a locally saved SD 1.5 model (trained with the same train_text_to_image.py script, though likely an earlier revision). Loading the CLIP image processor (the feature_extractor component of the pipeline) fails with an error. Here is the CLIP configuration:
And the model index:
{
  "_class_name": "StableDiffusionPipeline",
  "_diffusers_version": "0.27.0.dev0",
  "_name_or_path": "/content/drive/MyDrive/SDTest",
  "feature_extractor": ["transformers", "CLIPImageProcessor"],
  "image_encoder": [null, null],
  "requires_safety_checker": true,
  "safety_checker": ["stable_diffusion", "StableDiffusionSafetyChecker"],
  "scheduler": ["diffusers", "PNDMScheduler"],
  "text_encoder": ["transformers", "CLIPTextModel"],
  "tokenizer": ["transformers", "CLIPTokenizer"],
  "unet": ["diffusers", "UNet2DConditionModel"],
  "vae": ["diffusers", "AutoencoderKL"]
}
Reproduction
%%shell
export MODEL_NAME='path/to/localy/saved model'
export TRAIN_DIR='/content/Dataset'

accelerate launch --mixed_precision="fp16" train_text_to_image.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --dataset_name=$TRAIN_DIR \
  --use_ema \
  --resolution=512 --random_flip \
  --resume_from_checkpoint='latest' \
  --train_batch_size=8 \
  --caption_column='text' \
  --max_train_steps=15000 \
  --learning_rate=1e-05 \
  --max_grad_norm=1 \
  --lr_scheduler='constant' --lr_warmup_steps=0 \
  --report_to='wandb' \
  --output_dir=$MODEL_NAME
Logs
Loading pipeline components...: 57% 4/7 [00:19<00:14, 5.00s/it]
Traceback (most recent call last):
  File "/content/diffusers/examples/text_to_image/train_text_to_image.py", line 1123, in <module>
    main()
  File "/content/diffusers/examples/text_to_image/train_text_to_image.py", line 1079, in main
    pipeline = StableDiffusionPipeline.from_pretrained(
  File "/usr/local/lib/python3.10/dist-packages/huggingface_hub/utils/_validators.py", line 118, in _inner_fn
    return fn(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/diffusers/pipelines/pipeline_utils.py", line 1265, in from_pretrained
    loaded_sub_model = load_sub_model(
  File "/usr/local/lib/python3.10/dist-packages/diffusers/pipelines/pipeline_utils.py", line 533, in load_sub_model
    loaded_sub_model = load_method(os.path.join(cached_folder, name), **loading_kwargs)
  File "/usr/local/lib/python3.10/dist-packages/transformers/image_processing_utils.py", line 207, in from_pretrained
    return cls.from_dict(image_processor_dict, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/transformers/image_processing_utils.py", line 413, in from_dict
    image_processor = cls(**image_processor_dict)
  File "/usr/local/lib/python3.10/dist-packages/transformers/models/clip/image_processing_clip.py", line 127, in __init__
    self.size = {"height": size["shortest_edge"], "width": size["shortest_edge"]}
KeyError: 'shortest_edge'
Steps: 26000it [00:22, ?it/s]
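The traceback shows CLIPImageProcessor.__init__ indexing size["shortest_edge"], so the saved image-processor config presumably has a size entry without that key. A minimal sketch for checking this on disk follows. It assumes the config lives at feature_extractor/preprocessor_config.json inside the saved model directory (the usual layout, but not shown in this report); diagnose_size and inspect_feature_extractor are hypothetical helper names, not part of diffusers or transformers.

```python
import json
from pathlib import Path


def diagnose_size(config: dict) -> str:
    """Describe the shape of the `size` entry in an image-processor config dict."""
    size = config.get("size")
    if isinstance(size, dict) and "shortest_edge" in size:
        return "size has 'shortest_edge'"
    if isinstance(size, dict) and {"height", "width"} <= size.keys():
        return "size has 'height'/'width' but no 'shortest_edge'"
    return f"size has an unexpected form: {size!r}"


def inspect_feature_extractor(model_dir: str) -> str:
    """Read the saved image-processor config and report on its `size` entry.

    Assumption: the file is <model_dir>/feature_extractor/preprocessor_config.json.
    """
    config_path = Path(model_dir) / "feature_extractor" / "preprocessor_config.json"
    config = json.loads(config_path.read_text())
    report = diagnose_size(config)
    print(f"{config_path}: {report}")
    print("config keys:", sorted(config))
    return report
```

Calling inspect_feature_extractor("/content/drive/MyDrive/SDTest") (the path from model_index.json above) would print which form the saved size entry takes, which is the information the missing CLIP configuration in this report would have provided.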
System Info
- diffusers version: 0.27.0.dev0
- Platform: Linux-6.1.58+-x86_64-with-glibc2.35
- Python version: 3.10.12
- PyTorch version (GPU?): 2.1.0+cu121 (True)
- Huggingface_hub version: 0.20.3
- Transformers version: 4.38.1
- Accelerate version: 0.27.2
- xFormers version: not installed
- Using GPU in script?: No
- Using distributed or parallel set-up in script?: No
Who can help?
@sayak
Does this happen with the latest version of the script? I am unable to reproduce the problem. Make sure you're using the latest stable releases of accelerate, diffusers, and transformers.
Also, since the error is raised inside the CLIP image processor in transformers, this issue probably belongs in the transformers repository.
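To check which releases are actually installed in the training environment, something like the following sketch may help. It uses only the standard library; installed_versions and looks_like_dev_build are hypothetical helper names, and the ".dev" check is a simple heuristic for spotting development builds such as the 0.27.0.dev0 reported above, not a full version parser.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages=("accelerate", "diffusers", "transformers")) -> dict:
    """Map each package name to its installed version, or None if it is not installed."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


def looks_like_dev_build(ver) -> bool:
    """Heuristic: treat any version string containing '.dev' as a development build."""
    return ver is not None and ".dev" in ver


for name, ver in installed_versions().items():
    if ver is None:
        note = "not installed"
    elif looks_like_dev_build(ver):
        note = f"{ver} (development build, not a stable release)"
    else:
        note = ver
    print(f"{name}: {note}")
```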