diffusers train_text_to_image.py: KeyError: 'shortest_edge' when resuming training from a local SD 1.5 model

Describe the bug

I am unable to resume training of a locally saved SD 1.5 model (trained with the same train_text_to_image.py script, though likely an earlier revision). An error is raised while loading the CLIP component of the model pipeline.

Here is the CLIP configuration:

And the model index:

{
  "_class_name": "StableDiffusionPipeline",
  "_diffusers_version": "0.27.0.dev0",
  "_name_or_path": "/content/drive/MyDrive/SDTest",
  "feature_extractor": [ "transformers", "CLIPImageProcessor" ],
  "image_encoder": [ null, null ],
  "requires_safety_checker": true,
  "safety_checker": [ "stable_diffusion", "StableDiffusionSafetyChecker" ],
  "scheduler": [ "diffusers", "PNDMScheduler" ],
  "text_encoder": [ "transformers", "CLIPTextModel" ],
  "tokenizer": [ "transformers", "CLIPTokenizer" ],
  "unet": [ "diffusers", "UNet2DConditionModel" ],
  "vae": [ "diffusers", "AutoencoderKL" ]
}

Reproduction

%%shell
export MODEL_NAME='path/to/locally/saved model'
export TRAIN_DIR='/content/Dataset'

accelerate launch --mixed_precision="fp16" train_text_to_image.py \
  --pretrained_model_name_or_path="$MODEL_NAME" \
  --dataset_name="$TRAIN_DIR" \
  --use_ema \
  --resolution=512 --random_flip \
  --resume_from_checkpoint='latest' \
  --train_batch_size=8 \
  --caption_column='text' \
  --max_train_steps=15000 \
  --learning_rate=1e-05 \
  --max_grad_norm=1 \
  --lr_scheduler='constant' --lr_warmup_steps=0 \
  --report_to='wandb' \
  --output_dir="$MODEL_NAME"

Logs

Loading pipeline components...:  57% 4/7 [00:19<00:14,  5.00s/it]
Traceback (most recent call last):
  File "/content/diffusers/examples/text_to_image/train_text_to_image.py", line 1123, in <module>
    main()
  File "/content/diffusers/examples/text_to_image/train_text_to_image.py", line 1079, in main
    pipeline = StableDiffusionPipeline.from_pretrained(
  File "/usr/local/lib/python3.10/dist-packages/huggingface_hub/utils/_validators.py", line 118, in _inner_fn
    return fn(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/diffusers/pipelines/pipeline_utils.py", line 1265, in from_pretrained
    loaded_sub_model = load_sub_model(
  File "/usr/local/lib/python3.10/dist-packages/diffusers/pipelines/pipeline_utils.py", line 533, in load_sub_model
    loaded_sub_model = load_method(os.path.join(cached_folder, name), **loading_kwargs)
  File "/usr/local/lib/python3.10/dist-packages/transformers/image_processing_utils.py", line 207, in from_pretrained
    return cls.from_dict(image_processor_dict, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/transformers/image_processing_utils.py", line 413, in from_dict
    image_processor = cls(**image_processor_dict)
  File "/usr/local/lib/python3.10/dist-packages/transformers/models/clip/image_processing_clip.py", line 127, in __init__
    self.size = {"height": size["shortest_edge"], "width": size["shortest_edge"]}
KeyError: 'shortest_edge'
Steps: 26000it [00:22, ?it/s]
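
The traceback shows the CLIP image processor reading size["shortest_edge"] from the saved configuration, so one way to narrow this down is to look at the size entry stored with the feature extractor. The following is a minimal diagnostic sketch, assuming the pipeline was saved in the usual layout with the image processor config at feature_extractor/preprocessor_config.json. The helper names check_size_entry and load_and_check are hypothetical and are not part of diffusers or transformers.

```python
import json
from pathlib import Path


def check_size_entry(config: dict) -> str:
    """Classify the `size` entry of an image-processor config dict.

    Hypothetical helper for diagnosis only; it does not modify anything.
    """
    size = config.get("size")
    if size is None:
        return "missing"
    if isinstance(size, int):
        return "int"
    if isinstance(size, dict) and "shortest_edge" in size:
        return "shortest_edge"
    if isinstance(size, dict) and {"height", "width"} <= size.keys():
        return "height_width"
    return "unknown"


def load_and_check(model_dir: str) -> str:
    """Read the saved feature extractor config and classify its `size`.

    Assumes the config lives at
    <model_dir>/feature_extractor/preprocessor_config.json.
    """
    path = Path(model_dir) / "feature_extractor" / "preprocessor_config.json"
    with path.open() as f:
        return check_size_entry(json.load(f))
```

If load_and_check reports anything other than "shortest_edge" for the local model, the saved config stores size in a form that lacks the key indexed in the last traceback frame, which would be consistent with the KeyError above.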

System Info

diffusers version: 0.27.0.dev0
Platform: Linux-6.1.58+-x86_64-with-glibc2.35
Python version: 3.10.12
PyTorch version (GPU?): 2.1.0+cu121 (True)
Huggingface_hub version: 0.20.3
Transformers version: 4.38.1
Accelerate version: 0.27.2
xFormers version: not installed
Using GPU in script?: No
Using distributed or parallel set-up in script?: No

Who can help?

@sayak

Mar 03 '24 21:03 suggestiondiabolique

Does this happen when you use the latest version of the script? I am unable to reproduce the problem. Make sure you're using the latest stable versions of the accelerate, diffusers, and transformers libraries.

It also looks like this issue belongs in the transformers repository, since the KeyError is raised inside the transformers CLIP image processor.
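
To confirm which versions are actually installed in the environment that runs the script, something like the sketch below can help. It uses only the standard library; the helper name installed_versions is hypothetical, and the package list simply mirrors the libraries named above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Return {package: version string, or None if it is not installed}."""
    result = {}
    for name in packages:
        try:
            result[name] = version(name)
        except PackageNotFoundError:
            result[name] = None
    return result


if __name__ == "__main__":
    for name, ver in installed_versions(
        ["accelerate", "diffusers", "transformers"]
    ).items():
        print(f"{name}: {ver or 'not installed'}")
```

Comparing this output with the versions listed under System Info shows whether the notebook runtime picked up the intended releases.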

Mar 04 '24 03:03 sayakpaul

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

Apr 03 '24 15:04 github-actions[bot]