What value should be used for "num_class_images" when training the Stable Diffusion model?

Open sivaramakrishnan-rajaraman opened this issue 1 year ago • 0 comments

I read in the Hugging Face docs that the parameter "num_class_images" is the minimum number of class images for the prior preservation loss. If with_prior_preservation is set and class_data_dir does not already contain that many images, additional images are sampled with class_prompt.
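To spell out how I read that sentence, here is a minimal sketch of the top-up arithmetic. The function name `images_to_generate` is my own and is not taken from the training script; it only encodes the documented rule that images are added when the folder holds fewer than the minimum.

```python
def images_to_generate(existing_class_images: int, num_class_images: int) -> int:
    """Extra class images to sample so the folder reaches the documented minimum.

    Hypothetical helper: it mirrors the docs' description, not the script's code.
    """
    # Nothing is generated once the folder already holds enough images.
    return max(num_class_images - existing_class_images, 0)


if __name__ == "__main__":
    # A folder with fewer images than the minimum gets topped up.
    print(images_to_generate(existing_class_images=50, num_class_images=200))
    # A folder that already exceeds the minimum gets nothing added.
    print(images_to_generate(existing_class_images=900, num_class_images=200))
```

Under this reading, my folder of 900 class images would trigger no extra sampling at all, which is why the remaining question is about how many of the existing images are used.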

export MODEL_NAME="CompVis/stable-diffusion-v2-1"
export INSTANCE_DIR="path_to_train"
export CLASS_DIR="path_to_class"
export OUTPUT_DIR="path_to_model"

accelerate launch train_dreambooth.py \
  --pretrained_model_name_or_path=$MODEL_NAME  \
  --instance_data_dir=$INSTANCE_DIR \
  --class_data_dir=$CLASS_DIR \
  --output_dir=$OUTPUT_DIR \
  --with_prior_preservation --prior_loss_weight=0.8 \
  --instance_prompt="a photo of sks cat" \
  --class_prompt="a photo of cat" \
  --resolution=512 \
  --train_batch_size=2 \
  --gradient_accumulation_steps=1 \
  --learning_rate=2e-6 \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --num_class_images=200 \
  --max_train_steps=800

By default, I see that the "num_class_images" parameter is set to 200. In my case, I enabled with_prior_preservation. I have 300 images in the instance directory and 900 images in the class directory. My question is: if I keep "num_class_images" at 200, will the script use only 200 of the images in my class directory? I would like the model to learn the general characteristics from all 900 images in the class directory. In that case, what value should I use for the "num_class_images" parameter?
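To make the question concrete, here are the two behaviours I can imagine, written as a small sketch. Both branches are my assumptions about what the script might do; the function `class_images_used` and its `capped` flag are hypothetical and do not come from train_dreambooth.py.

```python
def class_images_used(available: int, num_class_images: int, capped: bool) -> int:
    """How many class images training would draw from, under two hypotheses.

    capped=True:  the value also acts as an upper limit on the images used.
    capped=False: the value is only a minimum, so every available image is used.
    """
    if capped:
        return min(available, num_class_images)
    return available


if __name__ == "__main__":
    available = 900  # images already in my class directory
    for capped in (True, False):
        used = class_images_used(available, num_class_images=200, capped=capped)
        print(f"capped={capped}: {used} class images used")
```

If the first hypothesis is the right one, I assume passing a value equal to the folder size (900 in my case) would be the way to use every class image, but I would like confirmation.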

Apr 23 '24 13:04 sivaramakrishnan-rajaraman