ValueError: --image_column' value 'image' needs to be one of: text

Open SoumyaMB10 opened this issue 1 year ago • 13 comments

Describe the bug

env: MODEL_NAME=runwayml/stable-diffusion-v1-5
env: INSTANCE_DIR=/content/drive/MyDrive/Newfolder
env: HF_ENDPOINT=https://hf-mirror.com/
2024-08-18 08:46:08.308678: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:485] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-08-18 08:46:08.328601: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:8454] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-08-18 08:46:08.334721: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1452] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-08-18 08:46:09.559880: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
08/18/2024 08:46:10 - INFO - __main__ - Distributed environment: NO
Num processes: 1
Process index: 0
Local process index: 0
Device: cuda

Mixed precision type: no

{'dynamic_thresholding_ratio', 'variance_type', 'clip_sample_range', 'sample_max_value', 'thresholding', 'timestep_spacing', 'prediction_type', 'rescale_betas_zero_snr'} was not found in config. Values will be initialized to default values.
{'scaling_factor', 'latents_mean', 'use_quant_conv', 'latents_std', 'shift_factor', 'force_upcast', 'mid_block_add_attention', 'use_post_quant_conv'} was not found in config. Values will be initialized to default values.
{'transformer_layers_per_block', 'mid_block_type', 'addition_time_embed_dim', 'encoder_hid_dim_type', 'time_cond_proj_dim', 'dual_cross_attention', 'projection_class_embeddings_input_dim', 'num_attention_heads', 'reverse_transformer_layers_per_block', 'time_embedding_act_fn', 'mid_block_only_cross_attention', 'addition_embed_type', 'use_linear_projection', 'num_class_embeds', 'encoder_hid_dim', 'only_cross_attention', 'resnet_time_scale_shift', 'time_embedding_dim', 'cross_attention_norm', 'time_embedding_type', 'addition_embed_type_num_heads', 'conv_out_kernel', 'attention_type', 'dropout', 'class_embeddings_concat', 'timestep_post_act', 'class_embed_type', 'conv_in_kernel', 'upcast_attention', 'resnet_skip_time_act', 'resnet_out_scale_factor'} was not found in config. Values will be initialized to default values.
Resolving data files: 100% 18/18 [00:00<00:00, 150094.38it/s]
Generating train split: 9 examples [00:00, 377.47 examples/s]
Traceback (most recent call last):
  File "/content/train_text_to_image_lora.py", line 979, in <module>
    main()
  File "/content/train_text_to_image_lora.py", line 625, in main
    raise ValueError(
ValueError: --image_column' value 'image' needs to be one of: text
Traceback (most recent call last):
  File "/usr/local/bin/accelerate", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.10/dist-packages/accelerate/commands/accelerate_cli.py", line 48, in main
    args.func(args)
  File "/usr/local/lib/python3.10/dist-packages/accelerate/commands/launch.py", line 1097, in launch_command
    simple_launcher(args)
  File "/usr/local/lib/python3.10/dist-packages/accelerate/commands/launch.py", line 703, in simple_launcher
    raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['/usr/bin/python3', 'train_text_to_image_lora.py', '--pretrained_model_name_or_path=runwayml/stable-diffusion-v1-5', '--dataset_name=/content/drive/MyDrive/Newfolder', '--resolution=512', '--train_batch_size=1', '--gradient_accumulation_steps=1', '--checkpointing_steps=100', '--learning_rate=1e-4', '--report_to=wandb', '--lr_scheduler=constant', '--lr_warmup_steps=0', '--max_train_steps=500', '--validation_prompt=forward trajectory', '--validation_epochs=50', '--seed=0', '--push_to_hub']' returned non-zero exit status 1.

I also had a dependency issue and think this error may be related to it.

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
cudf-cu12 24.4.1 requires pyarrow<15.0.0a0,>=14.0.1, but you have pyarrow 17.0.0 which is incompatible.
ibis-framework 8.0.0 requires pyarrow<16,>=2, but you have pyarrow 17.0.0 which is incompatible.
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
datasets 2.21.0 requires pyarrow>=15.0.0, but you have pyarrow 14.0.1 which is incompatible.

pyarrow has conflicting version requirements from cudf-cu12 24.4.1, ibis-framework 8.0.0, and datasets 2.21.0.
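
To see why no single pyarrow release can satisfy everything pip lists above, the ranges can be checked by hand. The sketch below hard-codes the constraints quoted in the pip output and compares simplified (major, minor, patch) tuples. It ignores pre-release markers such as a0, and the names CONSTRAINTS and satisfied_by are hypothetical, chosen only for this illustration.

```python
# Version ranges quoted in the pip output, as (lower_inclusive, upper_exclusive).
# None means "no bound". The a0 pre-release suffix on cudf's upper bound is dropped.
CONSTRAINTS = {
    "cudf-cu12 24.4.1": ((14, 0, 1), (15, 0, 0)),
    "ibis-framework 8.0.0": ((2, 0, 0), (16, 0, 0)),
    "datasets 2.21.0": ((15, 0, 0), None),
}


def satisfied_by(version):
    """Return the packages whose pyarrow range contains the given version tuple."""
    ok = []
    for package, (lower, upper) in CONSTRAINTS.items():
        if version >= lower and (upper is None or version < upper):
            ok.append(package)
    return ok


# Try the versions that appear in the pip output, plus the boundary release.
for candidate in [(14, 0, 1), (15, 0, 0), (17, 0, 0)]:
    print(candidate, "->", satisfied_by(candidate))
```

The cudf-cu12 and datasets ranges do not overlap, so whichever pyarrow version is installed, pip will report a conflict for one of the two in this environment.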

Reproduction

!pip install git+https://github.com/huggingface/diffusers
#!pip install accelerate
!pip install -r https://raw.githubusercontent.com/huggingface/diffusers/main/examples/text_to_image/requirements.txt
!pip install pyarrow==14.0.1

!accelerate config default

%env MODEL_NAME=runwayml/stable-diffusion-v1-5
%env INSTANCE_DIR=/content/drive/MyDrive/Newfolder
%env HF_ENDPOINT=https://hf-mirror.com

!accelerate launch train_text_to_image_lora.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --dataset_name=$INSTANCE_DIR \
  --resolution=512 \
  --train_batch_size=1 \
  --gradient_accumulation_steps=1 \
  --checkpointing_steps=100 \
  --learning_rate=1e-4 \
  --report_to="wandb" \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --max_train_steps=500 \
  --validation_prompt="forward trajectory" \
  --validation_epochs=50 \
  --seed="0" \
  --push_to_hub

Logs

No response

System Info

  • 🤗 Diffusers version: 0.31.0.dev0
  • Platform: Linux-6.1.85+-x86_64-with-glibc2.35
  • Running on Google Colab?: Yes
  • Python version: 3.10.12
  • PyTorch version (GPU?): 2.3.1+cu121 (True)
  • Flax version (CPU?/GPU?/TPU?): 0.8.4 (gpu)
  • Jax version: 0.4.26
  • JaxLib version: 0.4.26
  • Huggingface_hub version: 0.23.5
  • Transformers version: 4.42.4
  • Accelerate version: 0.32.1
  • PEFT version: 0.7.0
  • Bitsandbytes version: not installed
  • Safetensors version: 0.4.4
  • xFormers version: not installed
  • Accelerator: Tesla T4, 15360 MiB
  • Using GPU in script?:
  • Using distributed or parallel set-up in script?:

Who can help?

@sayakpaul

To add: my dataset folder contains image.png and image.txt files.

Screenshot 2024-08-18 095610

SoumyaMB10 avatar Aug 18 '24 08:08 SoumyaMB10

Can you provide a more minimal reproducible snippet?

sayakpaul avatar Aug 18 '24 10:08 sayakpaul

%env MODEL_NAME=runwayml/stable-diffusion-v1-5
%env INSTANCE_DIR=/content/drive/MyDrive/Newfolder
%env HF_ENDPOINT=https://hf-mirror.com/

!accelerate launch train_text_to_image_lora.py --pretrained_model_name_or_path=$MODEL_NAME --dataset_name=$INSTANCE_DIR --resolution=512 --train_batch_size=1 --gradient_accumulation_steps=1 --checkpointing_steps=100 --learning_rate=1e-4 --report_to="wandb" --lr_scheduler="constant" --lr_warmup_steps=0 --max_train_steps=500 --validation_prompt="forward trajectory" --validation_epochs=50 --seed="0" --push_to_hub

SoumyaMB10 avatar Aug 18 '24 10:08 SoumyaMB10

The error occurred while running the snippet above.
Error: ValueError: --image_column' value 'image' needs to be one of: text

SoumyaMB10 avatar Aug 18 '24 10:08 SoumyaMB10

The above is a script, not a snippet.

sayakpaul avatar Aug 18 '24 10:08 sayakpaul

%env MODEL_NAME=runwayml/stable-diffusion-v1-5
%env INSTANCE_DIR=/content/drive/MyDrive/Newfolder
%env HF_ENDPOINT=https://hf-mirror.com

!accelerate launch train_text_to_image_lora.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --dataset_name=$INSTANCE_DIR \
  --push_to_hub

Error:
Traceback (most recent call last):
  File "/content/train_text_to_image_lora.py", line 979, in <module>
    main()
  File "/content/train_text_to_image_lora.py", line 625, in main
    raise ValueError(
ValueError: --image_column' value 'image' needs to be one of: text

SoumyaMB10 avatar Aug 18 '24 11:08 SoumyaMB10

@sayakpaul I have added the minimal snippet required to reproduce the error. Please advise.

SoumyaMB10 avatar Aug 19 '24 09:08 SoumyaMB10

That is not a minimal code snippet; that is a training command. By a minimal code snippet I meant something like the following:

from datasets import load_dataset 

dataset = load_dataset(my_directory, metadata="...")

sayakpaul avatar Aug 19 '24 10:08 sayakpaul

  1. I did not use load_dataset; I passed the dataset directly in the training command.
  2. If I run the minimal code snippet above, here is the output: Screenshot 2024-08-19 121602

SoumyaMB10 avatar Aug 19 '24 11:08 SoumyaMB10

Any comments on this, @sayakpaul?

SoumyaMB10 avatar Aug 21 '24 08:08 SoumyaMB10

Maybe your dataset has columns named "image" and "label", and it does not read the metadata you provided. I saw this in https://github.com/huggingface/diffusers/issues/6445
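
For context, the error appears to come from a check that compares the requested column name against the columns the loaded dataset actually exposes. The sketch below is a simplified stand-in for that kind of check, not the training script's actual code. resolve_image_column is a hypothetical helper, the fallback to the first column is an assumption, and the column list ["text"] is taken from the error message in this issue.

```python
def resolve_image_column(requested, column_names):
    """Return the image column to use, or raise if the requested one is missing.

    Hypothetical helper: a simplified stand-in for the validation a training
    script might perform, not a copy of the real script.
    """
    if requested is None:
        # Assumption for this sketch: fall back to the first available column.
        return column_names[0]
    if requested not in column_names:
        raise ValueError(
            f"image column {requested!r} needs to be one of: {', '.join(column_names)}"
        )
    return requested


# The dataset in this issue exposed only a "text" column, so asking for "image" fails.
try:
    resolve_image_column("image", ["text"])
except ValueError as err:
    message = str(err)
    print(message)

# With both columns present, the same request succeeds.
print(resolve_image_column("image", ["image", "text"]))
```

Read this way, the message says the loaded dataset has no image column at all, which points at how the folder was parsed rather than at the training flags.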

mj-x avatar Aug 29 '24 02:08 mj-x

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

github-actions[bot] avatar Sep 22 '24 15:09 github-actions[bot]

I have received your message. Thank you very much for writing; I will reply as soon as possible. This is an automatic reply, confirming that your e-mail was received. Thank you.

mj-x avatar Sep 22 '24 15:09 mj-x

I am still facing this error with the correct column names ('image', 'text') and datasets 2.4.0. Can anyone suggest how to solve it?

alfakat avatar Nov 27 '24 08:11 alfakat

What should the format of our dataset be when loading the data from local files?

jsuj1th avatar Apr 17 '25 17:04 jsuj1th

What should the format of our dataset be when loading the data from local files?

https://huggingface.co/docs/datasets/en/image_load#local-files -- would this guide be helpful?
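
As a rough sketch of the layout that guide describes, an image folder can carry a metadata.jsonl file with one JSON object per line, where file_name points at the image and the other keys become extra columns. The snippet below builds such a file from image/caption pairs like the image.png and image.txt files mentioned earlier in this thread. The helper name write_metadata, the "text" column name, and the assumption that each caption file shares its image's base name are illustrative choices, not something this thread confirms.

```python
import json
import tempfile
from pathlib import Path

# Assumption: these are the image extensions present in the folder.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def write_metadata(folder):
    """Pair each image with a same-named .txt caption and write metadata.jsonl."""
    folder = Path(folder)
    rows = []
    for image_path in sorted(folder.iterdir()):
        if image_path.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        caption_path = image_path.with_suffix(".txt")
        if not caption_path.exists():
            continue  # skip images that have no caption file
        caption = caption_path.read_text(encoding="utf-8").strip()
        rows.append({"file_name": image_path.name, "text": caption})
    with open(folder / "metadata.jsonl", "w", encoding="utf-8") as handle:
        for row in rows:
            handle.write(json.dumps(row, ensure_ascii=False) + "\n")
    return rows


# Demo on a throwaway folder with one placeholder image/caption pair.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "image.png").write_bytes(b"")  # placeholder, not a real PNG
    (Path(tmp) / "image.txt").write_text("forward trajectory", encoding="utf-8")
    demo_rows = write_metadata(tmp)
    demo_lines = (Path(tmp) / "metadata.jsonl").read_text(encoding="utf-8").splitlines()
    print(demo_lines)
```

With such a file in place, the folder can presumably be loaded through the image-folder loader described in the linked guide so that both an image column and a text column appear. Whether the loose .txt caption files then need to be moved out of the folder is worth verifying against that guide.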

sayakpaul avatar Apr 18 '25 03:04 sayakpaul