Mixed precision type: no
2024-04-12 15:04:25,859 - modelscope - INFO - Use user-specified model revision: v2.0
{'variance_type', 'sample_max_value', 'clip_sample_range', 'dynamic_thresholding_ratio', 'rescale_betas_zero_snr', 'thresholding'} was not found in config. Values will be initialized to default values.
/opt/conda/lib/python3.10/site-packages/torch/_utils.py:831: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
return self.fget.__get__(instance, owner)()
{'force_upcast'} was not found in config. Values will be initialized to default values.
{'attention_type', 'dropout', 'reverse_transformer_layers_per_block'} was not found in config. Values will be initialized to default values.
Traceback (most recent call last):
File "/mnt/workspace/facechain/facechain/train_text_to_image_lora.py", line 1224, in <module>
main()
File "/mnt/workspace/facechain/facechain/train_text_to_image_lora.py", line 789, in main
dataset = load_dataset("imagefolder", data_dir=args.dataset_name)
File "/opt/conda/lib/python3.10/site-packages/datasets/load.py", line 2523, in load_dataset
builder_instance = load_dataset_builder(
File "/opt/conda/lib/python3.10/site-packages/datasets/load.py", line 2195, in load_dataset_builder
dataset_module = dataset_module_factory(
File "/opt/conda/lib/python3.10/site-packages/datasets/load.py", line 1736, in dataset_module_factory
).get_module()
File "/opt/conda/lib/python3.10/site-packages/datasets/load.py", line 1119, in get_module
patterns = sanitize_patterns(self.data_files) if self.data_files is not None else get_data_patterns(base_path)
File "/opt/conda/lib/python3.10/site-packages/datasets/data_files.py", line 475, in get_data_patterns
raise EmptyDatasetError(f"The directory at {base_path} doesn't contain any data files") from None
datasets.data_files.EmptyDatasetError: The directory at /mnt/workspace/facechain/worker_data/qw/training_data/ly261666/cv_portrait_model/person-1_labeled doesn't contain any data files
Traceback (most recent call last):
File "/opt/conda/bin/accelerate", line 8, in <module>
sys.exit(main())
File "/opt/conda/lib/python3.10/site-packages/accelerate/commands/accelerate_cli.py", line 47, in main
args.func(args)
File "/opt/conda/lib/python3.10/site-packages/accelerate/commands/launch.py", line 1023, in launch_command
simple_launcher(args)
File "/opt/conda/lib/python3.10/site-packages/accelerate/commands/launch.py", line 643, in simple_launcher
raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['/opt/conda/bin/python', '/mnt/workspace/facechain/facechain/train_text_to_image_lora.py', '--pretrained_model_name_or_path=ly261666/cv_portrait_model', '--revision=v2.0', '--sub_path=film/film', '--output_dataset_name=/mnt/workspace/facechain/worker_data/qw/training_data/ly261666/cv_portrait_model/person-1', '--caption_column=text', '--resolution=512', '--random_flip', '--train_batch_size=1', '--num_train_epochs=200', '--checkpointing_steps=5000', '--learning_rate=1.5e-04', '--lr_scheduler=cosine', '--lr_warmup_steps=0', '--seed=42', '--output_dir=/mnt/workspace/facechain/worker_data/qw/ly261666/cv_portrait_model/person-1', '--lora_r=4', '--lora_alpha=32', '--lora_text_encoder_r=32', '--lora_text_encoder_alpha=32', '--resume_from_checkpoint=fromfacecommon']' returned non-zero exit status 1.
Traceback (most recent call last):
File "/opt/conda/lib/python3.10/site-packages/gradio/queueing.py", line 407, in call_prediction
output = await route_utils.call_process_api(
File "/opt/conda/lib/python3.10/site-packages/gradio/route_utils.py", line 226, in call_process_api
output = await app.get_blocks().process_api(
File "/opt/conda/lib/python3.10/site-packages/gradio/blocks.py", line 1550, in process_api
result = await self.call_function(
File "/opt/conda/lib/python3.10/site-packages/gradio/blocks.py", line 1185, in call_function
prediction = await anyio.to_thread.run_sync(
File "/opt/conda/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
return await get_async_backend().run_sync_in_worker_thread(
File "/opt/conda/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2134, in run_sync_in_worker_thread
return await future
File "/opt/conda/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 851, in run
result = context.run(func, *args)
File "/opt/conda/lib/python3.10/site-packages/gradio/utils.py", line 661, in wrapper
response = f(*args, **kwargs)
File "/mnt/workspace/facechain/app.py", line 804, in run
train_lora_fn(base_model_path=base_model_path,
File "/mnt/workspace/facechain/app.py", line 207, in train_lora_fn
raise gr.Error("训练失败 (Training failed)")
gradio.exceptions.Error: '训练失败 (Training failed)'
I had the same problem. Did you solve it?
Please try the newest version, facechain-fact, which is train-free and runs inference in about 10 seconds.
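
For anyone who wants to keep using the training path: the first traceback shows `load_dataset("imagefolder", data_dir=args.dataset_name)` raising `EmptyDatasetError` because the `person-1_labeled` directory contained no data files. The log does not show why the directory is empty, so it may help to check it before launching training. Below is a minimal sketch of such a check. The helper names `find_image_files` and `check_training_dir` are hypothetical, and the extension set is an illustrative assumption, not the exact list the `datasets` library uses.

```python
from pathlib import Path

# Assumption: an illustrative subset of image extensions,
# not the exact list recognized by the `datasets` imagefolder builder.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".bmp", ".webp"}


def find_image_files(data_dir):
    """Return sorted image paths under data_dir (searched recursively).

    Returns an empty list if data_dir does not exist or is not a directory.
    """
    root = Path(data_dir)
    if not root.is_dir():
        return []
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    )


def check_training_dir(data_dir):
    """Raise a descriptive error when no images are found; otherwise return them."""
    files = find_image_files(data_dir)
    if not files:
        raise FileNotFoundError(
            f"No image files found in {data_dir}. "
            "Check that the uploaded photos were processed and written there."
        )
    return files
```

Calling `check_training_dir(...)` on the `person-1_labeled` path from the error message before training would surface the empty directory directly, rather than through the nested `accelerate` and Gradio tracebacks.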