Training fails in the ModelScope notebook

Open mashagua opened this issue 1 year ago • 1 comments

Mixed precision type: no

2024-04-12 15:04:25,859 - modelscope - INFO - Use user-specified model revision: v2.0
{'variance_type', 'sample_max_value', 'clip_sample_range', 'dynamic_thresholding_ratio', 'rescale_betas_zero_snr', 'thresholding'} was not found in config. Values will be initialized to default values.
/opt/conda/lib/python3.10/site-packages/torch/_utils.py:831: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
  return self.fget.__get__(instance, owner)()
{'force_upcast'} was not found in config. Values will be initialized to default values.
{'attention_type', 'dropout', 'reverse_transformer_layers_per_block'} was not found in config. Values will be initialized to default values.
Traceback (most recent call last):
  File "/mnt/workspace/facechain/facechain/train_text_to_image_lora.py", line 1224, in <module>
    main()
  File "/mnt/workspace/facechain/facechain/train_text_to_image_lora.py", line 789, in main
    dataset = load_dataset("imagefolder", data_dir=args.dataset_name)
  File "/opt/conda/lib/python3.10/site-packages/datasets/load.py", line 2523, in load_dataset
    builder_instance = load_dataset_builder(
  File "/opt/conda/lib/python3.10/site-packages/datasets/load.py", line 2195, in load_dataset_builder
    dataset_module = dataset_module_factory(
  File "/opt/conda/lib/python3.10/site-packages/datasets/load.py", line 1736, in dataset_module_factory
    ).get_module()
  File "/opt/conda/lib/python3.10/site-packages/datasets/load.py", line 1119, in get_module
    patterns = sanitize_patterns(self.data_files) if self.data_files is not None else get_data_patterns(base_path)
  File "/opt/conda/lib/python3.10/site-packages/datasets/data_files.py", line 475, in get_data_patterns
    raise EmptyDatasetError(f"The directory at {base_path} doesn't contain any data files") from None
datasets.data_files.EmptyDatasetError: The directory at /mnt/workspace/facechain/worker_data/qw/training_data/ly261666/cv_portrait_model/person-1_labeled doesn't contain any data files
Traceback (most recent call last):
  File "/opt/conda/bin/accelerate", line 8, in <module>
    sys.exit(main())
  File "/opt/conda/lib/python3.10/site-packages/accelerate/commands/accelerate_cli.py", line 47, in main
    args.func(args)
  File "/opt/conda/lib/python3.10/site-packages/accelerate/commands/launch.py", line 1023, in launch_command
    simple_launcher(args)
  File "/opt/conda/lib/python3.10/site-packages/accelerate/commands/launch.py", line 643, in simple_launcher
    raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['/opt/conda/bin/python', '/mnt/workspace/facechain/facechain/train_text_to_image_lora.py', '--pretrained_model_name_or_path=ly261666/cv_portrait_model', '--revision=v2.0', '--sub_path=film/film', '--output_dataset_name=/mnt/workspace/facechain/worker_data/qw/training_data/ly261666/cv_portrait_model/person-1', '--caption_column=text', '--resolution=512', '--random_flip', '--train_batch_size=1', '--num_train_epochs=200', '--checkpointing_steps=5000', '--learning_rate=1.5e-04', '--lr_scheduler=cosine', '--lr_warmup_steps=0', '--seed=42', '--output_dir=/mnt/workspace/facechain/worker_data/qw/ly261666/cv_portrait_model/person-1', '--lora_r=4', '--lora_alpha=32', '--lora_text_encoder_r=32', '--lora_text_encoder_alpha=32', '--resume_from_checkpoint=fromfacecommon']' returned non-zero exit status 1.
Traceback (most recent call last):
  File "/opt/conda/lib/python3.10/site-packages/gradio/queueing.py", line 407, in call_prediction
    output = await route_utils.call_process_api(
  File "/opt/conda/lib/python3.10/site-packages/gradio/route_utils.py", line 226, in call_process_api
    output = await app.get_blocks().process_api(
  File "/opt/conda/lib/python3.10/site-packages/gradio/blocks.py", line 1550, in process_api
    result = await self.call_function(
  File "/opt/conda/lib/python3.10/site-packages/gradio/blocks.py", line 1185, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "/opt/conda/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
    return await get_async_backend().run_sync_in_worker_thread(
  File "/opt/conda/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2134, in run_sync_in_worker_thread
    return await future
  File "/opt/conda/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 851, in run
    result = context.run(func, *args)
  File "/opt/conda/lib/python3.10/site-packages/gradio/utils.py", line 661, in wrapper
    response = f(*args, **kwargs)
  File "/mnt/workspace/facechain/app.py", line 804, in run
    train_lora_fn(base_model_path=base_model_path,
  File "/mnt/workspace/facechain/app.py", line 207, in train_lora_fn
    raise gr.Error("训练失败 (Training failed)")
gradio.exceptions.Error: '训练失败 (Training failed)'
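
For anyone reproducing this: the first traceback shows `load_dataset("imagefolder", data_dir=...)` raising `EmptyDatasetError` because the `person-1_labeled` directory has no data files, so it may help to inspect that directory before launching training. Below is a minimal sketch of such a check, assuming the directory is expected to hold image files. The helper name `list_image_files` and the extension set are illustrative assumptions, not part of facechain or `datasets`.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to whatever your data uses.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".bmp", ".webp"}


def list_image_files(data_dir):
    """Return sorted image files found under data_dir (hypothetical helper)."""
    root = Path(data_dir)
    if not root.is_dir():
        # A missing directory is reported the same way as an empty one.
        return []
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    )


if __name__ == "__main__":
    # Path copied from the EmptyDatasetError message in the log above.
    data_dir = (
        "/mnt/workspace/facechain/worker_data/qw/training_data/"
        "ly261666/cv_portrait_model/person-1_labeled"
    )
    files = list_image_files(data_dir)
    print(f"{len(files)} image file(s) found in {data_dir}")
```

If this reports zero files, the step that should populate the directory produced no output. The log does not show why, so that part would still need to be investigated separately.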

mashagua, Apr 12 '24 07:04

I had the same problem. Did you solve it?

Yancy10-1, May 20 '24 01:05

Please try out the newest version, facechain-fact, which is train-free with 10s inference.

sunbaigui, Jun 04 '24 09:06