prepare tokenizer
update token length: 225
Use DreamBooth method.
prepare images.
found directory train\nijika\6_nijika contains 42 image files
252 train images with repeating.
0 reg images.
no regularization images / 正則化画像が見つかりませんでした
[Dataset 0]
batch_size: 1
resolution: (1024, 1024)
enable_bucket: True
min_bucket_reso: 256
max_bucket_reso: 1024
bucket_reso_steps: 64
bucket_no_upscale: False
[Subset 0 of Dataset 0]
image_dir: "train\nijika\6_nijika"
image_count: 42
num_repeats: 6
shuffle_caption: True
keep_tokens: 0
caption_dropout_rate: 0.0
caption_dropout_every_n_epoches: 0
caption_tag_dropout_rate: 0.0
color_aug: False
flip_aug: False
face_crop_aug_range: None
random_crop: False
is_reg: False
class_tokens: nijika
caption_extension: .txt
[Dataset 0]
loading image sizes.
100%|█████████████████████████████████████████████████████████████████████████████████| 21/21 [00:00<00:00, 419.76it/s]
make buckets
number of images (including repeats) / 各bucketの画像枚数(繰り返し回数を含む)
bucket 0: resolution (1024, 1024), count: 126
mean ar error (without repeats): 0.0
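The single (1024, 1024) bucket and the mean aspect-ratio error of 0.0 indicate that every image matched the training resolution exactly. For reference, the sketch below shows how aspect-ratio bucketing of this kind can enumerate candidate resolutions from min_bucket_reso, max_bucket_reso, and bucket_reso_steps; it is an illustrative approximation, not the actual sd-scripts implementation, and the function names are invented for the example.

```python
# Illustrative sketch of aspect-ratio bucketing with the settings from the log above
# (min_bucket_reso=256, max_bucket_reso=1024, bucket_reso_steps=64).
# Not the sd-scripts code: it just enumerates width/height pairs in 64-px steps whose
# area stays within the training resolution, then assigns images by closest aspect ratio.

def make_buckets(max_reso=(1024, 1024), min_size=256, max_size=1024, step=64):
    max_area = max_reso[0] * max_reso[1]
    buckets = set()
    w = min_size
    while w <= max_size:
        # largest height (multiple of `step`) that keeps the area within budget
        h = min(max_size, (max_area // w) // step * step)
        if h >= min_size:
            buckets.add((w, h))
            buckets.add((h, w))
        w += step
    return sorted(buckets)

def assign_bucket(image_size, buckets):
    ar = image_size[0] / image_size[1]
    return min(buckets, key=lambda b: abs(b[0] / b[1] - ar))

if __name__ == "__main__":
    buckets = make_buckets()
    # the log shows all images landing in a single bucket with zero aspect-ratio error
    print(assign_bucket((1024, 1024), buckets))   # -> (1024, 1024)
```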
prepare accelerator
Using accelerator 0.15.0 or above.
load StableDiffusion checkpoint
loading u-net: <All keys matched successfully>
loading vae: <All keys matched successfully>
loading text encoder: <All keys matched successfully>
Replace CrossAttention.forward to use xformers
[Dataset 0]
caching latents.
100%|██████████████████████████████████████████████████████████████████████████████████| 21/21 [00:36<00:00, 1.73s/it]
import network module: networks.lora
create LoRA network. base dim (rank): 32, alpha: 32.0
create LoRA for Text Encoder: 72 modules.
create LoRA for U-Net: 192 modules.
enable LoRA for text encoder
enable LoRA for U-Net
prepare optimizer, data loader etc.
use AdamW optimizer | {}
override steps. steps for 10 epochs is / 指定エポックまでのステップ数: 1260
running training / 学習開始
num train images * repeats / 学習画像の数×繰り返し回数: 252
num reg images / 正則化画像の数: 0
num batches per epoch / 1epochのバッチ数: 126
num epochs / epoch数: 10
batch size per device / バッチサイズ: 1
gradient accumulation steps / 勾配を合計するステップ数 = 1
total optimization steps / 学習ステップ数: 1260
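The step total follows directly from the batch count in this summary; a quick arithmetic check (variable names are illustrative only):

```python
# Sanity check of the step total printed above (illustrative; not part of sd-scripts).
batches_per_epoch = 126   # "num batches per epoch" (bucket 0 count 126 / batch size 1)
epochs = 10               # --max_train_epochs=10
grad_accum = 1            # "gradient accumulation steps = 1"
total_steps = batches_per_epoch * epochs // grad_accum
print(total_steps)        # 1260, matching "total optimization steps" and "override steps"
```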
steps: 0%| | 0/1260 [00:00<?, ?it/s]epoch 1/10
Traceback (most recent call last):
File "K:\lora-scripts\sd-scripts\train_network.py", line 699, in
train(args)
File "K:\lora-scripts\sd-scripts\train_network.py", line 538, in train
noise_pred = unet(noisy_latents, timesteps, encoder_hidden_states).sample
File "K:\lora-scripts\venv\lib\site-packages\torch\nn\modules\module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "K:\lora-scripts\venv\lib\site-packages\accelerate\utils\operations.py", line 490, in call
return convert_to_fp32(self.model_forward(*args, **kwargs))
File "K:\lora-scripts\venv\lib\site-packages\torch\amp\autocast_mode.py", line 12, in decorate_autocast
return func(*args, **kwargs)
File "K:\lora-scripts\venv\lib\site-packages\diffusers\models\unet_2d_condition.py", line 381, in forward
sample, res_samples = downsample_block(
File "K:\lora-scripts\venv\lib\site-packages\torch\nn\modules\module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "K:\lora-scripts\venv\lib\site-packages\diffusers\models\unet_2d_blocks.py", line 612, in forward
hidden_states = attn(hidden_states, encoder_hidden_states=encoder_hidden_states).sample
File "K:\lora-scripts\venv\lib\site-packages\torch\nn\modules\module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "K:\lora-scripts\venv\lib\site-packages\diffusers\models\attention.py", line 216, in forward
hidden_states = block(hidden_states, context=encoder_hidden_states, timestep=timestep)
File "K:\lora-scripts\venv\lib\site-packages\torch\nn\modules\module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "K:\lora-scripts\venv\lib\site-packages\diffusers\models\attention.py", line 484, in forward
hidden_states = self.attn1(norm_hidden_states) + hidden_states
File "K:\lora-scripts\venv\lib\site-packages\torch\nn\modules\module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "K:\lora-scripts\sd-scripts\library\train_util.py", line 1700, in forward_xformers
out = xformers.ops.memory_efficient_attention(q, k, v, attn_bias=None) # 最適なのを選んでくれる
File "K:\lora-scripts\venv\lib\site-packages\xformers\ops.py", line 865, in memory_efficient_attention
return op.apply(query, key, value, attn_bias, p).reshape(output_shape)
File "K:\lora-scripts\venv\lib\site-packages\xformers\ops.py", line 319, in forward
out, lse = cls.FORWARD_OPERATOR(
File "K:\lora-scripts\venv\lib\site-packages\torch_ops.py", line 143, in call
return self._op(*args, **kwargs or {})
RuntimeError: CUDA error: no kernel image is available for execution on the device
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
steps: 0%| | 0/1260 [01:11<?, ?it/s]
Traceback (most recent call last):
File "C:\Users\Administrator.WIN-7JDG5CD1HHH\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "C:\Users\Administrator.WIN-7JDG5CD1HHH\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 86, in run_code
exec(code, run_globals)
File "K:\lora-scripts\venv\Scripts\accelerate.exe_main.py", line 7, in
File "K:\lora-scripts\venv\lib\site-packages\accelerate\commands\accelerate_cli.py", line 45, in main
args.func(args)
File "K:\lora-scripts\venv\lib\site-packages\accelerate\commands\launch.py", line 1104, in launch_command
simple_launcher(args)
File "K:\lora-scripts\venv\lib\site-packages\accelerate\commands\launch.py", line 567, in simple_launcher
raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['K:\lora-scripts\venv\Scripts\python.exe', './sd-scripts/train_network.py', '--enable_bucket', '--pretrained_model_name_or_path=./sd-models/model.ckpt', '--train_data_dir=./train/nijika', '--output_dir=./output', '--logging_dir=./logs', '--resolution=1024,1024', '--network_module=networks.lora', '--max_train_epochs=10', '--learning_rate=1e-4', '--unet_lr=1e-4', '--text_encoder_lr=1e-5', '--lr_scheduler=cosine_with_restarts', '--lr_warmup_steps=0', '--lr_scheduler_num_cycles=1', '--network_dim=32', '--network_alpha=32', '--output_name=aki', '--train_batch_size=1', '--save_every_n_epochs=2', '--mixed_precision=fp16', '--save_precision=fp16', '--seed=1337', '--cache_latents', '--clip_skip=2', '--prior_loss_weight=1', '--max_token_length=225', '--caption_extension=.txt', '--save_model_as=ckpt', '--min_bucket_reso=256', '--max_bucket_reso=1024', '--xformers', '--shuffle_caption']' returned non-zero exit status 1.
Train finished
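The run aborts inside xformers.ops.memory_efficient_attention with "RuntimeError: CUDA error: no kernel image is available for execution on the device", which usually means the installed PyTorch and/or xformers binaries were not compiled for this GPU's compute architecture. Below is a minimal diagnostic sketch, assuming a standard PyTorch install inside K:\lora-scripts\venv; the torch calls used are standard API, but the example output values are only placeholders.

```python
# Checks whether the PyTorch build in this venv ships CUDA kernels for the local GPU.
# All calls below are standard torch.cuda API; run it with the venv's python.exe.
import torch

print("torch version :", torch.__version__)
print("CUDA build    :", torch.version.cuda)                   # toolkit the wheel was built with
print("GPU           :", torch.cuda.get_device_name(0))
print("capability    :", torch.cuda.get_device_capability(0))  # e.g. (8, 6) for sm_86
print("kernels built for:", torch.cuda.get_arch_list())        # e.g. ['sm_70', 'sm_75', 'sm_80', 'sm_86']
```

If the GPU's sm_XX architecture is missing from get_arch_list(), reinstall a PyTorch wheel built for that GPU and CUDA driver, then install an xformers build that matches the same torch/CUDA combination; alternatively, rerun without the --xformers flag to confirm the rest of the training pipeline works before reintroducing memory-efficient attention.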