Chengyue Jiang
OK, then I'll wait.
I tried different configurations but always failed to produce meaningful results. I'm wondering whether I made a mistake somewhere or the code just isn't ready yet.
I've tried it; it is supported. Just call peft's merge_and_unload directly.
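A minimal sketch of what I mean, assuming a LoRA adapter trained with peft on top of a Hugging Face base model (the model name and paths below are placeholders, not real checkpoints):

```python
# Sketch: merge LoRA weights back into the base model with peft.
# "base-model" and "path/to/lora-adapter" are placeholder paths.
from peft import PeftModel
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("base-model")
model = PeftModel.from_pretrained(base, "path/to/lora-adapter")

# merge_and_unload folds the LoRA deltas into the base weights and
# returns a plain transformers model with no peft wrappers left.
merged = model.merge_and_unload()
merged.save_pretrained("path/to/merged-model")
```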
RoPE is much easier to support with FlashAttention. Could you please provide code that supports FlashAttention training with ALiBi?
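For reference, recent flash-attn releases (2.4 and later) expose an alibi_slopes argument on flash_attn_func, so ALiBi biases can be applied inside the fused kernel. An untested sketch assuming that version is installed; the slope formula follows the ALiBi paper:

```python
# Sketch: FlashAttention with ALiBi, assuming flash-attn >= 2.4.
# flash_attn_func expects (batch, seqlen, nheads, headdim) tensors.
import torch
from flash_attn import flash_attn_func

def alibi_slopes(nheads: int) -> torch.Tensor:
    # Standard ALiBi slopes 2^(-8i/n) for head i (power-of-two head counts).
    start = 2 ** (-8.0 / nheads)
    return torch.tensor([start ** (i + 1) for i in range(nheads)],
                        dtype=torch.float32, device="cuda")

batch, seqlen, nheads, headdim = 2, 1024, 8, 64
q, k, v = (torch.randn(batch, seqlen, nheads, headdim,
                       dtype=torch.float16, device="cuda") for _ in range(3))

# The kernel adds the ALiBi bias internally, so no explicit bias matrix
# (and no O(seqlen^2) memory for it) is needed.
out = flash_attn_func(q, k, v, causal=True, alibi_slopes=alibi_slopes(nheads))
```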
If possible, please provide a PyTorch version of the vision model.
I also hit the hanging issue. For reference: on 4 GPUs, DiT-S runs fine, but DiT-XL hangs.
One more note: GPU utilization stays at 0 for long stretches. But after switching from DiT-XL/2 to DiT-XL/8 it no longer hangs, so it looks like GPU memory was blowing up.
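A quick way to check the memory theory, a hedged sketch using only plain torch calls; drop it into the training loop wherever convenient and watch whether the peak climbs toward the device limit before the hang:

```python
# Sketch: log per-GPU memory during training to confirm the
# "hang == running out of memory" hypothesis. No extra dependencies.
import torch

def log_cuda_memory(step: int) -> None:
    for i in range(torch.cuda.device_count()):
        alloc = torch.cuda.memory_allocated(i) / 2**30
        peak = torch.cuda.max_memory_allocated(i) / 2**30
        print(f"step {step} cuda:{i} "
              f"allocated={alloc:.2f}GiB peak={peak:.2f}GiB")
```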
Maybe you trained DiT-S/8 by default?
I ran DiT-S/8 with the default training script; after about two-thirds of an epoch, the sampled results were basically random.
The samples are fairly random, so not very meaningful, but that's understandable given I used the smallest model and trained for barely one epoch. I'll try a larger model and other hyperparameters and see.