Chengyue Jiang

Results: 18 comments by Chengyue Jiang

I tried different configurations but always failed to get meaningful output; I wonder whether I made a mistake or the code is not ready yet.

I've tried it, and it is supported: just call peft's merge_and_unload directly.
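For reference, `merge_and_unload` folds the low-rank update back into the base weights, roughly W' = W + (B @ A) * (alpha / r). A plain-Python sketch of that arithmetic (the matrix names W, A, B and the alpha/r scaling follow the usual LoRA convention; the toy sizes are hypothetical, not from the thread):

```python
def matmul(A, B):
    # naive matrix multiply for small lists-of-lists
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def merge_lora(W, A, B, alpha, r):
    # merge_and_unload-style fold: W' = W + (B @ A) * (alpha / r)
    scale = alpha / r
    delta = matmul(B, A)          # low-rank update, shape of W
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(W, delta)]

# toy 2x2 weight with rank-1 adapters
W = [[1.0, 0.0], [0.0, 1.0]]
A = [[1.0, 2.0]]                  # r x d_in  (r = 1)
B = [[1.0], [1.0]]                # d_out x r
merged = merge_lora(W, A, B, alpha=1.0, r=1)
```

After the fold, the adapter matrices can be dropped and the merged weight used for plain inference, which is exactly why no special export step is needed.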

RoPE is much easier to use with FlashAttention. Could you please provide code that supports FlashAttention training with ALiBi?
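The ALiBi side of such an integration is just a per-head slope vector (recent flash-attn releases expose an `alibi_slopes` argument for this, though treat that as an assumption about your installed version). A minimal sketch of the standard slope computation for a power-of-two head count:

```python
def alibi_slopes(n_heads: int):
    # ALiBi per-head slopes for a power-of-two head count:
    # the geometric sequence 2^(-8/n), 2^(-16/n), ..., 2^(-8)
    assert n_heads & (n_heads - 1) == 0, "sketch assumes power-of-two heads"
    start = 2 ** (-8.0 / n_heads)
    return [start ** (i + 1) for i in range(n_heads)]

# e.g. 8 heads -> 1/2, 1/4, ..., 1/256
slopes = alibi_slopes(8)
```

Each head's attention logits then get the bias `-slope * distance`, which is what the kernel applies internally when given the slopes.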

If possible, please provide a PyTorch version of the vision model.

I also hit the hang. For reference: on 4 GPUs, DiT-S works fine for me, but DiT-XL gets stuck.

One more data point: GPU utilization stays at 0 for a long time. But after switching from DiT-XL/2 to DiT-XL/8 it no longer hangs, so it looks like GPU memory was exhausted.
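That switch makes sense memory-wise: the /2 vs /8 suffix is the patch size, and self-attention cost grows with the square of the token count. A small sketch of the token arithmetic (assuming DiT's usual setup of 256x256 images and an 8x VAE downsample, i.e. a 32x32 latent; those numbers are assumptions, not from the thread):

```python
def dit_tokens(latent_hw: int, patch: int) -> int:
    # number of patch tokens DiT attends over for a square latent
    assert latent_hw % patch == 0
    return (latent_hw // patch) ** 2

# 32x32 latent: patch 2 -> 256 tokens, patch 8 -> 16 tokens
t2 = dit_tokens(32, 2)
t8 = dit_tokens(32, 8)
# attention memory scales roughly with tokens**2
ratio = (t2 ** 2) / (t8 ** 2)
```

So DiT-XL/8 attends over 16x fewer tokens than DiT-XL/2, cutting attention memory by roughly 256x, which fits the out-of-memory explanation.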

Maybe you trained DiT-S/8 by default?

I ran DiT-S/8 with the default training script; after about two-thirds of an epoch, the sampled results were basically random.

The samples are fairly random, so they don't mean much, but then again I used the smallest model and trained for only about one epoch, which makes it understandable. I'll try a larger model and other hyperparameters and see.