Chengyue Jiang
OK, then I'll wait.
I tried different configurations but always failed to produce meaningful results. I'm wondering whether I made a mistake somewhere or the code just isn't ready yet.
I've tried it; it is supported. Just call peft's merge_and_unload directly.
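A minimal sketch of what I mean, assuming a LoRA adapter trained with peft on top of a Hugging Face base model (the model name and paths below are placeholders, not real checkpoints):

```python
# Sketch: merge LoRA weights back into the base model with peft.
# "base-model" and "path/to/lora-adapter" are placeholder paths.
from peft import PeftModel
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("base-model")
model = PeftModel.from_pretrained(base, "path/to/lora-adapter")

# merge_and_unload folds the LoRA deltas into the base weights and
# returns a plain transformers model with no peft wrappers left.
merged = model.merge_and_unload()
merged.save_pretrained("path/to/merged-model")
```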
RoPE is much easier to support with FlashAttention. Could you please provide code that supports FlashAttention training with ALiBi?
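For reference, recent flash-attn releases (2.4 and later) expose an alibi_slopes argument on flash_attn_func, so ALiBi biases can be applied inside the fused kernel. An untested sketch assuming that version is installed; the slope formula follows the ALiBi paper:

```python
# Sketch: FlashAttention with ALiBi, assuming flash-attn >= 2.4.
# flash_attn_func expects (batch, seqlen, nheads, headdim) tensors.
import torch
from flash_attn import flash_attn_func

def alibi_slopes(nheads: int) -> torch.Tensor:
    # Standard ALiBi slopes 2^(-8i/n) for head i (power-of-two head counts).
    start = 2 ** (-8.0 / nheads)
    return torch.tensor([start ** (i + 1) for i in range(nheads)],
                        dtype=torch.float32, device="cuda")

batch, seqlen, nheads, headdim = 2, 1024, 8, 64
q, k, v = (torch.randn(batch, seqlen, nheads, headdim,
                       dtype=torch.float16, device="cuda") for _ in range(3))

# The kernel adds the ALiBi bias internally, so no explicit bias matrix
# (and no O(seqlen^2) memory for it) is needed.
out = flash_attn_func(q, k, v, causal=True, alibi_slopes=alibi_slopes(nheads))
```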
If possible, please provide a PyTorch version of the vision model.
I also hit the hanging issue. For reference: on 4 GPUs, DiT-S runs fine, but DiT-XL hangs.
One more note: GPU utilization stays at 0 for long stretches. But after switching from DiT-XL/2 to DiT-XL/8 it no longer hangs, so it looks like GPU memory was blowing up.
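A quick way to check the memory theory, a hedged sketch using only plain torch calls; drop it into the training loop wherever convenient and watch whether the peak climbs toward the device limit before the hang:

```python
# Sketch: log per-GPU memory during training to confirm the
# "hang == running out of memory" hypothesis. No extra dependencies.
import torch

def log_cuda_memory(step: int) -> None:
    for i in range(torch.cuda.device_count()):
        alloc = torch.cuda.memory_allocated(i) / 2**30
        peak = torch.cuda.max_memory_allocated(i) / 2**30
        print(f"step {step} cuda:{i} "
              f"allocated={alloc:.2f}GiB peak={peak:.2f}GiB")
```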
Maybe you trained DiT-S/8 by default?
I ran DiT-S/8 with the default training script; after about two-thirds of an epoch, the sampled results were basically random.
The samples are fairly random, so not very meaningful, but that's understandable given I used the smallest model and trained for barely one epoch. I'll try a larger model and other hyperparameters and see.