isofun comments

Results: 8 comments of isofun

Lost storage location at start up : lost all notes

I use Boostnote Next and lost all my notes after reinstalling.

Lost storage location at start up : lost all notes

> I have modified my post to add the version number: I use version 0.8.2 > Thanks @Rokt33r says they delete the old cloud storage, so there is...

After enabling tensor parallelism (tp-size=2), there is no response

I also ran into a similar problem. With tp=2, the log stops at: [2025-03-07 17:58:29 TP0] Prefill batch. #new-seq: 1, #new-token: 7, #cached-token: 0, token usage: 0.00, #running-req: 0, #queue-req: 0, But...

Hugging Face Demo is down

How to deal with unknown tokens in practice?

> Thanks for your interest in our work and for the compliments! > > We can in fact incorporate item-side features besides item id in the GR formulation without increasing...

Image uploaded to the image host successfully, but the link in the md is g

Same here.

The loss is very large

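> > @massquantity I noticed that you commented out the averaging, so why are my results still this bad? So sad. > > When I ran this program with the DDPG method, the results were indeed poor, but try the BCQ method. I am not sure I am right, but I think it is because all the data is trained offline: with the DDPG algorithm, the trained policy is never used to collect new data for further training. The whole code runs as offline training, so the BCQ method gives better results. I am also working in this direction, so perhaps we could exchange ideas. I tried BCQ, but the actor_loss immediately became a negative value with a very large absolute value. I am not sure whether my implementation is at fault...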

The loss is very large

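Thanks for the reply! I later found that the cause was my own dataset lacking the is_end flag, but after fixing that, the loss still keeps rising slowly. Do you have any thoughts on this? On 2024-06-07 14:30:25, "FFFFlint" wrote: It is probably a training stability issue. You could try increasing the policy delay (the default is 1, meaning the actor is updated together with the critic at every step), for example to 4, so that the actor updates more slowly. If the actor updates too frequently, its loss can become very large because the critic has not been trained well enough yet.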
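The policy delay idea mentioned in the quoted reply can be illustrated with a minimal sketch. The function name `actor_update_steps` and its parameters are hypothetical and are not taken from the repository under discussion; the sketch only shows the scheduling, assuming the critic is updated at every step and the actor once every `policy_delay` steps.

```python
from __future__ import annotations


def actor_update_steps(total_steps: int, policy_delay: int) -> list[int]:
    """Return the 1-based training steps at which the actor would be updated.

    Hypothetical helper for illustration only: the critic is assumed to be
    updated at every step, the actor only on every `policy_delay`-th step.
    """
    if policy_delay < 1:
        raise ValueError("policy_delay must be >= 1")
    updated = []
    for step in range(1, total_steps + 1):
        # The critic update would happen here at every step (not modelled).
        if step % policy_delay == 0:
            # The actor update would happen here.
            updated.append(step)
    return updated


print(actor_update_steps(8, 1))  # actor updates on every step
print(actor_update_steps(8, 4))  # actor updates on steps 4 and 8 only
```

With `policy_delay=1` the actor is updated alongside the critic at every step; with `policy_delay=4` the critic gets four updates for each actor update, which is the slower actor schedule the reply suggests trying.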