Liin L
Hi, I have some questions about these key bindings. I found there are some `gv-gv`s in ``` -- Visual Block -- -- Move text up and down keymap("x", "J", ":move...
> I tried to generate a universal checkpoint for bloomz-7b on 8 x A100 40G using the following method. > > 1. Using deepspeed_to_deepspeed.py, convert the data parallel of the...
> `7*(8+4)=84` - so you should have an 84GB universal checkpoint. (8+4 optim states+weights) > > Can you check with `du -s` where the bloat comes from? > >...
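
For anyone wanting to redo that estimate for another model size, here is a rough sketch of the same arithmetic. The function name and parameters are made up for illustration, and it assumes 1 billion parameters at 1 byte each is about 1 GB; the 8 and 4 bytes per parameter are the optimizer-state and weight figures from the comment above.

```python
def estimate_checkpoint_gb(params_billion, optim_bytes=8, weight_bytes=4):
    """Rough checkpoint size in GB (hypothetical helper, not a DeepSpeed API).

    params_billion: parameter count in billions (7 for a 7B model).
    optim_bytes:    bytes per parameter for optimizer states (8 in the comment).
    weight_bytes:   bytes per parameter for the weights (4 in the comment).
    """
    return params_billion * (optim_bytes + weight_bytes)


# Same numbers as the quoted estimate: 7 * (8 + 4)
print(estimate_checkpoint_gb(7))
```

Comparing this figure against the `du -s` total is a quick way to see how much of the checkpoint directory is unexpected.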
> @LuciusMos Thank you! By the way, for others using BLOOM, I advise adding 1e-7 to the difference of the two sentences' rewards; it will help you avoid `inf` loss in training...
> > @LiinXemmon Hi, this is caused by log(0), which returns `inf`. I think you should add a very small value (like 1e-7) to the difference of the two sentences' rewards; it will...
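
To make the log(0) problem concrete, here is a small standalone sketch. It assumes the pairwise loss has the form -log(sigmoid(diff)), where diff is the difference of the two rewards; that loss form is an assumption, since the excerpt does not show the actual training code. Note that the second function is not the 1e-7 fix from the comment: it is a different remedy, a numerically stable log-sigmoid, shown only for comparison. Both function names are made up.

```python
import math


def naive_loss(diff):
    """-log(sigmoid(diff)) computed the direct way (assumed loss form)."""
    # Write sigmoid so that exp() never overflows in float64.
    if diff >= 0:
        p = 1.0 / (1.0 + math.exp(-diff))
    else:
        e = math.exp(diff)  # underflows to 0.0 for very negative diff
        p = e / (1.0 + e)
    # math.log(0.0) raises ValueError in plain Python; tensor libraries
    # typically return -inf there, so report the loss as inf to mirror that.
    return math.inf if p == 0.0 else -math.log(p)


def stable_loss(diff):
    """-log(sigmoid(diff)) via log_sigmoid(x) = min(x, 0) - log1p(exp(-|x|))."""
    return -(min(diff, 0.0) - math.log1p(math.exp(-abs(diff))))


for diff in (2.0, -800.0):
    print(diff, naive_loss(diff), stable_loss(diff))
```

For moderate differences the two functions agree; for a very negative difference the sigmoid underflows to zero and only the stable form stays finite. In lower precision such as fp16 the underflow happens at much smaller magnitudes.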