heyzude comments

Results: 4 comments of heyzude

It seems that there is a +/- sign typo

Sorry. Your implementation is right. My statement above is wrong.

issue about the implementation of log data likelihood

See tf.reduce_sum() at line 462 of bmaml.py. I think the author's implementation is correct.

DPO code

Hi, and thanks for sharing your work! Could you elaborate on which specific part of Megatron-LM (https://github.com/NVIDIA/Megatron-LM) you used?

_broadcast_to_vllm DOES NOT seem to update the vllm weights

No, I mean that the code does not seem to update the vllm weights properly.