Tianyu(Vincent) Zhan issues

Results: 3 issues of Tianyu(Vincent) Zhan

Issue about import pcdet.datasets.kitti.kitti_dataset

When trying to import kitti, the following error is raised: RuntimeError: cannot statically infer the expected size of a list in this context: : Traceback (most recent call last): File "/home/tz2693/LCDNet/pcdet_test.py", line...

Clarification on Reward Usage in DPO Training

In the RLHF workflow paper, the Reward Model is used to annotate new data generated by the LLM during the iterative DPO process, resulting in scalar values. According to Algorithm...

Reward-KL Comparison

Question about KL Divergence Evaluation in DPO Implementation. I read the paper ["Iterative Preference Learning from Human Feedback: Bridging Theory and Practice for RLHF under KL-constraint"](link_to_paper) and noticed your...