LLMforScience

5 issues in LLMforScience

1. At iteration 0, p_{\theta_0} = p_{SFT}, and the global optimum p_{\theta_1} after iteration 1 of the following objective will still be p_{SFT}. Thus, the following iterations of p_{\theta} will always...

Hi Shengchao, Thank you very much for your great work! In your paper, you mentioned that "Yet, after confirming with the authors, certain mismatches exist between the 2D topologies and...

Hi, Thank you for your code. Could you please release the graph tokenizer training code?

Hi, Thank you very much for your work! Could you please release your model checkpoints, such as the SFT model and reward model for each experiment, on Hugging Face?

Hi, I use the following command to run the code: CUDA_VISIBLE_DEVICES=3,5,6 ACCELERATE_LOG_LEVEL=info accelerate launch --config_file accelerate_configs/deepspeed_zero3.yaml scripts/run.py training_configs/mistral-7b-base-simpo.yaml. I found that it occupies 70GB per A100 card, even when the...