trlx
How to train LLaMA2 on the summarize_rlhf example?
I tried modifying the original code [1] and a different notebook [2], but failed with an out-of-memory error.

[1] https://github.com/CarperAI/trlx/blob/main/examples/summarize_rlhf/trlx_gptj_text_summarization.py
[2] https://github.com/HumanSignal/RLHF/blob/master/tutorials/RLHF_with_Custom_Datasets.ipynb

Do you have any success stories with LLaMA2?