
Inquiry About Code Release and Implementation Details for Reinforced Fine-Tuning (ReFT)

Open SuperCarryDFY opened this issue 10 months ago • 0 comments

Hello,

First of all, great work on this project! I have a couple of questions regarding the implementation details of Reinforced Fine-Tuning (ReFT):

  • Code Availability: Are there any plans to release the code for ReFT? It would be incredibly helpful for reproducibility and further research.
  • Implementation Details: Could you clarify how $\log p_{\theta}(x \mid a)$ is calculated in your approach? Specifically, is it computed the same way as in the DDPO implementation?

Thank you for your time and contributions! Looking forward to your response.

SuperCarryDFY • Apr 07 '25 14:04