fairseq2

Log chosen/rejected entropy

Open jacklanchantin opened this issue 11 months ago • 0 comments

What does this PR do? Please describe:

  • Adds separate entropy logging for chosen and rejected sequences in online DPO training.
  • A few other small changes.
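
For readers unfamiliar with the metric, here is a minimal, framework-free sketch of what logging entropy separately for chosen and rejected sequences could look like. It is not the code in this PR and does not use the fairseq2 API: the function names, the metric keys `chosen_entropy` and `rejected_entropy`, and the nested-list logits layout are all assumptions made for illustration. A real trainer would most likely compute this on batched tensors with padding masks.

```python
import math


def token_entropy(logits):
    """Shannon entropy (in nats) of softmax(logits) for one token position."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    probs = [e / z for e in exps]
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def sequence_entropy(seq_logits):
    """Mean per-token entropy over one sequence (a list of per-token logit lists)."""
    return sum(token_entropy(t) for t in seq_logits) / len(seq_logits)


def entropy_metrics(chosen_batch, rejected_batch):
    """Hypothetical helper: average entropy of chosen and rejected sequences, kept separate."""
    chosen = [sequence_entropy(s) for s in chosen_batch]
    rejected = [sequence_entropy(s) for s in rejected_batch]
    return {
        "chosen_entropy": sum(chosen) / len(chosen),
        "rejected_entropy": sum(rejected) / len(rejected),
    }


# Toy usage: one chosen sequence with uniform logits over a 4-token vocabulary,
# one rejected sequence with a sharply peaked distribution.
metrics = entropy_metrics(
    chosen_batch=[[[0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0]]],
    rejected_batch=[[[100.0, 0.0, 0.0, 0.0]]],
)
```

Keeping the two averages under separate keys, rather than a single pooled value, makes it possible to see whether the policy's uncertainty diverges between the preferred and dispreferred responses during training.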

Check list:

  • [x] Was the content of this PR discussed and approved via a GitHub issue? (no need for typos or documentation improvements)
  • [ ] Did you read the contributor guideline?
  • [ ] Did you make sure that your PR does only one thing instead of bundling different changes together?
  • [ ] Did you make sure to update the documentation with your changes? (if necessary)
  • [ ] Did you write any new necessary tests?
  • [ ] Did you verify new and existing tests pass locally with your changes?
  • [ ] Did you update the CHANGELOG? (no need for typos, documentation, or minor internal changes)

jacklanchantin, May 01 '25 21:05