verifiers
Verifiers for LLM Reinforcement Learning
Training was going well for at least 90 steps, and then this error appeared. Any idea how I can prevent complete failure when there are connection issues? This...
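One generic way to keep a long run alive through transient connection errors is to retry the failing call with exponential backoff. The sketch below is only an illustration of that pattern: `call_with_retries` and its parameters are hypothetical names, not part of verifiers, and the exception types to retry on are an assumption, since the actual error is cut off above.

```python
import time


def call_with_retries(fn, max_attempts=5, base_delay=1.0,
                      retry_on=(ConnectionError, TimeoutError),
                      sleep=time.sleep):
    """Call fn(), retrying on the given exceptions with exponential backoff.

    Hypothetical helper for illustration; not a verifiers API.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except retry_on:
            # Out of attempts: re-raise so the caller sees the real error.
            if attempt == max_attempts:
                raise
            # Wait base_delay, 2*base_delay, 4*base_delay, ... between tries.
            sleep(base_delay * 2 ** (attempt - 1))
```

Wrapping only the network call (rather than the whole training step) keeps a retry from repeating work that already succeeded.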
## Description ## Type of Change - [ ] Bug fix (non-breaking change which fixes an issue) - [ ] New feature (non-breaking change which adds functionality) - [ ]...
## Description When running `vf-eval sentence-repeater -n 1 -r 1`, I get the error > ... > File ".../verifiers/environments/sentence_repeater/sentence_repeater.py", line 88, in env_response > "content": state["info"]["questions"][state["turn"]], > ~~~~~^^^^^^^^ > File...
## Description Added two parameters, `include_ids` and `exclude_ids`, to select which ids to include in and which to exclude from the dataset, respectively. Fixes #581 ## Type...
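As a rough illustration of what include/exclude filtering by id can look like, here is a minimal sketch over a list of dict rows. The function name `filter_by_ids`, the `id_key` parameter, and the rule that exclusion wins when an id appears in both lists are assumptions made for this example; the PR's actual implementation is not shown above.

```python
def filter_by_ids(rows, include_ids=None, exclude_ids=None, id_key="id"):
    """Keep rows whose id is in include_ids (if given) and not in exclude_ids.

    Hypothetical helper for illustration; not the PR's code.
    """
    # None means "no include filter"; an empty list would keep nothing.
    include = set(include_ids) if include_ids is not None else None
    exclude = set(exclude_ids or ())
    return [
        row for row in rows
        if (include is None or row[id_key] in include)
        and row[id_key] not in exclude
    ]


rows = [{"id": i} for i in range(5)]
subset = filter_by_ids(rows, include_ids=[1, 2, 3], exclude_ids=[2])
```

Here `subset` contains the rows with ids 1 and 3: id 2 is dropped because, in this sketch, exclusion takes precedence.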
## Description ## Type of Change - [ ] Bug fix (non-breaking change which fixes an issue) - [x] New feature (non-breaking change which adds functionality) - [ ] Breaking...
## Description Should not be merged. ## Type of Change - [ ] Bug fix (non-breaking change which fixes an issue) - [ ] New feature (non-breaking change which adds...
@daniel can add some color here as well, but the gist is that it is not clear how to handle scaffolds/datasets, e.g. for SWE environments we have * scaffolds: mini...