Adam Lesnikowski issues

Results: 2 issues by Adam Lesnikowski

How are evals done on trained models?

Thanks for putting this together. I am wondering how evals are done on trained models. Are there third-party evaluation libraries that you use to measure trained model performance/metrics, or...

Full pref dataset available?

Thanks for making all this available, it's been really great to see! In Fig. 7 (p. 22) of your Tulu 3 PDF, are the four responses and LLM judge scores...