FLASK
Regarding instance-specific scoring criteria
Hi, the paper mentions that "instance-specific" scoring criteria were created for the FLASK-HARD subset. Is there any way to create or use these subquestions/scoring criteria? It would be very helpful to be able to access them and benchmark models on them.
Thanks