Fix MMLU answer extraction regex for repeated "Answer: LETTER" pattern

Open lucasresck opened this issue 1 year ago • 0 comments

Description

This pull request addresses issue https://github.com/openai/simple-evals/issues/33 by fixing the regular expression that extracts answers from model outputs in the MMLU eval.

Solution

The existing regex fails to handle cases where the "Answer: LETTER" pattern appears multiple times. This is resolved by:

  • Using re.findall: Instead of re.search, re.findall is used to find all occurrences of the answer pattern.
  • Selecting the last match: The last match from the re.findall results is taken as the correct answer.
  • Allowing overlapping matches: The answer pattern is placed inside a lookahead that contains a capturing group. The lookahead itself consumes no characters, so re.findall can report matches that overlap.
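
The three steps above can be sketched as follows. This is a minimal illustration, not the code from the pull request: the pattern `ANSWER_PATTERN` and the helper `extract_last_answer` are hypothetical names, and the repository's actual regex may differ.

```python
import re
from typing import Optional

# Hypothetical pattern for illustration only; the real simple-evals pattern may differ.
# The whole expression sits inside a lookahead (?=...), so each match is zero-width,
# and the capturing group ([A-D]) is what re.findall returns.
ANSWER_PATTERN = r"(?=Answer[ \t]*:[ \t]*([A-D])\b)"


def extract_last_answer(text: str) -> Optional[str]:
    """Return the letter from the last "Answer: LETTER" occurrence, or None."""
    # With exactly one capturing group, re.findall returns a list of that group's strings.
    matches = re.findall(ANSWER_PATTERN, text)
    return matches[-1] if matches else None


response = "Answer: A\nOn reflection, that is wrong.\nAnswer: C"

# re.search stops at the first occurrence, which is the behaviour being replaced.
first = re.search(ANSWER_PATTERN, response).group(1)

# Taking the last element of the re.findall result picks the final answer instead.
last = extract_last_answer(response)
```

Here `first` holds the letter from the first occurrence and `last` the letter from the final one, which is the difference the fix relies on when a model restates its answer.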

lucasresck · Dec 10 '24 18:12