
Train data has an example where \boxed is missing its { }

Open brando90 opened this issue 1 year ago • 2 comments

The data point

data/MATH/train/algebra/24014.json

has the string:

{
    "problem": "What is the largest value of $x$, if $\\frac{x}{5} + \\frac{1}{5x} = \\frac{1}{2}$?",
    "level": "Level 3",
    "type": "Algebra",
    "solution": "We multiply both sides of the equation by $10x$ to clear the fractions, leaving us with $2x^2 + 2 = 5x$. Rearranging the terms, we have $2x^2 - 5x + 2 = 0$. We can now solve for $x$ by factoring: $(2x - 1)(x - 2) = 0$. We could also use the quadratic formula:  $$x = \\frac{5 \\pm \\sqrt{(-5)^2 - 4(2)(2)}}{4}.$$Either way, we find that $x = 1/2$ or $x = 2$. Since we want the largest value of $x$, our answer is $\\boxed 2$."
}

But it should be (note \boxed{2} instead of \boxed 2 at the end of the solution):

{
    "problem": "What is the largest value of $x$, if $\\frac{x}{5} + \\frac{1}{5x} = \\frac{1}{2}$?",
    "level": "Level 3",
    "type": "Algebra",
    "solution": "We multiply both sides of the equation by $10x$ to clear the fractions, leaving us with $2x^2 + 2 = 5x$. Rearranging the terms, we have $2x^2 - 5x + 2 = 0$. We can now solve for $x$ by factoring: $(2x - 1)(x - 2) = 0$. We could also use the quadratic formula:  $$x = \\frac{5 \\pm \\sqrt{(-5)^2 - 4(2)(2)}}{4}.$$Either way, we find that $x = 1/2$ or $x = 2$. Since we want the largest value of $x$, our answer is $\\boxed{2}$."
}
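To find other entries with the same problem, something like the following could be run over each "solution" string. This is only a rough sketch: the names `UNBRACED_BOXED`, `has_unbraced_boxed`, and `add_braces` are made up for illustration, and it assumes the unbraced answer is a single token with no spaces, as in `\boxed 2`.

```python
import re

# Matches \boxed followed by whitespace and a bare token (no opening brace).
# Assumption: the bare answer is one whitespace-free token ending before "$".
UNBRACED_BOXED = re.compile(r"\\boxed\s+([^\s{}$]+)")


def has_unbraced_boxed(solution: str) -> bool:
    """Return True if the solution contains a \\boxed without braces."""
    return UNBRACED_BOXED.search(solution) is not None


def add_braces(solution: str) -> str:
    """Rewrite '\\boxed 2' as '\\boxed{2}', leaving braced forms untouched."""
    # A callable replacement avoids backslash-escape handling in the template.
    return UNBRACED_BOXED.sub(lambda m: "\\boxed{" + m.group(1) + "}", solution)
```

For the string above, `has_unbraced_boxed` flags the solution and `add_braces` turns its ending into `$\boxed{2}$`. Multi-token answers would need manual review.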

brando90, Mar 30 '24 04:03

The same problem also occurs in:

'25040.json'

The corrected string:

{
    "problem": "If $(x + y)^2 = 1$ and $xy = -4$, what is the value of $x^2 + y^2$?",
    "level": "Level 3",
    "type": "Algebra",
    "solution": "We see that $(x + y)^2 = (x^2 + y^2) + 2xy = 1$. We want to find $x^2 + y^2$ and are given $xy = -4$. So, $x^2 + y^2 + 2xy = x^2 + y^2 + 2(-4) = 1$. It follows that $x^2 + y^2 = \\boxed{9}$."
}

brando90, Mar 30 '24 05:03

This is a known issue. The Minerva paper and EleutherAI lm eval harness both have fixes for it (minor tweak to the boxed answer extraction code): https://github.com/EleutherAI/lm-evaluation-harness/blob/ab7cc6b155c75627b35ab1afe26f8bc9bbff38ea/lm_eval/tasks/minerva_math/utils.py#L99
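For reference, the kind of tweak meant here might look roughly like the sketch below. It is not the code from the linked harness file; `extract_boxed_answer` is a hypothetical name, and the rule of cutting an unbraced answer at the next `$` is an assumption based on the example in this issue.

```python
from typing import Optional


def extract_boxed_answer(solution: str) -> Optional[str]:
    """Return the contents of the last \\boxed expression, or None if absent."""
    marker = "\\boxed"
    start = solution.rfind(marker)
    if start == -1:
        return None
    rest = solution[start + len(marker):]
    if rest.startswith("{"):
        # Braced form: scan to the matching close brace, tracking nesting depth.
        depth = 0
        for i, ch in enumerate(rest):
            if ch == "{":
                depth += 1
            elif ch == "}":
                depth -= 1
                if depth == 0:
                    return rest[1:i]
        return None  # unbalanced braces
    # Unbraced form such as "\boxed 2$": take the text up to the next "$".
    bare = rest.lstrip()
    return bare.split("$", 1)[0].strip() or None
```

With this, both `$\boxed 2$` and `$\boxed{2}$` yield `2`, so the dataset files do not have to be patched before evaluation.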

brando90, Mar 30 '24 06:03