Suggestion: Publish original text of jgex_ag_231.txt
I would like to finetune an LLM to automatically parse problem statements. With 300 examples I think it should be possible. If you publish the original statements I could try it and see how far I can get. My end goal is to build a simple web app that ingests natural language statements and outputs the geometry drawing solution. The first step is, of course, the parsing. So I would be very thankful if you published the original statements.
Note that the translation is not very easy. It seems like you have to rewrite the statements to eliminate all the constraints in favour of construction. E.g., you cannot write "Let AB=CD" for existing points A, B, C, D. For AlphaGeometry, you must convert it to something like "Let D be a point such that AB=CD".
Yes, I understand is not a trivial task. But having that set of examples and finetuning GPT-4 with chain of thought and some very detailed system instructions could work. The data is crucial to make it work, although it is not the only part to make it work. You need to add all the little nuances as context, and all the definitions as context too.
https://artofproblemsolving.com/wiki/index.php/2000_IMO_Problems/Problem_1
Two circles $G_1$ and $G_2$ intersect at two points $M$ and $N$. Let $AB$ be the line tangent to these circles at $A$ and $B$, respectively, so that $M$ lies closer to $AB$ than $N$. Let $CD$ be the line parallel to $AB$ and passing through the point $M$, with $C$ on $G_1$ and $D$ on $G_2$. Lines $AC$ and $BD$ meet at $E$; lines $AN$ and $CD$ meet at $P$; lines $BN$ and $CD$ meet at $Q$. Show that $EP=EQ$.
translated_imo_2000_p1 [note this is the first one in imo_ag_30.txt]
a b = segment a b; g1 = on_tline g1 a a b; g2 = on_tline g2 b b a; m = on_circle m g1 a, on_circle m g2 b; n = on_circle n g1 a, on_circle n g2 b; c = on_pline c m a b, on_circle c g1 a; d = on_pline d m a b, on_circle d g2 b; e = on_line e a c, on_line e b d; p = on_line p a n, on_line p c d; q = on_line q b n, on_line q c d ? cong e p e q
I would like to finetune an LLM to automatically parse problem statements. With 300 examples I think it should be possible. If you publish the original statements I could try it and see how far I can get. My end goal is to build a simple web app that ingests natural language statements and outputs the geometry drawing solution. The first step is, of course, the parsing. So I would be very thankful if you published the original statements.
May be you can find some of them here http://www.mmrc.iss.ac.cn/~xgao/paper/book-area.pdf