Unable to parse with a given grammar
I am trying to figure out how Earley parser, esp Leo Joop's optimization works, and found your implementation. While looking at it, I think I found a bug.
Here is my grammar
d0 = Terminal('0')
d1 = Terminal('1')
d2 = Terminal('2')
d3 = Terminal('3')
d4 = Terminal('4')
d5 = Terminal('5')
d6 = Terminal('6')
d7 = Terminal('7')
d8 = Terminal('8')
d9 = Terminal('9')
plus = Terminal('+')
minus = Terminal('-')
div = Terminal('/')
mul = Terminal('*')
open_ = Terminal('(')
close_ = Terminal(')')
dot = Terminal('.')
start = Nonterminal('start')
expr = Nonterminal('expr')
term = Nonterminal('term')
factor = Nonterminal('factor')
integer = Nonterminal('integer')
digit = Nonterminal('digit')
grammar = [
Rule(start, [expr]),
Rule(expr, [term, plus,expr]),
Rule(expr, [term, minus,expr]),
Rule(expr, [term]),
Rule(factor, [plus, factor]),
Rule(factor, [minus, factor]),
Rule(factor, [open_, expr, close_]),
Rule(factor, [integer]),
Rule(factor, [integer, dot, integer]),
Rule(integer, [digit, integer]),
Rule(integer, [digit]),
Rule(digit, [d0]),
Rule(digit, [d1]),
Rule(digit, [d2]),
Rule(digit, [d3]),
Rule(digit, [d4]),
Rule(digit, [d5]),
Rule(digit, [d6]),
Rule(digit, [d7]),
Rule(digit, [d8]),
Rule(digit, [d9]),
]
accept = start
terminals = {
"0": d0,
"1": d1,
"2": d2,
"3": d3,
"4": d4,
"5": d5,
"6": d6,
"7": d7,
"8": d8,
"9": d9,
"+": plus,
"-": minus,
"/": div,
"*": mul,
"(": open_,
")": close_,
".": dot,
}
user_grammar = grammar
And here is the output
python chartparser.py
Traceback (most recent call last):
File "cp.py", line 464, in <module>
main()
File "cp.py", line 82, in main
parser.step(terminals[token], token)
File "cp.py", line 142, in step
for eim, bb in self.chart[location-1][term]:
KeyError: T'3'
First of all, it's been a while I've heard back from someone trying my software. Thank you a lot for writing this up.
There may be bugs in here and not just in the Leo item handling. I will study my own code a bit and see what would cause this. I will also write some tooling to help people looking into this work.
I'll write back later this week.
@vrthra I added a small utility that visualizes grammars. I put your grammar into the example_1.py.
The grammar was incomplete so I added a missing rule that made it work. The preprocessing step could warn from that of course.
This thing doesn't have whole from Joop's right recursion optimization rule btw. I guess I'll yet go and document the chartparser, then add that one in. But that'll happen next week.
@cheery Thank you! much appreciated.