chartparser icon indicating copy to clipboard operation
chartparser copied to clipboard

Unable to parse with a given grammar

Open vrthra opened this issue 7 years ago • 3 comments

I am trying to figure out how Earley parser, esp Leo Joop's optimization works, and found your implementation. While looking at it, I think I found a bug.

Here is my grammar

    d0 = Terminal('0')
    d1 = Terminal('1')
    d2 = Terminal('2')
    d3 = Terminal('3')
    d4 = Terminal('4')
    d5 = Terminal('5')
    d6 = Terminal('6')
    d7 = Terminal('7')
    d8 = Terminal('8')
    d9 = Terminal('9')
    plus = Terminal('+')
    minus = Terminal('-')
    div = Terminal('/')
    mul = Terminal('*')
    open_ = Terminal('(')
    close_ = Terminal(')')
    dot = Terminal('.')
    start = Nonterminal('start')
    expr = Nonterminal('expr')
    term = Nonterminal('term')
    factor = Nonterminal('factor')
    integer = Nonterminal('integer')
    digit = Nonterminal('digit')

    grammar = [
        Rule(start, [expr]),
        Rule(expr, [term, plus,expr]),
        Rule(expr, [term, minus,expr]),
        Rule(expr, [term]),
        Rule(factor, [plus, factor]),
        Rule(factor, [minus, factor]),
        Rule(factor, [open_, expr, close_]),
        Rule(factor, [integer]),
        Rule(factor, [integer, dot, integer]),
        Rule(integer, [digit, integer]),
        Rule(integer, [digit]),
        Rule(digit, [d0]),
        Rule(digit, [d1]),
        Rule(digit, [d2]),
        Rule(digit, [d3]),
        Rule(digit, [d4]),
        Rule(digit, [d5]),
        Rule(digit, [d6]),
        Rule(digit, [d7]),
        Rule(digit, [d8]),
        Rule(digit, [d9]),
    ]

    accept = start
    terminals = {
            "0": d0,
            "1": d1,
            "2": d2,
            "3": d3,
            "4": d4,
            "5": d5,
            "6": d6,
            "7": d7,
            "8": d8,
            "9": d9,
            "+": plus,
            "-": minus,
            "/": div,
            "*": mul,
            "(": open_,
            ")": close_,
            ".": dot,
            }

    user_grammar = grammar

And here is the output

python chartparser.py 
Traceback (most recent call last):
  File "cp.py", line 464, in <module>
    main()
  File "cp.py", line 82, in main
    parser.step(terminals[token], token)
  File "cp.py", line 142, in step
    for eim, bb in self.chart[location-1][term]:
KeyError: T'3'

vrthra avatar Oct 24 '18 12:10 vrthra

First of all, it's been a while I've heard back from someone trying my software. Thank you a lot for writing this up.

There may be bugs in here and not just in the Leo item handling. I will study my own code a bit and see what would cause this. I will also write some tooling to help people looking into this work.

I'll write back later this week.

cheery avatar Oct 24 '18 12:10 cheery

@vrthra I added a small utility that visualizes grammars. I put your grammar into the example_1.py.

The grammar was incomplete so I added a missing rule that made it work. The preprocessing step could warn from that of course.

This thing doesn't have whole from Joop's right recursion optimization rule btw. I guess I'll yet go and document the chartparser, then add that one in. But that'll happen next week.

cheery avatar Oct 28 '18 22:10 cheery

@cheery Thank you! much appreciated.

vrthra avatar Oct 29 '18 07:10 vrthra