Try to optimize the expression parser
Some benchmarks are needed to ensure what the problem is, but I’m pretty sure (given the current nom-3 implementation) that we have a lot of failures and retries.
I have observed that nested parens will significantly slow down the parser.
void main ()
{
(((((((1.0)))))));
}
Parsing the above file with the release build takes about 10 seconds on iMac.
Interestingly, there seems to be no problem when parsing nested function calls:
void main ()
{
foo(foo(foo(foo(foo(foo(foo(1.0)))))));
}
Parsing the above snippet takes less than 10 ms on the same machine.
I've been exploring parsing GLSL expressions with the LALRPOP parser generator, combined with Logos as the lexer generator. Building the same AST is around 1300x (not a typo) faster with a lalrpop/logos than the current nom combinators:
Finished bench [optimized] target(s) in 0.07s
Parse expression/lalrpop
time: [3.9253 us 3.9293 us 3.9337 us]
Parse expression/glsl time: [5.1026 ms 5.1166 ms 5.1302 ms]
This should be taken with a grain of salt, since this is a very simple test (the nested_parentheses one), and running on my branch for #129. I'll see if I can validate those improvements over some other examples and against upstream, but it might be interesting to consider rewriting the parser.
Considering the other issues I've been working on, especially #129, adding location information to the AST would require changing a lot in the parser definition, so we might as well rewrite the parser with lalrpop and support span information at the same time.