python-arpa
important parser error
https://github.com/sfischer13/python-arpa/blob/2284b815866aeb08f65f786da416e78d7937ee1d/arpa/parsers/quick.py#L21
The regular expression has an error here. Consider the case where the line is:
-2.310726 maybe when 9.609759e-05
The exponent in the backoff weight is not parsed correctly: the e-05 is missed, so the captured value is only the mantissa.
To fix this, the expression needs an extra bracket/group that wraps the mantissa and the exponent together. Like this: "(\t ( (-?\d+(\.\d+)?)([eE]-?\d+)? ) )?$"
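
A minimal reproduction sketch, in case it helps. The `BUGGY` pattern is my assumption of what the backoff part looks like without the extra group; it is not copied from quick.py. `FIXED` is the expression suggested above with the spaces removed, assuming they were only added for readability. The sample line is assumed to be tab-separated, as the `\t` in the expression implies.

```python
import re

# Hypothetical backoff sub-pattern without the extra group (an assumption,
# not the actual code in quick.py): group 2 holds only the mantissa.
BUGGY = re.compile(r"(\t(-?\d+(\.\d+)?)([eE]-?\d+)?)?$")

# Suggested fix: an extra group wraps mantissa and exponent together,
# so group 2 now holds the full number.
FIXED = re.compile(r"(\t((-?\d+(\.\d+)?)([eE]-?\d+)?))?$")

# Sample entry from this report, with fields separated by tabs.
line = "-2.310726\tmaybe when\t9.609759e-05"

buggy_backoff = BUGGY.search(line).group(2)
fixed_backoff = FIXED.search(line).group(2)

print(buggy_backoff)  # mantissa only, exponent lost
print(fixed_backoff)  # full backoff weight including the exponent
```

With the extra group in place, whichever code reads the backoff weight can take the whole number from one group instead of only the mantissa.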