New fnmatch.py
This pull request is opened as a draft because it is not mergeable yet. It contains a switch for selecting between different implementations of number ranges. Once a decision is made on the issue https://github.com/editorconfig/editorconfig/issues/371, the code needs cleanup so that it supports only one implementation. (If it is of interest at all.)
This pull request proposes a new implementation for the translation of editorconfig glob expressions to Python regular expressions.
IMO the following points are important:
- better handling of escaped characters (see "Tests for escaped special characters in glob expressions").
- find matching brace or bracket (handles `}xyz{`, see "Test globs with braces that are back to back").
- for numerical ranges, regular expressions are created, so no additional arithmetic comparison is needed in a second step. More details about that below.
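To illustrate the second point, here is a minimal sketch of locating the closing brace for a given opening brace while skipping escaped characters and nested groups. The function name `find_matching_brace` is hypothetical and this is not the code from the PR, only one plausible way to do it.

```python
def find_matching_brace(pattern: str, start: int) -> int:
    """Return the index of the '}' that closes the '{' at pattern[start].

    Returns -1 if the brace is never closed. Hypothetical helper, not
    taken from fnmatch.py.
    """
    if pattern[start] != "{":
        raise ValueError("start must point at an opening brace")
    depth = 0
    i = start
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\":
            # A backslash escapes the next character, so skip both.
            i += 2
            continue
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return i
        i += 1
    return -1


print(find_matching_brace("{a,{b,c}}d", 0))  # 8
print(find_matching_brace(r"{a\}b}", 0))     # 5, the escaped brace is skipped
print(find_matching_brace("}xyz{", 4))       # -1, the brace is never closed
```

With a helper like this, a pattern such as `}xyz{` has no matching pair, so its braces can be treated as literal characters.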
This was initially implemented in VimScript for my Vim plugin and was then ported to Python.
Numerical Ranges
This implementation translates numerical ranges into regular expressions.
E.g.
- `{3..10}` becomes `(?:\+?(?:[3-9]|10))`
- `{10..3}` also becomes `(?:\+?(?:[3-9]|10))`, so the order of numbers is irrelevant
- `{-3..+3}` becomes `(?:-(?:[0-3])|\+?(?:[0-3]))`
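The patterns quoted above can be checked directly with Python's `re` module. This is a small sketch: the constant names and the `matches` helper are only for illustration, and the patterns are copied verbatim from the examples.

```python
import re

# Patterns quoted above for {3..10} and {-3..+3}.
RANGE_3_10 = re.compile(r"(?:\+?(?:[3-9]|10))")
RANGE_M3_P3 = re.compile(r"(?:-(?:[0-3])|\+?(?:[0-3]))")


def matches(pattern, text):
    """True if the whole text matches the compiled pattern."""
    return pattern.fullmatch(text) is not None


print(matches(RANGE_3_10, "3"))    # True
print(matches(RANGE_3_10, "+10"))  # True, an optional leading plus is accepted
print(matches(RANGE_3_10, "11"))   # False, out of range
print(matches(RANGE_3_10, "03"))   # False, no leading zeros here
print(matches(RANGE_M3_P3, "-3"))  # True
print(matches(RANGE_M3_P3, "-4"))  # False
```

Because the range check is part of the regular expression itself, no arithmetic comparison on captured groups is needed afterwards.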
The special thing about the implementation of numeric ranges is that it is
switchable between different implementations. See the top-level variable NUMBER_MODE in fnmatch.py.
Mode AS_IS
This implementation should work like the current implementation.
Mode ZEROS
This implementation allows any number of leading zeros, as proposed by @cxw42 in Py core: numeric ranges don't handle zero correctly.
So: `{3..10}` becomes `(?:\+?0*(?:[3-9]|10))` and would match
- `3`
- `+3`
- `0000003`
- `+0003`
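A quick check of the ZEROS pattern quoted above, again as a sketch using only the standard `re` module (the names `ZEROS_3_10` and `matches_zeros` are illustrative):

```python
import re

# Pattern quoted above for {3..10} in mode ZEROS.
ZEROS_3_10 = re.compile(r"(?:\+?0*(?:[3-9]|10))")


def matches_zeros(text):
    """True if the whole text matches the ZEROS pattern for {3..10}."""
    return ZEROS_3_10.fullmatch(text) is not None


for candidate in ["3", "+3", "0000003", "+0003", "010", "0", "11"]:
    print(candidate, matches_zeros(candidate))
```

The first five candidates match because `0*` absorbs any run of leading zeros. `0` and `11` do not match, since the digits left after the zeros must still form a number from 3 to 10.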
Mode JUSTIFIED
This implementation handles numerical ranges the way bash does. I proposed this in a comment on @cxw42's issue here.
Now
- `{3..10}` becomes `(?:[3-9]|10)`, so a leading `+` is not matched anymore.
- `{03..10}` becomes `(?:0[3-9]|10)`, so all numbers are formatted to equal width. In this case single-digit numbers need one leading zero.
- `{03..120}` becomes `(?:00[3-9]|0[1-9][0-9]|1[0-1][0-9]|120)`. Again the numbers are formatted to equal width, here three digits. So single-digit numbers need two leading zeros, double-digit numbers one.
- For negative numbers, the leading minus sign is part of the width calculation. So `{-3..03}` matches `-3` and `03`, but not `-03`.
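The width rule can be illustrated with a brute-force sketch that simply enumerates every number in the range, pads it to the common width, and joins the results into one alternation. This is not how the PR builds its compact character-class patterns. The helper name `justified_range_regex` is hypothetical, and the padding rule (pad only when an endpoint is written with a leading zero) is an assumption based on bash's behaviour.

```python
import re


def justified_range_regex(lo: str, hi: str) -> str:
    """Enumerate a bash-style numeric range as a regex alternation.

    Hypothetical helper for illustration only. It handles an optional
    leading minus sign but no leading plus sign.
    """
    first, last = sorted((int(lo), int(hi)))
    # Assumption: pad only if an endpoint is written with a leading zero.
    padded = any(
        len(s.lstrip("-")) > 1 and s.lstrip("-").startswith("0")
        for s in (lo, hi)
    )
    # The minus sign counts towards the width, as described above.
    width = max(len(lo), len(hi)) if padded else 0
    # str.zfill is sign-aware: "-3".zfill(2) stays "-3", "3".zfill(2) is "03".
    items = [str(n).zfill(width) for n in range(first, last + 1)]
    return "(?:" + "|".join(re.escape(s) for s in items) + ")"


pattern = re.compile(justified_range_regex("-3", "03"))
print(pattern.fullmatch("-3") is not None)   # True
print(pattern.fullmatch("03") is not None)   # True
print(pattern.fullmatch("-03") is not None)  # False
```

The enumerated alternation grows linearly with the size of the range, which is why a compact pattern such as `(?:00[3-9]|0[1-9][0-9]|1[0-1][0-9]|120)` is preferable in practice. The sketch is mainly useful as a reference to compare such compact patterns against.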
Status
I didn't change anything outside fnmatch.py, so the tests that fail on the master branch still fail on this branch. The only one that is fixed is brackets_slash_inside4.
The implementation passes all current tests related to globbing in modes AS_IS and JUSTIFIED. For ZEROS one test fails, because that test requires leading zeros not to be matched.
Locally I added some tests for mode JUSTIFIED, which I could also provide.
There is one function (unescapeBrackets) where I'm unsure whether it is really correct.
If you're still working on this, would you be willing to check if the new fnmatch handles https://github.com/editorconfig/editorconfig-vim/issues/205 ?