editorconfig-core-py

New fnmatch.py

Open rakus opened this issue 6 years ago • 1 comments

This pull request is opened as a draft because it is not mergeable yet. It contains a switch for selecting between different implementations of number ranges. Once a decision is made on the issue https://github.com/editorconfig/editorconfig/issues/371, the code needs cleanup to support only one implementation. (If it is of interest at all.)


This pull request proposes a new implementation for translating EditorConfig glob expressions into Python regular expressions.

IMO the following points are important:

This was initially implemented in VimScript for my Vim plugin and was then ported to Python.

Numerical Ranges

This implementation translates numerical ranges into regular expressions.

E.g.

  • {3..10} becomes (?:\+?(?:[3-9]|10))
  • {10..3} also becomes (?:\+?(?:[3-9]|10)), so the order of numbers is irrelevant
  • {-3..+3} becomes (?:-(?:[0-3])|\+?(?:[0-3]))
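The quoted patterns can be checked directly with Python's `re` module. The following is a minimal sketch: the regular expressions are copied from the examples above, while the helper `matching_ints` is a hypothetical name introduced only for this illustration and is not part of fnmatch.py.

```python
import re

# Patterns copied verbatim from the examples above.
RANGE_3_TO_10 = re.compile(r"(?:\+?(?:[3-9]|10))")
RANGE_MINUS3_TO_3 = re.compile(r"(?:-(?:[0-3])|\+?(?:[0-3]))")


def matching_ints(pattern, candidates):
    """Return the candidates whose plain decimal form is matched in full."""
    return [n for n in candidates if pattern.fullmatch(str(n))]


print(matching_ints(RANGE_3_TO_10, range(0, 13)))      # [3, 4, 5, 6, 7, 8, 9, 10]
print(matching_ints(RANGE_MINUS3_TO_3, range(-5, 6)))  # [-3, -2, -1, 0, 1, 2, 3]
print(bool(RANGE_3_TO_10.fullmatch("+7")))             # True: an optional leading + is accepted
```

`fullmatch` is used here so that a candidate such as 11 is not accepted merely because part of it matches; how the core anchors the pattern inside a complete glob is outside the scope of this sketch.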

The special thing about the implementation of numeric ranges is that it is switchable between different implementations. See the top-level variable NUMBER_MODE in fnmatch.py.

Mode AS_IS

This implementation should work like the current implementation.

Mode ZEROS

This implementation allows any number of leading zeros, as proposed by @cxw42 in the issue "Py core: numeric ranges don't handle zero correctly".

So: {3..10} becomes (?:\+?0*(?:[3-9]|10)) and would match

  • 3
  • +3
  • 0000003
  • +0003
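To make the difference from the default behaviour concrete, the short comparison below runs the same inputs through the pattern quoted for {3..10} in the Numerical Ranges section and through the ZEROS pattern quoted above. It is only a sketch; the variable names and the sample list are chosen for this illustration.

```python
import re

AS_IS = re.compile(r"(?:\+?(?:[3-9]|10))")    # pattern without leading-zero support
ZEROS = re.compile(r"(?:\+?0*(?:[3-9]|10))")  # pattern from the ZEROS example

samples = ["3", "+3", "0000003", "+0003", "010", "011", "-3"]
for s in samples:
    as_is = bool(AS_IS.fullmatch(s))
    zeros = bool(ZEROS.fullmatch(s))
    # Only the zero-padded inputs are expected to differ between the two patterns.
    print(f"{s!r:>10}  AS_IS={as_is}  ZEROS={zeros}")
```

Only the zero-padded inputs (0000003, +0003, 010) change from rejected to accepted; 011 and -3 are rejected by both patterns.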

Mode JUSTIFIED

This implementation handles numerical ranges the way bash does. I proposed this in a comment on @cxw42's issue here.

Now

  • {3..10} becomes (?:[3-9]|10), so leading + is not matched anymore.
  • {03..10} becomes (?:0[3-9]|10), so all numbers are formatted to equal width. In this case single-digit numbers need one leading zero.
  • {03..120} becomes (?:00[3-9]|0[1-9][0-9]|1[0-1][0-9]|120). Again the numbers are formatted to equal width, here three digits. So single-digit numbers need two leading zeros, double-digit numbers one.
  • For negative numbers, the leading minus sign is part of the width calculation. So {-3..03} matches -3 and 03, but not -03.
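The width rule described in these bullets can be sketched in a few lines of Python. This is not the code from the pull request: the PR emits compact character-class patterns, whereas this illustration simply enumerates every value, which is only reasonable for small ranges. The function name `justified_range_regex` is hypothetical, and the trigger for padding (an endpoint written with a leading zero) is an assumption modelled on bash's brace expansion.

```python
import re


def justified_range_regex(start: str, end: str) -> str:
    """Naive sketch: turn a {start..end} range into a regex alternation."""
    lo, hi = sorted((int(start), int(end)))
    # Assumption: padding applies only if an endpoint has a leading zero.
    digits = [s.lstrip("+-") for s in (start, end)]
    padded = any(len(d) > 1 and d.startswith("0") for d in digits)
    # The sign counts toward the width, so "-3" and "03" both have width 2.
    width = max(len(start), len(end)) if padded else 0
    # str.zfill pads after the sign and includes the sign in the total width.
    values = [str(n).zfill(width) for n in range(lo, hi + 1)]
    return "(?:" + "|".join(re.escape(v) for v in values) + ")"


pattern = re.compile(justified_range_regex("-3", "03"))
for candidate in ["-3", "03", "00", "-03", "3"]:
    print(candidate, bool(pattern.fullmatch(candidate)))
```

With the endpoints -3 and 03, the sketch accepts -3, 00 and 03 but rejects -03 and 3, which agrees with the last bullet above.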

Status

I didn't change anything outside fnmatch.py, so the tests that fail on the master branch still fail on this branch. The only one that is fixed is brackets_slash_inside4.

The implementation passes all current tests related to globbing in the modes AS_IS and JUSTIFIED. In mode ZEROS one test fails, because that test requires leading zeros not to be matched.

Locally I added some tests for mode JUSTIFIED, which I could also provide.

There is one function (unescapeBrackets) where I'm unsure whether it is really correct.

rakus, May 06 '19 20:05

If you're still working on this, would you be willing to check if the new fnmatch handles https://github.com/editorconfig/editorconfig-vim/issues/205 ?

cxw42, Jan 15 '23 02:01