Clarify glob matching
Taking the specification literally, the glob *.py would not match foo/bar.py since foo/bar includes the slash character. The C library for editorconfig adds **/ to the start of the glob if it does not have a slash in it (see editorconfig.c:260), but this should really be in the spec.
I agree. However, I don't think this is about *. bar.py will also match foo/bar.py. The issue sounds like not clarifying that if / is present, then it would match the final component of the file path.
One question is: Does bar/baz.py match foo/bar/baz.py? That's something the spec isn't clear about.
It seems our specification is "nearer" to the known behaviour of glob than the c implementation.
The implicit expansion with **/ fixes to the matching of the last part while the original "glob" command would apply the pattern to each part of the path and our spec suggests a behaviour alike (with less words).
https://en.wikipedia.org/wiki/Glob_(programming) https://man7.org/linux/man-pages/man7/glob.7.html
For comparision, gitignore's semantics are, in relevant part:
If there is a separator at the beginning or middle (or both) of the pattern, then the pattern is relative to the directory level of the particular .gitignore file itself. Otherwise the pattern may also match at any level below the .gitignore level.
If we followed that, glob bar/baz.py would not apply to file foo/bar/baz.py.
I looked at editorconfig-core-test.
-
glob/does not appear to have any test cases for this. -
filetree/path_separator.inappears to have a test for this:nested_path_separatortries to matchnested/path/separatoragainst globpath/separator. The expectation is that it does not match.:
given# Tests path separator match below top of path new_ec_test(nested_path_separator path_separator.in nested/path/separator "^[ \t\n\r]*$")path_separator.incontents:[path/separator] key1=value1
@cxw42 Thanks for the awesome summary -- Now looks like what's unclear in the spec is:
Section names in EditorConfig files are filepath globs, similar to the format accepted by
.gitignore.
We should make this a bit more formal somehow, but it seems tricky to formulate proper language. I'm thinking maybe it's easier to clarify in the * row after all.
Here's an initial draft --- it's not great, so please respond with improvements :)
Section names ... accepted by .gitignore.
They support pattern matching through Unix shell-style wildcards.
+
+A section name that does not contain a `/` matches the last component
+of the path at any depth below the EditorConfig file.
+E.g., `[*.py]` matches `foo.py` and `subdir/bar.py`.
+A section name that does contain a `/` matches that section name
+relative to the directory containing that EditorConfig file.
+E.g., `[dir/*.py]` matches `dir/foo.py` but not `dir/subdir/foo.py`.
+
These filepath globs recognize the following as special characters for wildcard matching:
(Separately, I realized we should also add tests and spec stating that a section name may not end with a /. Edit that is now open at https://github.com/editorconfig/editorconfig/issues/493)
The draft looks good to me!
I just realized this overlaps quite a bit with https://github.com/editorconfig/editorconfig/issues/509 . @pommicket would you please check and see if the current spec and tests permit this issue to be closed?
Yes it looks good now