Brace expansion edge case
QuickCheck found this failure in a CI job:
Tests
Properties
brace expansion: FAIL (0.08s)
*** Failed! Assertion failed (after 56 tests):
"\\}\\,,{y}{\\{\\ }\\}{}\\}s\\ p\\{\\},\\,lhhiq\\,qv}{}\\},}\\}\\ \\,{\\}\\ \\{a\\ \\,\\,z\\,t"
Use --quickcheck-replay '55 TFGenR 0000000309618D3500000000000F4240000000000000E1390000000056F692C0 0 72057594037927935 56 0' to reproduce.
Found another one, in this job: https://travis-ci.org/pbiggar/language-bash/jobs/161995727
brace expansion: FAIL (0.10s)
*** Failed! Assertion failed (after 69 tests):
"a,\\}}{a\\{,w\\}}{}l}{{xaky\\{\\ \\,\\,tuqmg}}z\\{,}asik{ng\\,"
Use --quickcheck-replay '68 TFGenR 1BCE489E03A1DF8763BEBE7128281A981FBC02700490D72F1B17CB17703ECB89 0 31 5 0' to reproduce.
From a failing test case I worked out a minimal example: "{a,b}{},c}"
Bash expands this to a{},c} b{},c}.
Our expansion is a} ac b} bc.
I suspect the following:
Bash scans the expression and expands {a,b}. Since {} is not a valid brace expansion, it is left untouched, and since the remaining ,c} lacks an opening brace, it is treated as part of the postscript too. Hence Bash expands {a,b} with the postscript {},c}.
We interpret {},c} as a brace expansion.
Interestingly, {},c} on its own (without the leading {a,b}) yields the same result in both systems.
Can anyone confirm that?
After skimming the source code, I think Bash works the way I described in my last comment:
- It reads the preamble: the part up to an opening brace that has a matching closing brace.
- It reads until the matching closing brace.
- It looks at the text between the braces (the amble): if it contains a ',', expand it with "normal" brace expansion; if not, try sequence expansion.
- If neither succeeds, treat the amble as normal text and proceed with the string following it (if there is any).
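To make the four steps easier to check against examples, here is a minimal sketch of them in Python. It is only a reading of the list above, not Bash's code and not this library's parser: it ignores quoting and backslash escapes, handles integer sequences only, and all function names (find_matching, split_commas, expand_sequence, expand) are made up for illustration.

```python
import re

def find_matching(text, start):
    """Return the index of the '}' that closes the '{' at `start`, or -1."""
    depth = 0
    for i in range(start, len(text)):
        if text[i] == "{":
            depth += 1
        elif text[i] == "}":
            depth -= 1
            if depth == 0:
                return i
    return -1

def split_commas(amble):
    """Split the amble on commas that are not nested inside braces."""
    parts, current, depth = [], "", 0
    for ch in amble:
        if ch == "," and depth == 0:
            parts.append(current)
            current = ""
            continue
        if ch == "{":
            depth += 1
        elif ch == "}" and depth > 0:
            depth -= 1
        current += ch
    parts.append(current)
    return parts

def expand_sequence(amble):
    """Simplified: integer sequences like 1..3 only; None if not a sequence."""
    m = re.fullmatch(r"(-?\d+)\.\.(-?\d+)", amble)
    if m is None:
        return None
    lo, hi = int(m.group(1)), int(m.group(2))
    step = 1 if lo <= hi else -1
    return [str(n) for n in range(lo, hi + step, step)]

def expand(text):
    # Step 1: the preamble ends at the first '{' that has a matching '}'.
    for open_idx, ch in enumerate(text):
        if ch == "{":
            close_idx = find_matching(text, open_idx)
            if close_idx != -1:
                break
    else:
        return [text]  # no brace pair found: everything is plain text

    preamble = text[:open_idx]
    amble = text[open_idx + 1:close_idx]  # Step 2: text between the braces
    rest = text[close_idx + 1:]

    parts = split_commas(amble)
    if len(parts) > 1:
        # Step 3a: "normal" brace expansion of each alternative.
        alternatives = [alt for part in parts for alt in expand(part)]
    else:
        # Step 3b: try sequence expansion.
        alternatives = expand_sequence(amble)
        if alternatives is None:
            # Step 4: keep the braces as plain text (anything nested
            # inside is still expanded) and carry on with the rest.
            alternatives = ["{" + inner + "}" for inner in expand(amble)]

    tails = expand(rest)
    return [preamble + alt + tail for alt in alternatives for tail in tails]
```

With these definitions, expand("{a,b}{},c}") returns ['a{},c}', 'b{},c}'], which matches the Bash output reported above. expand("a{},c}") returns the input unchanged, so these four steps alone say nothing about how a lone {} followed by a comma is treated.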
The interesting part of the source.
That seems right. However, Bash expands a{},c} to a} ac (which we do as well), so there's more to it.
I think this is the rule we don't implement correctly: https://git.savannah.gnu.org/cgit/bash.git/tree/braces.c#n710 To get this to work in parsec, we may need to keep state so that we can look back at the previous character.
Actually, maybe not: in {a,b}{},c} there is no whitespace preceding the {}.
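For discussion, here is a small Python sketch of how the linked rule might read as a predicate. It is reconstructed from my understanding of that condition and should be checked against braces.c before anyone relies on it. The assumptions are that an opening brace is skipped when it is preceded by whitespace or sits at the very start of the string being scanned, and is followed by whitespace, the end of the string, or a closing brace. The function names are hypothetical.

```python
def is_brace_whitespace(ch):
    # Assumption: the end of the string (represented as "") counts as
    # whitespace, alongside space, tab and newline.
    return ch in ("", " ", "\t", "\n")

def open_brace_is_ignored(text, i):
    """True if the '{' at index i would be skipped under the assumed rule."""
    if text[i] != "{":
        return False
    # Assumption: start of string is treated like preceding whitespace.
    prev_ok = i == 0 or is_brace_whitespace(text[i - 1])
    nxt = text[i + 1:i + 2]  # "" when '{' is the last character
    next_ok = is_brace_whitespace(nxt) or nxt == "}"
    return prev_ok and next_ok
```

If the postscript is scanned as a string of its own, the {} in {},c} sits at index 0, so open_brace_is_ignored("{},c}", 0) is True even though no whitespace precedes it, whereas open_brace_is_ignored("a{},c}", 1) is False. That would be consistent with the two results reported in this thread, but whether Bash really applies the check to the postscript separately is an assumption here.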