Brace expansion edge case

Open knrafto opened this issue 9 years ago • 5 comments

QuickCheck found this failure in a CI job:

Tests
  Properties
    brace expansion:     FAIL (0.08s)
      *** Failed! Assertion failed (after 56 tests): 
      "\\}\\,,{y}{\\{\\ }\\}{}\\}s\\ p\\{\\},\\,lhhiq\\,qv}{}\\},}\\}\\ \\,{\\}\\ \\{a\\ \\,\\,z\\,t"
      Use --quickcheck-replay '55 TFGenR 0000000309618D3500000000000F4240000000000000E1390000000056F692C0 0 72057594037927935 56 0' to reproduce.

knrafto, Sep 26 '16 04:09

Found another one, in this job: https://travis-ci.org/pbiggar/language-bash/jobs/161995727

 brace expansion:     FAIL (0.10s)
      *** Failed! Assertion failed (after 69 tests): 
      "a,\\}}{a\\{,w\\}}{}l}{{xaky\\{\\ \\,\\,tuqmg}}z\\{,}asik{ng\\,"
      Use --quickcheck-replay '68 TFGenR 1BCE489E03A1DF8763BEBE7128281A981FBC02700490D72F1B17CB17703ECB89 0 31 5 0' to reproduce.

pbiggar, Sep 26 '16 19:09

From a failed test case I worked out a minimal example: "{a,b}{},c}". Bash expands this to a{},c} b{},c}, whereas our expansion is a} ac b} bc.

I suspect the following: Bash scans the expression and expands {a,b}. Since {} is not a valid brace expansion, it is left untouched, and since the remaining ,c} lacks an opening brace, it is treated as part of the postscript too. Hence Bash expands {a,b} with the postscript {},c}, whereas we interpret {},c} as a brace expansion. Interestingly, {},c} on its own (without the {a,b}) yields the same result in both systems.

Can anyone confirm that?

mmhat, Dec 04 '19 00:12

After skimming the source code, I think Bash works as I described in my last comment:

  1. It reads the preamble: the text up to the first opening brace that has a matching closing brace.
  2. It reads the amble: the text from that opening brace up to its matching closing brace.
  3. It inspects the amble: if it contains a ',', it applies "normal" brace expansion; otherwise it tries sequence expansion.
  4. If neither succeeds, it treats the amble as normal text and proceeds with the string that follows it (if there is any).
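
To make the four steps concrete, here is a minimal Python sketch of them. It is only an illustration of the steps as listed, not a port of braces.c: the names expand, find_matching and split_top_level are made up for this example, escaping and quoting are ignored entirely, and sequence expansion is assumed to cover only plain integer ranges such as {1..3}.

```python
import re


def find_matching(text, start):
    """Return the index of the '}' matching the '{' at text[start], or -1."""
    depth = 0
    for i in range(start, len(text)):
        if text[i] == "{":
            depth += 1
        elif text[i] == "}":
            depth -= 1
            if depth == 0:
                return i
    return -1


def split_top_level(amble):
    """Split the amble on commas that are not nested inside inner braces."""
    parts, current, depth = [], [], 0
    for ch in amble:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
        if ch == "," and depth == 0:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    parts.append("".join(current))
    return parts


def expand(text):
    # Step 1: the preamble ends at the first '{' that has a matching '}'.
    search_from = 0
    while True:
        open_idx = text.find("{", search_from)
        if open_idx == -1:
            return [text]  # no usable opening brace: plain text
        close_idx = find_matching(text, open_idx)
        if close_idx != -1:
            break
        search_from = open_idx + 1

    # Step 2: the amble is the text between the matched braces.
    preamble = text[:open_idx]
    amble = text[open_idx + 1:close_idx]
    rest = text[close_idx + 1:]

    # Step 3: comma list first, otherwise (simplified) sequence expansion.
    parts = split_top_level(amble)
    seq = re.fullmatch(r"(-?\d+)\.\.(-?\d+)", amble)
    if len(parts) > 1:
        alternatives = [alt for part in parts for alt in expand(part)]
    elif seq:
        lo, hi = int(seq.group(1)), int(seq.group(2))
        step = 1 if lo <= hi else -1
        alternatives = [str(n) for n in range(lo, hi + step, step)]
    else:
        # Step 4: neither applies, so keep the braces and amble as text.
        alternatives = ["{" + amble + "}"]

    # Continue with whatever follows the closing brace.
    tails = expand(rest) if rest else [""]
    return [preamble + alt + tail for alt in alternatives for tail in tails]


print(expand("{a,b}{},c}"))
```

On the minimal example this sketch returns a{},c} and b{},c}, the same words reported for Bash above. Since it leaves out everything else braces.c does, agreement on this one input should not be read as agreement in general.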

The interesting part of the source.

mmhat, Dec 04 '19 01:12

That seems right. However, Bash expands a{},c} to a} ac (which we do as well), so there's more to it.

I think this is the rule we don't implement correctly: https://git.savannah.gnu.org/cgit/bash.git/tree/braces.c#n710 To get this to work in Parsec, we may need to keep state so that we can look back at the last character.
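
As a rough illustration of what "looking back at the last character" could involve, here is a small Python sketch. It assumes the rule is roughly: an opening brace is not treated as the start of an expansion when it sits at the start of the input or after whitespace and is immediately followed by whitespace or a closing brace. That wording is my paraphrase and an assumption, not a quote from braces.c, and the function name skipped_open_braces is made up for this example. Escaping is ignored.

```python
WHITESPACE = " \t\n"


def skipped_open_braces(text):
    """Indices of '{' that the assumed look-back rule would not treat as openers."""
    skipped = []
    prev = None  # None marks the start of the input
    for i, ch in enumerate(text):
        nxt = text[i + 1] if i + 1 < len(text) else None
        # Assumed condition on the character before the brace.
        before_ok = prev is None or prev in WHITESPACE
        # Assumed condition on the character after the brace.
        after_ok = nxt is not None and (nxt in WHITESPACE or nxt == "}")
        if ch == "{" and before_ok and after_ok:
            skipped.append(i)
        prev = ch  # the single piece of state carried along the scan
    return skipped


for word in ["{},c}", "a{},c}", "{a,b}{},c}"]:
    print(word, skipped_open_braces(word))
```

The only state the scan needs is the previous character, which is the part a Parsec parser would have to carry along explicitly.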

knrafto, Dec 14 '19 18:12

Actually, maybe not: in {a,b}{},c} there's no whitespace preceding the {}.

knrafto, Dec 14 '19 18:12