
find-dupes.awk: Trouble with certain non-alphanumeric characters

Open cfiske opened this issue 2 years ago • 0 comments

Hi, great tool but I've run into issues with a few filenames. I am running it on Debian Linux. My version of awk is mawk 1.3.4 20200120.

Rather than using a one-liner, I run it in two steps. First:

ls -lR --full-time . > example.output

Then:

awk -f find-dupes.awk example.output

So far the cases I've discovered where it breaks are:

  1. Parentheses () or brackets [] with multiple space-separated words inside. Example line:
-rwx-w----. 13 backup backup 12742 2020-03-16 22:56:07.488254000 +0000 (My filename).blah
Error:
awk: run time error: regular expression compile failed (missing ')')
(My
        FILENAME="example.output" FNR=3 NR=3

By contrast, (Myfile).blah does not error.

  2. Filename starts with +. Example line:
-rwx-w----. 13 backup backup 8192 2020-03-16 22:56:07.488254000 +0000 +Filename.txt
Error:
awk: run time error: regular expression compile failed (missing operand)
+Filename.txt
        FILENAME="example.output" FNR=3 NR=3

Using My+Filename.txt does not error.

Without knowing much about awk, my uneducated guess is that these characters need to be escaped on input somehow?
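
For anyone picking this up: both messages say a regular expression failed to compile, and the text printed under each one ("(My" and "+Filename.txt") looks like a whitespace-delimited piece of the filename. That would fit a field being used as a dynamic regex somewhere in the script, although I have not confirmed this against find-dupes.awk itself. It would also explain the two cases that work: "(Myfile).blah" has balanced parentheses, and in "My+Filename.txt" the + has something to repeat, so both are valid patterns. The sketch below is only a guess at the mechanism. The function names regex_match and literal_match are hypothetical, and it contrasts a dynamic regex match with awk's index() function, which searches for a literal substring.

```shell
#!/bin/sh
# Hypothetical reproduction; none of this is taken from find-dupes.awk.

# Uses the string as a dynamic regex: awk compiles "name" as a pattern.
regex_match() {
    printf '%s\n' "$2" | awk -v name="$1" '$0 ~ name { print "regex match" }'
}

# Uses index(), which looks for a literal substring and compiles no regex.
literal_match() {
    printf '%s\n' "$2" | awk -v name="$1" 'index($0, name) { print "literal match" }'
}

line='-rwx-w----. 13 backup backup 12742 2020-03-16 22:56:07.488254000 +0000 (My filename).blah'

# "(My" is the first piece of the filename after whitespace splitting.
regex_match '(My' "$line" || echo "awk exited with an error"
literal_match '(My' "$line"
```

If the script really does match on a field this way, switching that comparison to index() (or to a plain == test) might be simpler than escaping every metacharacter on input, but that is an assumption until someone checks the source.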

cfiske, Sep 30 '23 21:09