[TW-1669] annotations: escape the escape character
Daniel Shahaf on 2015-08-25T07:51:23Z says:
In 2.4.5 (2331810), it is not possible to add the literal string \n (two bytes) to an annotation using task annotate:
% task 1 annotate 'Lorem \n ipsum \\n dolor \\\n sit
amet'
% task _get 1.annotations.1.description
Lorem
ipsum
dolor \
sit
amet
This is a regression from 2.4.4, where the output was:
Lorem
ipsum \n dolor \
sit
amet
The latter is what I would expect.
Migrated metadata:
Created: 2015-08-25T07:51:23Z
Modified: 2017-01-16T17:59:07Z
I'd like to look into this one, seems interesting.
What I have found out so far:
- The problem seems to be that the input is processed too often, i.e. "\\n" becomes "\n" but is then processed again into a newline.
- This happens in Lexer::readWord
- Specifically in the block starting with // an escaped thing
- Lexer::readWord has four definitions, two of which reside in Taskwarrior's Lexer.cpp, the other two in the shared library.
- The behavior is correct if the escaped-thing block is commented out in all implementations
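The "processed too often" hypothesis can be imitated in the shell. This is only a sketch of the effect, using the %b conversion of printf as a stand-in for one pass of escape handling; it is not Taskwarrior's Lexer code, and unescape_once is a hypothetical helper name.

```shell
#!/bin/sh
# unescape_once: hypothetical helper, NOT part of Taskwarrior.
# printf '%b' applies one pass of backslash processing:
#   \\ becomes a single backslash, \n becomes a newline.
unescape_once() {
    printf '%b' "$1"
}

# Single quotes: the shell hands over the bytes untouched (two backslashes + n).
input='ipsum \\n dolor'

# One pass leaves a literal backslash followed by n.
once=$(unescape_once "$input")

# A second pass over the same data turns that leftover \n into a real newline.
twice=$(unescape_once "$once")

printf 'one pass:   %s\n' "$once"
printf 'two passes: %s\n' "$twice"
```

One pass gives the result reported for 2.4.4 (a literal \n), while two passes give the result reported for 2.4.5 (a line break).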
This raises some questions:
- Why are there four implementations of the same function, none of which calls any of the others, even though they contain a lot of similar code?
- Why is the input to the annotate command processed so often?
edit: Markdown gobbled one of the backslashes in the first bullet point. Escape characters are tricky ;)
After running a small test program through gdb a few times, I found out that the shell already parses "\\n" to "\n" when reading in argv. This can easily be checked by running echo "\\n" and echo "\n" and observing that they print the same characters. Therefore, both "\" and "\\" are indistinguishable to the C++ code. This explains why "\\n" is interpreted as a newline: the escaped-thing code block only ever sees one backslash.
Sadly, I don't see a way for Taskwarrior to fix this issue, as we can't control the user's shell or its escape character handling.
This leaves people who want to annotate their tasks with "\n" with the option of using task 42 annotate "a \\\nice annotation".
the shell already parses "\\n" to "\n" when reading in argv ... Therefore, both "\" and "\\" are indistinguishable to the C++ code.
Not following how the shell's treatment of something relates to the C++, but to the shell, '\\n' will be quite different from '\n', whereas use of double quotes will add an extra escape pass that makes them the same. Also note that you could use $'' to encode newlines directly, e.g.:
$ echo $'this\ntest'
this
test
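The difference between single and double quotes can be checked the same way. The sketch below prints each argument byte-for-byte with printf '%s', as a rough stand-in for what a program receives in argv; show_arg is a hypothetical helper name, and the behavior shown assumes a POSIX-style shell.

```shell
#!/bin/sh
# show_arg: hypothetical helper that prints its first argument verbatim,
# approximating what a program would find in argv[1].
show_arg() {
    printf '%s\n' "$1"
}

# Single quotes: no shell processing, both backslashes survive.
single_two=$(show_arg '\\n')

# Double quotes: the shell collapses \\ into one backslash.
double_two=$(show_arg "\\n")

# Double quotes: \n is left alone, since n is not special after a backslash.
double_one=$(show_arg "\n")

printf 'single quotes, two backslashes: %s\n' "$single_two"
printf 'double quotes, two backslashes: %s\n' "$double_two"
printf 'double quotes, one backslash:   %s\n' "$double_one"
```

Only the single-quoted form delivers two backslashes to the program; the two double-quoted forms arrive identical.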
Well the C++ can't really distinguish between \ and \\ if they are already escaped by the shell, can it? Your point with the difference between single and double quotes is an important one that I have not taken into account though. Thank you @smemsh for broadening my shell knowledge.