parsec icon indicating copy to clipboard operation
parsec copied to clipboard

Documentation regarding updatePosChar does not match the function's behaviour

Open Pharap opened this issue 4 years ago • 1 comments

The documentation claims that:

If the character is a newline ('\n') or carriage return ('\r') the line number is incremented by 1.

(Specified here.)

But the actual definition does not do any special handling of '\r', thus leaving it to fall back to the default case behaviour of increasing the column number by 1.

https://github.com/haskell/parsec/blob/38dfc545874dd44c26382d8dd692eb533396c6f5/src/Text/Parsec/Pos.hs#L115-L120

I was initially going to submit a PR rectifying the comment (for which I prepared a commit), but I started wondering whether it's the comment that's wrong (i.e. that '\t' isn't supposed to behave that way) or the code itself (i.e. that '\t' should be specially handled and it isn't), so I decided to raise an issue first as a precaution.

Pharap avatar Mar 25 '21 01:03 Pharap

Hello,

To my understanding, it is the behavior that is wrong. In my opinion, the character \t should not be treated as a special case.

In the current implementation, when it encounters the \t character, it increments the column number by 8. The size or “number of spaces” of the tab character depends on the configuration of the application being used to edit or display the input texts, but the tab character always counts as a single character.

Right now, I am using the position obtained during parsing to generate diagnostic error messages. If the line contains a tab character at the beginning, the position of the error message is incorrectly shifted 7 positions to the right.

In the case of the \r character, I don't know if it should be treated as a special case. The current behavior with the \n character seems to be sufficient and work properly.

I don't know if anyone else has this problem. I can prepare a commit with the proposed change.

parraman avatar Dec 28 '24 07:12 parraman