Windows files are written in CRLF newlines
Given a file on Windows with LF newlines, when I run it through apply-refact, it comes out with CRLF newlines. It would be better if apply-refact (or maybe ghc-exactprint?) could preserve whatever type of newline I have, in addition to everything else.
The fix to this should be done in ghc-exactprint, since parseModule and exactPrint don't round-trip wrt LF vs. CRLF.
I wonder if GHC ParseResult contains the newline encoding information, or is it lost? I don't see it in ParseResult.
If that information is lost, perhaps we can post-process the output, depending on whether the first newline in the input file is LF or CRLF?
If that information is lost, perhaps we can post-process the output, depending on whether the first newline in the input file is LF or CRLF?
Mmm, ideally ghc-exactprint should preserve the EOL for each line and honour the previous EOL line if each additional line. Nobody sane would mix both line endings but who knows. Other alternative could be add as a a new parameter all info that may be lost in the actual data, line endings (and encoding?) and apply it to output uniformly, in a new function to keep backwards compatibility. Then apply-refact could decide what EOL is used (the first line f.e.) or ask clients for the same info.
I agree with
Nobody sane would mix both line endings
I don't mind adding a new function with an additional parameter, but I guess users rarely want to specify whether they want LF or CRLF. They just want whatever the input file has. So I'd add it only if someone says they need it.
As to encoding, I don't think GHC can even parse any non-UTF-8 encoded source file, so we can safely assume that all source files are UTF-8.
Oh, you probably meant that the new function takes the already parsed module as input. In that case adding a new parameter is definitely useful (although ideally, that information should be part of Anns).
How about adding it to applyRefactorings'? Can you think of other things like this besides LF vs. CRLF?
Well, taking a look to System.IO there are three axis in the output settings (for completeness):
- binary vs text encoding: subsumed within the other ones, it will be always text encoding for us.
- encoding: we could assume it always is UTF-8 as ghc only supports it.
- eol: agree adding a paremeter as we have commented here seems to be the right solution (at least in
applyRefactoring')