How to read and write OsPaths without interpreting them?
I'm trying to read and write lists of OsPaths (actually just PosixPaths in case that matters) to files. I want to avoid doing any conversion or interpretation if possible---just treat the paths as opaque bytestrings separated by \NUL.
I see that I could use encodeFS and decodeFS, but 1) that's incompatible with Attoparsec (annoyingly, the Parser monad isn't a transformer), 2) it forces IO into a lot of otherwise pure code, and 3) the extra round-trip seems more likely to introduce encoding bugs than prevent them.
I'm about to try breaking into the hidden modules and using the raw constructors. But is there a more recommended way to read/write PosixPaths?
One idea that comes to mind is adding a Binary/Bytable instance? I haven't looked into that before. But a trivial instance that just wraps/unwraps the constructor seems like it would be equivalent to exposing the constructor itself.
Edit: also, thanks for taking on this OsPath thing! I'm not well versed in low level encodings and am glad someone is working on it. I would offer to help to the extent I can without breaking anything. I'm working on Arbitrary instances to check that my code can round-trip trees of OsPaths to folders on disk. Maybe a version of those could end up in the library and help identify bugs?
Of course after posting this, I finally noticed you can access the raw constructors in the OsString package! Is that what I should be doing?
From what I understand, you want to write filepaths to a file on disk?
Indeed I would avoid decodeFS. How to access the raw bytes in a cross platform manner is described here: https://hasufell.github.io/posts/2022-06-29-fixing-haskell-filepaths.html#accessing-the-raw-bytes-in-a-cross-platform-manner
> I haven't looked into that before. But a trivial instance that just wraps/unwraps the constructor seems like it would be equivalent to exposing the constructor itself.
The problem is that on Windows we are dealing with a wide-char array ([Word16]), as opposed to a char array on Unix ([Word8]). So for OsPath you'd still need to encode the platform information somehow (maybe as a magic bit?). Binary instances for PosixPath and WindowsPath, on the other hand, are indeed trivial. So if you're just dealing with PosixPath, you can unwrap the underlying ShortByteString and turn it into a ByteString.
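A minimal sketch of that unwrapping for the PosixPath-only case, assuming the internal module System.OsString.Internal.Types is available (it ships with filepath >= 1.4.100 or with the os-string package, depending on the version). The names encodePaths and decodePaths are hypothetical helpers, not part of any library. Each path is written as its raw bytes followed by a single NUL byte, with no decoding step.

```haskell
import qualified Data.ByteString as BS
import qualified Data.ByteString.Short as SBS
import System.OsString.Internal.Types (PosixString (..))

-- PosixPath is a type synonym for PosixString, so these helpers
-- apply to PosixPath values as well.

-- Hypothetical helper: serialise paths as raw bytes, each one
-- terminated by a NUL byte (0). No encoding is interpreted.
encodePaths :: [PosixString] -> BS.ByteString
encodePaths = BS.concat . map encodeOne
  where
    encodeOne p = SBS.fromShort (getPosixString p) `BS.snoc` 0

-- Hypothetical helper: split on NUL and rewrap each chunk.
-- Empty chunks (including the one after the final NUL) are dropped;
-- this assumes empty paths never need to be stored.
decodePaths :: BS.ByteString -> [PosixString]
decodePaths =
  map (PosixString . SBS.toShort) . filter (not . BS.null) . BS.split 0
```

Both functions are pure, so no IO is forced into the calling code. A NUL byte cannot occur inside a POSIX path component, which is why it works as a separator here; the sketch does not check for it.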
Wrt attoparsec, also see https://github.com/haskell/attoparsec/issues/225
My idea was to provide a way to convert to Data.Bytes.Bytes (which is a sliceable type) and then use that for efficient parsing. But we still have the problem that on Windows we are dealing with wide char arrays.
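Until something like that exists, a PosixPath-only parser can be written against a plain ByteString. The following is only a sketch under that assumption: posixPathP and posixPathsP are hypothetical names, the internal module System.OsString.Internal.Types is assumed to be available, and each path is copied into a fresh ShortByteString rather than sliced, so it does not give the zero-copy behaviour discussed above.

```haskell
import qualified Data.Attoparsec.ByteString as A
import qualified Data.ByteString.Short as SBS
import System.OsString.Internal.Types (PosixString (..))

-- Hypothetical parser: one non-empty run of non-NUL bytes,
-- followed by the NUL terminator. The bytes are not decoded.
posixPathP :: A.Parser PosixString
posixPathP =
  PosixString . SBS.toShort <$> A.takeWhile1 (/= 0) <* A.word8 0

-- Hypothetical parser: zero or more NUL-terminated paths,
-- consuming the whole input.
posixPathsP :: A.Parser [PosixString]
posixPathsP = A.many' posixPathP <* A.endOfInput
```

Running it with A.parseOnly posixPathsP keeps the parsing pure. Input whose last path is missing its NUL terminator is rejected rather than silently accepted.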
> Of course after posting this, I finally noticed you can access the raw constructors in the OsString package! Is that what I should be doing?
Yes
Related: https://github.com/haskell/filepath/issues/161