zig icon indicating copy to clipboard operation
zig copied to clipboard

ReadLine & Peek

Open IridescentRose opened this issue 3 years ago • 3 comments

I'd like to introduce a new method readLine and readLineAlloc (suggestions on a name would be appreciated) to the Reader interface. These functions help fulfill the outlined changes suggested in #6754 and #3822 for platform-agnostic line reading. These functions will treat a variety of cases of CR, LF, VT and FF as valid line terminators and consume all that happen at the end of a line.

However... to implement these, the reader must be able to look forward and attempt to read the next character without committing to consuming that data. This is where Reader introduces an optional function on construction peekFn. For the most part, this change allows reading buffer data ahead of time without committing to consuming it. This was necessary for CR+LF and other edge case line termination formats.

In addition, I then created peek equivalents of readByte, readStruct, readInt, readIntBig, readIntLittle ...

If a Reader does not have Peek defined, the peek functions peek peekAll return 0 and peekNoEof and functions that rely on it return error.EndOfStream. If a Reader does not have Peek defined, then the functionsreadLine and readLineAlloc return then upon finding one endline character.

This is just a draft for now, but I'd be willing to discuss and make changes.

I'll be testing the new peek() functions and the readLine() alongside making sure I haven't broken test cases where peek is ignored. This will however force a change in the API for those who instantiate Reader() directly or extend upon it.

Additionally:

  • [x] FIFO Queues now have peekItemNext() to grab the next available item
  • [x] Buffered Readers now have peek functionality
  • [x] GZip streams have peek functionality
  • [ ] ZLib streams have peek functionality
  • [x] File streams have peek functionality
  • [ ] BitReader streams have peek functionality
  • [x] CountingReader streams have peek functionality
  • [x] FixedBufferStream has peek functionality
  • [x] LimitedReader streams have peek functionality
  • [x] StreamSource have peek functionality enabled
  • [x] x.net.tcp has peek functionality #7194
  • [ ] Get Rid of PeekStream #4501
  • [ ] UEFI Peek()
  • [ ] PDB Peek()

Tests:

  • [x] FIFO Queue peekItemNext() test
  • [ ] Buffered Reader peek() test (I think I made an error with this)
  • [ ] GZip peek() test
  • [ ] ZLib peek() test
  • [ ] File peek() test
  • [ ] BitReader peek() test
  • [ ] CountingReader peek() test
  • [ ] FixedBufferStream peek() test
  • [ ] LimitedReader peek() test
  • [ ] StreamSource peek() test
  • [ ] x.net.tcp peek() test
  • [ ] readLine() test
  • [ ] readLineAlloc() test

IridescentRose avatar Nov 04 '22 04:11 IridescentRose

I think this would make more sense in conjunction with std.io.SeekableStream or something adjacent, instead of changing the definition of std.io.Reader.

InKryption avatar Nov 04 '22 04:11 InKryption

Yeah, I too had concerns when doing this, considering changing Reader() ended up providing a lot of trouble with compiling and just fixing things to use the new form. I can reduce the scope of this change, but it wouldn't really fix #6754 outside of SeekableStream because of CRLF vs CR line endings. How do you know that the next character after CR is LF or some other data?

IridescentRose avatar Nov 04 '22 05:11 IridescentRose

One idea would be to have a function that takes both a seekable stream and a reader, which are assumed to be from the same source, and use functions of each in conjunction to peek & read.

InKryption avatar Nov 04 '22 05:11 InKryption

Closing old draft.

andrewrk avatar Dec 12 '22 03:12 andrewrk