sys.stdin.read() throws a TypeError when stdin is set to be non-blocking
Bug report
Bug description:
sys.stdin.read() throws a TypeError if stdin has been set to be non-blocking. The code below should just exit without issue. It throws a TypeError if no input is provided.
#!/usr/bin/python3
import sys
import os
os.set_blocking(sys.stdin.fileno(), False)
sys.stdin.read()
CPython versions tested on:
3.11
Operating systems tested on:
Linux
Linked PRs
- gh-121739
Please assign this issue to me
This would be a behaviour change so would only go into 3.13. @benjaminp and @gpshead are listed as experts for IO, so copying them for thoughts.
A
I agree that read() on a non-blocking file object should never raise TypeError. The internal None return means no data, So it should be treated as such. The draft PR causes said None to also be returned by the non-blocking TextIOWrapper.read() instead of raising a TypeError from within the internals. This is more consistent.
People expecting that not to happen at all won't like the TypeErrors at the level of their own code, by putting stdin into non-blocking mode they signed up for that... the error now comes directly from their code and could be handled without a try: except: by checking the return value rather than from within our internals where it seems questionable.
Do note that typeshed says io.TextIOBase.read only ever returns -> str rather than -> str | None... Changing that accurately reflect what can happen in the face of the unusual situation non-blocking IO being used on a text file could have transitive typing consequences. The majority of code in the world should never be forced to check for None from a TextIO read API because it'll always be in blocking mode. But the way blocking mode is entered/exited on underlying files means there isn't much to be done about that from a type declaration point of view. typeshed could continue to ignore None there in the return annotation for practical reasons if it chooses.
If people want non blocking IO I really wish they'd avoid layering things like text mode wrappers on top of it, or go full-on async for such things.
As for why .read() shouldn't just return "" instead of None... An empty string return typically means "end of file". But we don't actually state if that holds true or not for TextIOBase in https://docs.python.org/3/library/io.html#io.TextIOBase.read, other higher level wrappers that may be related do have a non-blocking carve out for an empty return though: https://docs.python.org/3/library/io.html#io.BufferedReader.read in particular says "Read and return size bytes, or if size is not given or negative, until EOF or if the read call would block in non-blocking mode."
So having it return an empty string could remain consistent return type wise and within a reasonable set of expectations for something in non-blocking mode.
Per my note on https://discuss.python.org/t/handling-sys-stdin-read-in-non-blocking-mode/59633/6, lets go with raising a BlockingIOError. Not returning None.
I’ve created a new pull request with the required changes to TextIOWrapper, as well as a new test. The next step is to document these changes. I’ve looked into the documentation to find the best place to add this information, but my search hasn’t been conclusive.
I would appreciate some guidance on where to document this change, as I’m not entirely sure of the best approach. If someone else is available to handle the documentation, that would also be perfectly fine.
For documentation, I think a note in the I/O docs TextIO section and BufferedIO would help people understand the behavior better from just the docs without having to go lookup the Python 3 I/O PEP. Something around that passing or changing a file descriptor which to be non-blocking is unsupported, and that generally the standard library will try to raise a NonBlockingIO error if non-blocking I/O is encountered. It might also make sense to add a similar note specifically around read
That documentation is generated from the file: https://github.com/python/cpython/blob/main/Doc/library/io.rst
Thanks for the suggestion! I've added notes to the I/O docs to clarify how BlockingIOError can pop up when working with non-blocking streams. The new notes should help users understand this behavior without needing to dive into the Python 3 I/O PEP.
I hope these changes hit the mark and make things clearer for everyone!
Just dropping a friendly ping here. This issue has a pull request that is in good shape and has undergone intensive review. Could a maintainer kindly take a look for final review and, if everything looks good, consider merging it?
Merged. Thank you for your perserverance, @giosiragusa!