cpython icon indicating copy to clipboard operation
cpython copied to clipboard

sys.stdin.read() throws a TypeError when stdin is set to be non-blocking

Open MartinHHProbst opened this issue 2 years ago • 2 comments

Bug report

Bug description:

sys.stdin.read() throws a TypeError if stdin has been set to be non-blocking. The code below should just exit without issue. It throws a TypeError if no input is provided.

#!/usr/bin/python3

import sys
import os

os.set_blocking(sys.stdin.fileno(), False)
sys.stdin.read()

CPython versions tested on:

3.11

Operating systems tested on:

Linux

Linked PRs

  • gh-121739

MartinHHProbst avatar Sep 17 '23 19:09 MartinHHProbst

Please assign this issue to me

ankur1-samsung avatar Oct 06 '23 10:10 ankur1-samsung

This would be a behaviour change so would only go into 3.13. @benjaminp and @gpshead are listed as experts for IO, so copying them for thoughts.

A

AA-Turner avatar Oct 06 '23 15:10 AA-Turner

I agree that read() on a non-blocking file object should never raise TypeError. The internal None return means no data, So it should be treated as such. The draft PR causes said None to also be returned by the non-blocking TextIOWrapper.read() instead of raising a TypeError from within the internals. This is more consistent.

People expecting that not to happen at all won't like the TypeErrors at the level of their own code, by putting stdin into non-blocking mode they signed up for that... the error now comes directly from their code and could be handled without a try: except: by checking the return value rather than from within our internals where it seems questionable.

Do note that typeshed says io.TextIOBase.read only ever returns -> str rather than -> str | None... Changing that accurately reflect what can happen in the face of the unusual situation non-blocking IO being used on a text file could have transitive typing consequences. The majority of code in the world should never be forced to check for None from a TextIO read API because it'll always be in blocking mode. But the way blocking mode is entered/exited on underlying files means there isn't much to be done about that from a type declaration point of view. typeshed could continue to ignore None there in the return annotation for practical reasons if it chooses.

If people want non blocking IO I really wish they'd avoid layering things like text mode wrappers on top of it, or go full-on async for such things.

gpshead avatar Jul 31 '24 18:07 gpshead

As for why .read() shouldn't just return "" instead of None... An empty string return typically means "end of file". But we don't actually state if that holds true or not for TextIOBase in https://docs.python.org/3/library/io.html#io.TextIOBase.read, other higher level wrappers that may be related do have a non-blocking carve out for an empty return though: https://docs.python.org/3/library/io.html#io.BufferedReader.read in particular says "Read and return size bytes, or if size is not given or negative, until EOF or if the read call would block in non-blocking mode."

So having it return an empty string could remain consistent return type wise and within a reasonable set of expectations for something in non-blocking mode.

gpshead avatar Jul 31 '24 19:07 gpshead

Per my note on https://discuss.python.org/t/handling-sys-stdin-read-in-non-blocking-mode/59633/6, lets go with raising a BlockingIOError. Not returning None.

gpshead avatar Aug 09 '24 21:08 gpshead

I’ve created a new pull request with the required changes to TextIOWrapper, as well as a new test. The next step is to document these changes. I’ve looked into the documentation to find the best place to add this information, but my search hasn’t been conclusive.

I would appreciate some guidance on where to document this change, as I’m not entirely sure of the best approach. If someone else is available to handle the documentation, that would also be perfectly fine.

giosiragusa avatar Aug 12 '24 10:08 giosiragusa

For documentation, I think a note in the I/O docs TextIO section and BufferedIO would help people understand the behavior better from just the docs without having to go lookup the Python 3 I/O PEP. Something around that passing or changing a file descriptor which to be non-blocking is unsupported, and that generally the standard library will try to raise a NonBlockingIO error if non-blocking I/O is encountered. It might also make sense to add a similar note specifically around read

That documentation is generated from the file: https://github.com/python/cpython/blob/main/Doc/library/io.rst

cmaloney avatar Aug 14 '24 18:08 cmaloney

Thanks for the suggestion! I've added notes to the I/O docs to clarify how BlockingIOError can pop up when working with non-blocking streams. The new notes should help users understand this behavior without needing to dive into the Python 3 I/O PEP.

I hope these changes hit the mark and make things clearer for everyone!

giosiragusa avatar Aug 15 '24 20:08 giosiragusa

Just dropping a friendly ping here. This issue has a pull request that is in good shape and has undergone intensive review. Could a maintainer kindly take a look for final review and, if everything looks good, consider merging it?

giosiragusa avatar Nov 27 '24 09:11 giosiragusa

Merged. Thank you for your perserverance, @giosiragusa!

encukou avatar Dec 02 '24 13:12 encukou