cpython gh-121313: Limit the reading size from pipes to their default buffer size on Unix systems

Issue: https://github.com/python/cpython/issues/121313

Issue: gh-121313

Jul 03 '24 09:07 aplaikner

All commit authors signed the Contributor License Agreement.

Jul 03 '24 09:07 ghost

Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool.

If this change has little impact on Python users, wait for a maintainer to apply the skip news label instead.

Jul 03 '24 09:07 bedevere-app[bot]

Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool.

If this change has little impact on Python users, wait for a maintainer to apply the skip news label instead.

Jul 03 '24 09:07 bedevere-app[bot]

Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool.

If this change has little impact on Python users, wait for a maintainer to apply the skip news label instead.

Jul 03 '24 10:07 bedevere-app[bot]

os.read() / _os_read_impl is used for reading from most kinds of files in Python. Definitely the limited size makes sense for pipes, but disk I/O generally wants "as big a read as possible". For instance reading regular files, such as python source code, one read call with a buffer that can fit the whole file is fastest in my experimenting. For both that case and the pipe case, it would be more efficient to figure out "whats the max read size" once (with the system calls that entails potentially) and re-use that for every subsequent read call

Following your chain of pieces, could this be made to be more targeted to the specific case potentially? Two thoughts

This is specifically caused by Lib/multiprocessing/connection.py, can that specify explicitly the size of read it wants?
Rather than checking / adjusting the size for every read, could that be done just when the pipe is opened/created? So on open, check type, and stash the "max read size". Compare against that (The code currently checks against _PY_READ_MAX constant, this would just be saying max read size is file type dependent, which is true on both Windows and Linux)

See also: gh-117151 which is aiming to increase the default size (albeit focused around write performance)

Jul 04 '24 22:07 cmaloney

I've tried shifting the check to Lib/multiprocessing/connection.py and it seems promising, yielding the same performance improvements as having the checks in the C code. The change to os_read_impl would be reverted and the following patch applied to Lib/multiprocessing/connection.py:

diff --git a/Lib/multiprocessing/connection.py b/Lib/multiprocessing/connection.py
index b7e1e13217..4797ca4df8 100644
--- a/Lib/multiprocessing/connection.py
+++ b/Lib/multiprocessing/connection.py
@@ -18,6 +18,7 @@
 import time
 import tempfile
 import itertools
+import stat
 
 
 from . import util
@@ -391,8 +392,17 @@ def _recv(self, size, read=_read):
         buf = io.BytesIO()
         handle = self._handle
         remaining = size
+        is_pipe = False
+        page_size = 0
+        if not _winapi:
+            page_size = os.sysconf(os.sysconf_names['SC_PAGESIZE'])
+            if size > 16 * page_size:
+                mode = os.fstat(handle).st_mode
+                is_pipe = stat.S_ISFIFO(mode)
+        limit = 16 * page_size if is_pipe else remaining
         while remaining > 0:
-            chunk = read(handle, remaining)
+            to_read = min(limit, remaining)
+            chunk = read(handle, to_read)
             n = len(chunk)
             if n == 0:
                 if remaining == size:

Jul 05 '24 07:07 aplaikner

I think as far as I can review / needs a python core dev / someone with more project familiarity to look for high level things.

Some lingering thoughts I have:

Would it make more sense to use fcntl F_GETPIPE_SZ rather than caluclate? I hadn't known about that until reading through the pipe man page linked.
How does this work for non-linux systems? Particularly FreeBSD and Apple systems that are Python supported (https://peps.python.org/pep-0011/#support-tiers). I'm not familiar with pipes on those platforms at all currently.

Jul 07 '24 18:07 cmaloney

When using fcntl, an additional system call per _recv would be necessary. The main issue is that the code must be executed inside the _recv function because fcntl requires the pipe's file descriptor. To avoid errors, a check to determine if the system is Windows would be needed before executing fcntl. This could be done with a boolean set inside the if _winapi check. Additionally, there should be a check to verify if the file descriptor belongs to a pipe before attempting to fetch the pipe size. This results in two checks before obtaining the pipe size. To optimize performance, these checks could be wrapped in another condition to verify if the read size is smaller than the default pipe size, skipping that code. Otherwise at least the fstat system call would be executed. However, this would again lead to a hardcoded value. Using fcntl would provide a more dynamic approach, it would come at the cost of reduced performance due to the additional system calls and other checks, reducing performance. I think the current solution covers most cases, where the default pipe size is used. If someone changes that value, they would also need to change the new constant to see some performance benefits.
I'm also not familiar with pipes on those systems, but it seems that FreeBSD and MacOS have both a default pipe buffer size of 64KiB: https://www.netmeister.org/blog/ipcbufs.html

Jul 07 '24 20:07 aplaikner

Hi @cmaloney, I wanted to check in and see if there are any additional steps I need to take for this pull request before it can be reviewed by a core developer.

Thank you!

Jul 29 '24 12:07 aplaikner

Re: Core Review, as far as I know no other steps needed. From https://devguide.python.org/getting-started/pull-request-lifecycle/#reviewing it's mainly just patience, that document suggests a month wait before pinging other locations.

Jul 29 '24 19:07 cmaloney

:robot: New build scheduled with the buildbot fleet by @gpshead for commit 52606d1529006ff849db06be0df5a2303cf620cf :robot:

If you want to schedule another build, you need to add the :hammer: test-with-buildbots label again.

Aug 31 '24 00:08 bedevere-bot

There's one potential further optimization, at least on Linux. fcntl F_GETPIPE_SZ on the fd if it is a pipe should return the actual size. A pipe might have been configured differently than the platform default. Regardless I don't expect that will have been the case within this multiprocessing code. Using that (and F_SETPIPE_SZ) could be a future enhancement (assuming it proves useful).

Aug 31 '24 05:08 gpshead

Thanks for taking this on!

Aug 31 '24 05:08 gpshead

2. I'm also not familiar with pipes on those systems, but it seems that FreeBSD and MacOS have both a default pipe buffer size of 64KiB: https://www.netmeister.org/blog/ipcbufs.html

This PR uses 256KiB, not 64KiB on M1 mac (16K page).

Aug 31 '24 07:08 methane

The Changelog entry was added to C API category, instead of the Library category.

Sep 02 '24 08:09 vstinner

Nice catch. I will change the category in #123559.

Sep 02 '24 10:09 methane