cpython icon indicating copy to clipboard operation
cpython copied to clipboard

Asyncio BaseEventLoop can support socket types other than SOCK_STREAM

Open 1f78a49d-7e22-48b3-9972-2f81c03ac80c opened this issue 6 years ago • 21 comments

BPO 38285
Nosy @asvetlov, @1st1, @malversan, @callumquick

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

Show more details

GitHub fields:

assignee = None
closed_at = None
created_at = <Date 2019-09-26.12:05:11.927>
labels = ['type-feature', '3.10', 'expert-asyncio']
title = 'Asyncio BaseEventLoop can support socket types other than SOCK_STREAM'
updated_at = <Date 2021-01-05.19:30:16.915>
user = 'https://github.com/malversan'

bugs.python.org fields:

activity = <Date 2021-01-05.19:30:16.915>
actor = 'malversan'
assignee = 'none'
closed = False
closed_date = None
closer = None
components = ['asyncio']
creation = <Date 2019-09-26.12:05:11.927>
creator = 'malversan'
dependencies = []
files = []
hgrepos = []
issue_num = 38285
keywords = []
message_count = 20.0
messages = ['353296', '353297', '353298', '353300', '353316', '353320', '353322', '355945', '355948', '355949', '355954', '355958', '355960', '355962', '355969', '355979', '356023', '356035', '384422', '384427']
nosy_count = 5.0
nosy_names = ['asvetlov', 'yselivanov', 'malversan', 'callumquick', 'jbeeler']
pr_nums = []
priority = 'normal'
resolution = None
stage = None
status = 'open'
superseder = None
type = 'enhancement'
url = 'https://bugs.python.org/issue38285'
versions = ['Python 3.10']

Currently the BaseEventLoop class in asyncio has explicit checks to raise ValueError when creating a connection if the socket argument has a type other than SOCK_STREAM: .create_connection() .create_server()

This is also applicable for class _UnixSelectorEventLoop: .create_unix_connection() .create_unix_server()

But the fact is that it actually supports other socket types, like SOCK_SEQPACKET for example.

Currently you can test this by dirty-hacking the socket class "type" property to momentarily trick the event loop into thinking that any socket is of SOCK_STREAM type.

<code> # First create an AF_UNIX, SOCK_SEQPACKET socket. sock = socket.socket(socket.AddressFamily.AF_UNIX, socket.SocketKind.SOCK_SEQ_PACKET) sock.connect(path)

params = { "sock" : sock, "protocol_factory" : lambda: protocol }

# Now do the trick.
hack = (params["sock"].type != socket.SocketKind.SOCK_STREAM)

if hack:
    # Substitute class property getter with fixed value getter.
    socket_property = socket.socket.type
    socket.socket.type = property(lambda self: socket.SocketKind.SOCK_STREAM, None, None,)

# Use the socket normally to create connection and run the event loop.
loop = asyncio.new_event_loop()
coroutine = loop.create_unix_connection(**params)    # It also works with .create_connection()
transport, protocol = loop.run_until_complete(coroutine)

# Revert the trick.
if hack:
    # Restore class property getter.
    socket.socket.type = socket_property
</code>

As dirty as it looks, this works flawlessy. It just tricks the event loop .create_connection() call to bypass the explicit check of using a SOCK_STREAM socket. This done, THE EVENT LOOP SUPPORTS SOCK_SEQPACKET PERFECTLY.

This is the solution I'm currently using to communicate an application with a local daemon, but I would really prefer to have the SOCK_SEQPACKET support allowed into the event loop itself. Having in mind that it simply works with other socket types, I find that limiting the use of the event loop with an explicit SOCK_STREAM-only check is somehow artificial and unrealistic.

Thanks in advance for your attention.

I think if we want to support SOCK_SEQPACKET by asyncio we should do it explicitly.

In the other case, we can open a can of worms: we cannot guarantee that all existing protocols are supported by asyncio seamlessly.

Anyway, this is a new feature request. It can land on Python 3.9 only.

asvetlov avatar Sep 26 '19 12:09 asvetlov

Certainly I have only tested it with SOCK_SEQPACKET, but apparently no one has ever tested this before with a socket type other than SOCK_STREAM. It may be worth to consider the possibility that the current asyncio implementation may also support some other SocketKind sockets:

  • SOCK_SEQPACKET (tested)
  • SOCK_DGRAM
  • SOCK_RAW
  • SOCK_RDM

I agree this is an enhancement to incorporate in future releases. I do not expect previous versions to be patched.

We can implement these using the following procedure:

  1. Use only one socket type per pull request 2. Add support for, e.g. SOCK_SEQPACKET to asyncio code
  2. Add test(s) that checks that SOCK_SEQPACKET works fine (./Lib/test/test_asyncio folder, perhaps test_events.py)
  3. Merge the PR, repeat bullet 1) for the next socket type.

Would you work on pull requests?

asvetlov avatar Sep 26 '19 12:09 asvetlov

In the past it took me two days to analyze asyncio code, to think up and integrate the hack I´m using for this. But I´m not kidding when I tell you that it took me two years to find a while to come here and properly report it. I'm sorry, but I never have time to dedicate to other projects (I wish I could).

If you have no time for contribution -- that's fine, CPython is the Open Source project driven by volunteers.

The only caveat is that the issue may wait for years before we find a champion to pick it up.

For example, this particular problem is on the very end of my personal todo list because I don't use SOCK_SEQPACKET on my job (yet).

asvetlov avatar Sep 26 '19 15:09 asvetlov

I'm sorry to read that. I thought the report could be enough to reach whoever put that SOCK_STREAM-only checks and ask him why, when the library actually works well also with other socket types.

If I ever find enough time to dive into the CPython repository I will come back here, but given my work load I would not count on it. Anyway, as long as the issue remais opened I'm confident this will be eventually fixed.

Hi Andrew, I'm a new contributor, but this sounds like a pretty cool enhancement.

Would you be able to elaborate on what kind of things might be required to support each new socket type and test them in particular so I can see if I can take it on?

For each type, we need at least a test that creates a socket pair and successfully transfers data through the wire.

I don't know what additional things are required. For example, on reading about SOCK_SEQ_PACKET I've found that recvmsg() is highly recommended over recv() to get messages boundaries. It obviously requires the new transport type and the new async protocol specification.

Doesn't look trivial.

asvetlov avatar Nov 04 '19 15:11 asvetlov

A matter of creating tests to allow test enabling of new socket types I could attempt, but new protocol/transport types may be beyond me.

It has a certain logic to recommend recvmsg() in place of recv(), as SOCK_SEQ_PACKET is characterized by transmitting entire messages only. But it has to be noted that my current hack (described above) is working for SOCK_SEQ_PACKET sockets with no modification of the asyncio underlying reading logic.

How to get the message boundary without recvmsg()? Sorry, I'm not familiar with seqpacket.

asvetlov avatar Nov 04 '19 16:11 asvetlov

Another question: does SSL/TLS make sense for seqpacket?

asvetlov avatar Nov 04 '19 16:11 asvetlov

I do not have the answer about getting message boundaries at lower levels, but from a high-level point of view SOCK_SEQ_PACKET gives atomic reads, with no need to check for message boundaries yourself. Every time you read from a SOCK_SEQ_PACKET socket you get an entire message. That is the main difference with SOCK_STREAM, as far as I know.

Can recv() get two messages at once? What is the behavior if the buffer size passed into recv() is smaller than the message length?

asvetlov avatar Nov 04 '19 18:11 asvetlov

In my scenario that buffer overrun never happens, maybe because I use messages that are not big enough to overflow the default recv() buffer size.

But I think I can confirm that multiple messages are never received in an atomic read, even if they are being issued intensively in short millisecond intervals. Even more, I think there is a recvmmsg() call specific for that purpose if you want to receive multiple reads at once.

As I said I do not have the answers, I rely on the high-level definitions and have little knowledge about how it works at low level.

But I think your question may be extended also to recvmsg(). What is its behaviour if it fills all the passed iovec structs?

Probably an answer can be found where you found the recommendation of using recvmsg() over recv(). There should be a reason for that recommendation.

My point is: without a deep understanding we cannot "just enable" a new protocol. The evidence that it works in some limited scenarios is not enough for opening the can of worms. It is true for seqpacket, and especially true for other even not discussed protocols in general.

asvetlov avatar Nov 05 '19 11:11 asvetlov

I agree. Your question about potential message size overflow should be tested (either for recv() and recvmsg()).

Could you please link the resource where you found the recommendation of using recvmsg() over recv() for SOCK_SEQ_PACKET?

For what it's worth, a library I use currently hacks in this functionality by accessing the private method _create_connection_transport directly. This is done to allow a SOCK_RAW to be passed (which is itself required to then enable asyncio to be used for handling BLE).

This works, but as it requires the use of a private method, it's not ideal.

That means the core code also works for SOCK_RAW sockets. It's only limited by explicit socket type checks at a higher level.

As a curious note (not related to the issue), I'm also using the SOCK_SEQPACKET connection created with BaseEventLoop to access a custom daemon related to BLE functionality.

Currently asyncio only supports SOCK_STREAM. Adding others would require more changes than just relax the check, tests should be added, it needs to be verified on multiple platforms, it needs to be reliable etc. It also adds maintenance burden. This is a big project and would require a lot of work.

kumaraditya303 avatar Oct 14 '22 07:10 kumaraditya303

Closing as it has little use case and will add a lot of additional maintenance burden.

kumaraditya303 avatar May 17 '23 08:05 kumaraditya303