Only send pong if OPEN.
If the server sends a message and then closes the connection, it's possible for the client to fail to recv() the message due to an attempt to send a pong message during the CLOSING state. The following is printed:
ERROR:websockets.client:parser failed
Traceback (most recent call last):
File "/tmp/throwaway-txYF/lib/python3.10/site-packages/websockets/protocol.py", line 550, in parse
self.recv_frame(frame)
File "/tmp/throwaway-txYF/lib/python3.10/site-packages/websockets/protocol.py", line 634, in recv_frame
self.send_frame(pong_frame)
File "/tmp/throwaway-txYF/lib/python3.10/site-packages/websockets/protocol.py", line 693, in send_frame
raise InvalidState(
websockets.exceptions.InvalidState: cannot write to a WebSocket in the CLOSING state
I found that modifying websockets/protocol.py to only send a pong message during the OPEN state allowed me to .recv() the message in this case.
Thank you for reporting this bug!
The legacy implementation includes this logic, suggesting that this is the right fix in the Sans-I/O layer. I'd like to double check and add a test before merging.
If you're able to reproduce easily, would you be able to provide debug logs by any chance? That would help me understand exactly the scenario in which this happens. In short, you can enable debug logs with:
import logging
logging.basicConfig(format="%(asctime)s %(message)s", level=logging.DEBUG,
So I've realised that this is just an exception being logged by websockets within the context of recv() - the caller still receives the expected message.
I can replicate this reliably in the REPL, relevant logs as follows:
>>> r = n.record(stop_after_seconds=1)
2024-01-11 11:26:40,673 = connection is CONNECTING
2024-01-11 11:26:40,673 > GET /ws/recording HTTP/1.1
2024-01-11 11:26:40,673 > Host: localhost:1024
2024-01-11 11:26:40,673 > Upgrade: websocket
2024-01-11 11:26:40,673 > Connection: Upgrade
2024-01-11 11:26:40,673 > Sec-WebSocket-Key: sZdHgvpcSEDx5Xb1TrkZhA==
2024-01-11 11:26:40,673 > Sec-WebSocket-Version: 13
2024-01-11 11:26:40,673 > Sec-WebSocket-Extensions: permessage-deflate; client_max_window_bits
2024-01-11 11:26:40,673 > User-Agent: Python/3.10 websockets/12.0
2024-01-11 11:26:40,674 < HTTP/1.1 101 Switching Protocols
2024-01-11 11:26:40,674 < Server: TornadoServer/6.4
2024-01-11 11:26:40,674 < Date: Thu, 11 Jan 2024 00:26:40 GMT
2024-01-11 11:26:40,674 < Upgrade: websocket
2024-01-11 11:26:40,674 < Connection: Upgrade
2024-01-11 11:26:40,674 < Sec-Websocket-Accept: uIQSgzo459s7j6C8XsEhagwnPgU=
2024-01-11 11:26:40,674 = connection is OPEN
2024-01-11 11:26:40,674 > TEXT '{"action": "start", "stop_after_seconds": 1}' [44 bytes]
2024-01-11 11:26:40,674 Waiting for status: started
2024-01-11 11:26:40,992 < TEXT '{"file": {"name": "2024-01-11.00-26-40.676136.r...9, "status": "started"}' [186 bytes]
2024-01-11 11:26:40,992 Recieved recording message: {"file": {"name": "2024-01-11.00-26-40.676136.recording.h5", "location": "/tmp", "path": "/tmp/2024-01-11.00-26-40.676136.recording.h5"}, "start_timestamp": 7154849, "status": "started"}
2024-01-11 11:26:40,992 Status is: started
>>> 2024-01-11 11:26:41,992 < TEXT '{"status": "stopped"}' [21 bytes]
>>>
>>> r.wait_until_stopped()
2024-01-11 11:27:18,364 Waiting for status: stopped
2024-01-11 11:27:18,364 Recieved recording message: {"status": "stopped"}
2024-01-11 11:27:18,364 Status is: stopped
2024-01-11 11:27:18,364 > CLOSE 1000 (OK) [2 bytes]
2024-01-11 11:27:18,365 = connection is CLOSING
2024-01-11 11:27:18,365 < PING '' [0 bytes]
2024-01-11 11:27:18,365 parser failed
Traceback (most recent call last):
File "/tmp/throwaway-oOZF/lib/python3.10/site-packages/websockets/protocol.py", line 550, in parse
self.recv_frame(frame)
File "/tmp/throwaway-oOZF/lib/python3.10/site-packages/websockets/protocol.py", line 634, in recv_frame
self.send_frame(pong_frame)
File "/tmp/throwaway-oOZF/lib/python3.10/site-packages/websockets/protocol.py", line 693, in send_frame
raise InvalidState(
websockets.exceptions.InvalidState: cannot write to a WebSocket in the CLOSING state
2024-01-11 11:27:18,365 < EOF
2024-01-11 11:27:18,365 > EOF
2024-01-11 11:27:18,365 = connection is CLOSED
The logic sequence is as follows:
- Client opens connection to server and sends 'start' request, starts waiting for 'started' response
- Server receives start request and responses with started
- Client receives 'started' and continues to other tasks
- 1 second later, server sends 'stopped' and closes the connection
- Some time after that, client starts waiting for 'stopped' response
- Client receives and processes 'stopped' response - in the context of recv(), an exception is logged.
EDIT: formatting EDIT2: clarity
I discovered that I wasn't closing the websocket server-side after sending the 'stopped' message - I was just dereferencing it. Fixing that seemed to fix the client behaviour. That said, I still think the client shouldn't attempt to send a pong message if it is not in the OPEN state.
Thank you for the details - that's very useful!
The commit message for b2a95c45fae19fff0a3473158dd02afe2ca42604 suggests that the current behavior of the Sans-I/O layer is intentional. Mmm.
For clarity, the problem happens in the following scenario:
-
Protocol.send_close()sends a Close frame and changes the connection state is CLOSING -
Protocol.receive_data()receives data containing a Ping frame — this is a race condition between us sending the Close and the other side sending the Ping-
Protocol.parsecallsProtocol.recv_frame(frame), whereframeis the Ping frame -
Protocol.recv_framecallsProtocol.send_frame(pong_frame)to answer the ping -
Protocol.send_framecrashes because the connection is CLOSING
-
This can happen either on the client side or the server side.
Above I described the scenario where the connection state is CLOSING because we initiated the closing handshake.
The connection could also be closing because the other side initiated the closing handshake. However, in that case, websockets will never read a Ping frame after a Close frame — because it stops reading entirely after a Close frame:
https://github.com/python-websockets/websockets/blob/5209b2a1cba00b28b8f62502157d5dbb98625a49/src/websockets/protocol.py#L677-L681
RFC 6455 says:
Upon receipt of a Ping frame, an endpoint MUST send a Pong frame in response, unless it already received a Close frame.
Note that it doesn't say "... unless it already sent a Close frame".
Also, it says:
The application MUST NOT send any more data frames after sending a Close frame.
This implies that it's OK to send more control frames, such as Pong frames.
(Since Ping/Pong are primarily a keepalive / heartbeat mechanism, it doesn't seem very useful to send a Pong after a Close. In fact, on the other side of the connection, websockets will ignore the Pong in this scenario — as mentioned in the previous comment it stops reading data frames and control frames after a Close frame. But, hey, it's legal.)
Also, there's comments in the current test suite pointing out that the current behavior doesn't match the spec.
All in all, I'm leaning towards following the spec more closely, meaning allowing sending ping and pong after close.
Superseded by #1435.
In an earlier comment, you said:
I can replicate this reliably in the REPL
If you could double-check that #1435 fixes the problem for you, that would be most useful :-)
Sorry for delay - I just confirmed that your fix works for my test case. Thanks! Any feel for when it might make it into a release?