Parsing a missing CRLF at the end of a chunk in a malformed chunked encoding response
Describe the bug
A missing CRLF at the end of a chunk may happen for example when the parsed chunk size is wrong.
Parsing the end of a chunk is done by HttpPayloadParser.feed_data(), see this part of the code. When CRLF is missing:
- The HttpPayloadParser state remains
ChunkState.PARSE_CHUNKED_CHUNK_EOFuntil there are no more chunks to parse since the next chunks processed by the following calls to feed_data() are prefixed withself._chunk_tailso that this code parses always the same start of sequence of bytes (which does not start with CRLF). - All the next chunks are appended to
self._chunk_tail. - If the chunked body is large enough, such as an audio stream for example, the stream seems to hang.
To Reproduce
In a well formed chunked encoded HTTP response, replace the CRLF at the end of a chunk with two bytes that are not CRLF.
Expected behavior
A fix could be:
- Change also the parser state to ChunkState.PARSE_CHUNKED_SIZE when the CRLF is missing.
- Alternatively raise an exception.
Logs/tracebacks
-
Python Version
-
aiohttp Version
commit d3dc087b8e9aa665d47045550e5d9f2eddf8f512 (HEAD -> master, origin/master, origin/HEAD)
Date: Wed Jan 22 05:53:25 2025 -0800
multidict Version
-
propcache Version
-
yarl Version
-
OS
Related component
Server
Additional context
This issue is the result of a code review.
Code of Conduct
- [x] I agree to follow the aio-libs Code of Conduct
A fix could be the following patch:
diff --git a/aiohttp/http_parser.py b/aiohttp/http_parser.py
index d44f4cb1..8d7049ed 100644
--- a/aiohttp/http_parser.py
+++ b/aiohttp/http_parser.py
@@ -880,6 +880,12 @@ class HttpPayloadParser:
chunk = chunk[len(SEP) :]
self._chunk = ChunkState.PARSE_CHUNKED_SIZE
else:
+ if chunk and chunk != SEP[:1]:
+ exc = BadHttpMessage("Missing CRLF at chunk end")
+ set_exception(self.payload, exc)
+ raise exc
+ # This is an empty chunk or the chunk is equal to CR.
+ # Get the CRLF or the missing LF in the next chunk.
self._chunk_tail = chunk
return False, b""
That fix may be OK. If you'd like to open a PR, please create a regression test first in test_http_parser.py (making sure to use the response: HttpResponseParser fixture).
I will give it a try.