aiohttp icon indicating copy to clipboard operation
aiohttp copied to clipboard

Parsing a missing CRLF at the end of a chunk in a malformed chunked encoding response

Open xdegaye opened this issue 1 year ago • 3 comments

Describe the bug

A missing CRLF at the end of a chunk may happen for example when the parsed chunk size is wrong.

Parsing the end of a chunk is done by HttpPayloadParser.feed_data(), see this part of the code. When CRLF is missing:

  • The HttpPayloadParser state remains ChunkState.PARSE_CHUNKED_CHUNK_EOF until there are no more chunks to parse since the next chunks processed by the following calls to feed_data() are prefixed with self._chunk_tail so that this code parses always the same start of sequence of bytes (which does not start with CRLF).
  • All the next chunks are appended to self._chunk_tail.
  • If the chunked body is large enough, such as an audio stream for example, the stream seems to hang.

To Reproduce

In a well formed chunked encoded HTTP response, replace the CRLF at the end of a chunk with two bytes that are not CRLF.

Expected behavior

A fix could be:

  • Change also the parser state to ChunkState.PARSE_CHUNKED_SIZE when the CRLF is missing.
  • Alternatively raise an exception.

Logs/tracebacks

-

Python Version

-

aiohttp Version

commit d3dc087b8e9aa665d47045550e5d9f2eddf8f512 (HEAD -> master, origin/master, origin/HEAD)
Date:   Wed Jan 22 05:53:25 2025 -0800

multidict Version

-

propcache Version

-

yarl Version

-

OS

Related component

Server

Additional context

This issue is the result of a code review.

Code of Conduct

  • [x] I agree to follow the aio-libs Code of Conduct

xdegaye avatar Jan 23 '25 13:01 xdegaye

A fix could be the following patch:

diff --git a/aiohttp/http_parser.py b/aiohttp/http_parser.py
index d44f4cb1..8d7049ed 100644
--- a/aiohttp/http_parser.py
+++ b/aiohttp/http_parser.py
@@ -880,6 +880,12 @@ class HttpPayloadParser:
                         chunk = chunk[len(SEP) :]
                         self._chunk = ChunkState.PARSE_CHUNKED_SIZE
                     else:
+                        if chunk and chunk != SEP[:1]:
+                            exc = BadHttpMessage("Missing CRLF at chunk end")
+                            set_exception(self.payload, exc)
+                            raise exc
+                        # This is an empty chunk or the chunk is equal to CR.
+                        # Get the CRLF or the missing LF in the next chunk.
                         self._chunk_tail = chunk
                         return False, b""

xdegaye avatar Jan 23 '25 15:01 xdegaye

That fix may be OK. If you'd like to open a PR, please create a regression test first in test_http_parser.py (making sure to use the response: HttpResponseParser fixture).

Dreamsorcerer avatar Jan 23 '25 16:01 Dreamsorcerer

I will give it a try.

xdegaye avatar Jan 24 '25 19:01 xdegaye