
Robust handling of whitespace-malformed chunk encoding headers.

Open lovelydinosaur opened this issue 4 years ago • 1 comments

Prompted by https://github.com/encode/httpx/discussions/1735

When a server uses chunked encoding, but erroneously includes whitespace between the <chunk size marker> and the <crlf>, a RemoteProtocolError is raised...

>>> import h11
>>> state = h11.Connection(our_role=h11.CLIENT)
>>> state.send(h11.Request(method=b'GET', target=b'/', headers=[(b'Host', b'example.com')]))
b'GET / HTTP/1.1\r\nHost: example.com\r\n\r\n'
>>> state.send(h11.EndOfMessage())
b''
>>> state.receive_data(b'HTTP/1.1 200 OK\r\n')
>>> state.receive_data(b'Transfer-Encoding: chunked\r\n')
>>> state.receive_data(b'\r\n')
>>> state.receive_data(b'123  \r\n')
>>> state.receive_data(b'x' * 123)
>>> state.receive_data(b'\r\n0\r\n\r\n')
>>> state.next_event()
Response(status_code=200, headers=<Headers([(b'transfer-encoding', b'chunked')])>, http_version=b'1.1', reason=b'OK')
>>> state.next_event()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "venv/lib/python3.8/site-packages/h11/_connection.py", line 443, in next_event
    exc._reraise_as_remote_protocol_error()
  File "venv/lib/python3.8/site-packages/h11/_util.py", line 76, in _reraise_as_remote_protocol_error
    raise self
  File "venv/lib/python3.8/site-packages/h11/_connection.py", line 425, in next_event
    event = self._extract_next_receive_event()
  File "/venv/lib/python3.8/site-packages/h11/_connection.py", line 367, in _extract_next_receive_event
    event = self._reader(self._receive_buffer)
  File "venv/lib/python3.8/site-packages/h11/_readers.py", line 157, in __call__
    matches = validate(
  File "venv/lib/python3.8/site-packages/h11/_util.py", line 88, in validate
    raise LocalProtocolError(msg)
h11._util.RemoteProtocolError: illegal chunk header: bytearray(b'123  \r\n')

No doubt this is technically correct, but we probably want to be robust to this kind of malformed input, given that we've seen it occur as a user-issue in httpx.

lovelydinosaur avatar Jul 06 '21 12:07 lovelydinosaur

isn't maintaining an http client fun? anyway yeah, fair enough. probably just needs a tweak to the chunk_header regex in h11._abnf, if you want to throw together a PR.

njsmith avatar Jul 06 '21 13:07 njsmith
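
For illustration, the kind of regex tweak suggested above can be sketched with the standard library alone. The two patterns and the `parse_chunk_size` helper below are hypothetical simplifications written for this example; they are not the actual definitions in `h11._abnf`, and the real fix may differ in detail.

```python
import re

# Hypothetical, simplified patterns (NOT the actual h11._abnf definitions).
# Strict: 1-20 hex digits, an optional chunk extension, then CRLF.
STRICT_CHUNK_HEADER = re.compile(
    rb"(?P<chunk_size>[0-9A-Fa-f]{1,20})(?P<chunk_ext>;[^\r\n]*)?\r\n"
)
# Lenient: same as above, but tolerate spaces or tabs just before the CRLF.
LENIENT_CHUNK_HEADER = re.compile(
    rb"(?P<chunk_size>[0-9A-Fa-f]{1,20})(?P<chunk_ext>;[^\r\n]*)?[ \t]*\r\n"
)


def parse_chunk_size(line: bytes, pattern: re.Pattern = LENIENT_CHUNK_HEADER) -> int:
    """Return the chunk size encoded in a chunk header line, or raise ValueError."""
    match = pattern.fullmatch(line)
    if match is None:
        raise ValueError(f"illegal chunk header: {line!r}")
    # Chunk sizes are hexadecimal, so b"123" means 0x123 bytes.
    return int(match.group("chunk_size"), 16)


# The header from the report is rejected by the strict pattern...
assert STRICT_CHUNK_HEADER.fullmatch(b"123  \r\n") is None
# ...and accepted by the lenient one.
size = parse_chunk_size(b"123  \r\n")
```

Because chunk sizes are hexadecimal, the header `123` in the report announces 0x123 bytes rather than 123 decimal; the reproduction still triggers the error because parsing fails at the header line, before any chunk data is read.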

Fixed by 26ec787d44aacbff8fbc0fc1af7e3213dd993d46

pgjones avatar Aug 24 '22 18:08 pgjones

After failing to find this issue in my initial search and then spending a few hours dealing with the same problem, I wrote up an issue and went to link to the line where this was fixed, only to find that a fix was already there. Any chance there will be a new release of h11 soon? It looks like there were a bunch of updates around Aug 24/25 that might justify one.

At least now that I found this I feel a little less bad about monkey patching the regex.

eseglem avatar Sep 25 '22 02:09 eseglem

Agreed, 0.14.0 has been released.

pgjones avatar Sep 25 '22 15:09 pgjones