trafficserver icon indicating copy to clipboard operation
trafficserver copied to clipboard

Do some core intercept plugins leak resources on early session termination?

Open ywkaras opened this issue 3 years ago • 4 comments

Specifically these:

acme healthchecks ts_lua_http_intercept stats_over_http

They seem to only do cleanup of allocated resources when they get the TS_EVENT_VCONN_WRITE_COMPLETE event. Does this event occur, even if the session drops before the transaction completes?

ywkaras avatar Jun 08 '22 17:06 ywkaras

@cmcfarlen is going to take a look at this

bryancall avatar Jun 13 '22 23:06 bryancall

Looking at the code, it definitely looks like a memory leak is possible if the TS_EVENT_VCONN_WRITE_COMPLETE is not received. I tried to cause this condition with various versions of netcat and then ended up writing a little program that just closes the socket after sending the request. Try as I might, the plugin always saw the TS_EVENT_VCONN_WRITE_COMPLETE event.

Looking at a packet capture, the writes still happen after the FIN is sent:

2	0.000148	127.0.0.1	127.0.0.1	TCP	68	8080 → 64978 [SYN, ACK] Seq=0 Ack=1 Win=65535 Len=0 MSS=16344 WS=64 TSval=3383315298 TSecr=2143886768 SACK_PERM=1
3	0.000166	127.0.0.1	127.0.0.1	TCP	56	64978 → 8080 [ACK] Seq=1 Ack=1 Win=408256 Len=0 TSval=2143886768 TSecr=3383315298
4	0.000177	127.0.0.1	127.0.0.1	TCP	56	[TCP Window Update] 8080 → 64978 [ACK] Seq=1 Ack=1 Win=408256 Len=0 TSval=3383315298 TSecr=2143886768
5	0.000210	127.0.0.1	127.0.0.1	HTTP	94	GET /_stats HTTP/1.1 
6	0.000218	127.0.0.1	127.0.0.1	TCP	56	64978 → 8080 [FIN, ACK] Seq=39 Ack=1 Win=408256 Len=0 TSval=2143886768 TSecr=3383315298
7	0.000240	127.0.0.1	127.0.0.1	TCP	56	8080 → 64978 [ACK] Seq=1 Ack=39 Win=408256 Len=0 TSval=3383315298 TSecr=2143886768
8	0.000247	127.0.0.1	127.0.0.1	TCP	56	8080 → 64978 [ACK] Seq=1 Ack=40 Win=408256 Len=0 TSval=3383315298 TSecr=2143886768
9	0.008849	127.0.0.1	127.0.0.1	TCP	16388	8080 → 64978 [ACK] Seq=1 Ack=40 Win=408256 Len=16332 TSval=3383315307 TSecr=2143886768 [TCP segment of a reassembled PDU]
10	0.008852	127.0.0.1	127.0.0.1	TCP	12728	8080 → 64978 [PSH, ACK] Seq=16333 Ack=40 Win=408256 Len=12672 TSval=3383315307 TSecr=2143886768 [TCP segment of a reassembled PDU]
11	0.008864	127.0.0.1	127.0.0.1	HTTP	8921	HTTP/1.1 200 OK  (text/json)
12	0.008906	127.0.0.1	127.0.0.1	TCP	44	64978 → 8080 [RST] Seq=40 Win=0 Len=0
13	0.008912	127.0.0.1	127.0.0.1	TCP	44	64978 → 8080 [RST] Seq=40 Win=0 Len=0
14	0.008916	127.0.0.1	127.0.0.1	TCP	44	64978 → 8080 [RST] Seq=40 Win=0 Len=0

Finally, I observed that the WRITE_COMPLETE event occurs before any outgoing packets hit the wire. So this event is handled after the internal buffer is filled up, not after the bytes are done sending.

I still think it might be possible to cause leaks to occur if other errors happen or perhaps if a send buffer become full, but doesn't happen if the client sends a valid request and then closes the socket.

If you can think of another way to cause the cleanup to be missed, I can try that out as well.

cmcfarlen avatar Jun 23 '22 22:06 cmcfarlen

Maybe it's only a problem if the plugin intercepts requests with content, (POSTs, PUSHes)? I would guess TS waits to receive the complete request header before calling any hooked plugin continuations.

I wonder that happens if some plugin does a reenable with TS_HTTP_EVENT_ERROR on some hook, is the WRITE_COMPLETE event still generated?

ywkaras avatar Jun 27 '22 15:06 ywkaras

ts_lua_http_intercept is a server intercept, so maybe the problem there would be if server sends no or incomplete data.

ywkaras avatar Jun 27 '22 15:06 ywkaras

This issue has been automatically marked as stale because it has not had recent activity. Marking it stale to flag it for further consideration by the community.

github-actions[bot] avatar Jun 28 '23 02:06 github-actions[bot]