Handle responses larger than 4096 bytes
When running the proxy I encountered situations where connections would stall without an error. I tracked this down to the proxy not handling the case where a message spanned multiple receive buffers (i.e., the response was larger than 4096 bytes). The proxy lost synchronization with the message boundaries and therefore never detected the ReadyForQuery message.
This change accounts for such cases.
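For context, here's a minimal sketch of the buffering this requires, assuming the standard PostgreSQL framing of a 1-byte message type followed by a 4-byte big-endian length that counts itself; readMessages and the handle callback are illustrative names, not the actual crunchy-proxy code:

```go
package proxy

import (
	"encoding/binary"
	"io"
	"net"
)

// readMessages accumulates bytes across reads until at least one complete
// message is buffered, then dispatches each complete message via handle.
// Leftover bytes are carried into the next read, so a message spanning
// several 4096-byte reads is reassembled instead of losing the frame boundary.
func readMessages(conn net.Conn, handle func(msg []byte)) error {
	var pending []byte
	buf := make([]byte, 4096)
	for {
		n, err := conn.Read(buf)
		if n > 0 {
			pending = append(pending, buf[:n]...)
		}
		// Drain every complete message currently buffered. The length
		// field counts itself but not the leading type byte, hence +1.
		for len(pending) >= 5 {
			msgLen := int(binary.BigEndian.Uint32(pending[1:5])) + 1
			if len(pending) < msgLen {
				break // rest of the message arrives in a later read
			}
			handle(pending[:msgLen])
			pending = pending[msgLen:]
		}
		if err == io.EOF {
			return nil
		}
		if err != nil {
			return err
		}
	}
}
```

With the leftover bytes carried between reads, a ReadyForQuery whose header starts in one buffer and whose body ends in the next is still seen at the correct offset.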
Something that should be looked at is a timeout for the backend receive. Currently a message misalignment permanently locks a connection; a timeout would eventually have terminated and freed it.
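A minimal sketch of what that could look like using Go's connection deadlines; receiveWithTimeout and the duration parameter are placeholders, not code from this change:

```go
package proxy

import (
	"net"
	"time"
)

// receiveWithTimeout reads from the backend under a deadline so that a
// misaligned stream eventually errors out instead of blocking forever.
func receiveWithTimeout(conn net.Conn, buf []byte, d time.Duration) (int, error) {
	if err := conn.SetReadDeadline(time.Now().Add(d)); err != nil {
		return 0, err
	}
	n, err := conn.Read(buf)
	if ne, ok := err.(net.Error); ok && ne.Timeout() {
		conn.Close() // tear down the stuck connection so it can be replaced
	}
	return n, err
}
```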
Finally got around to this again and wrote a test case for it. The test surfaced another edge case, which I've addressed in e1cd86e:
Sometimes the buffer boundary falls not only in the middle of a message body, but in the middle of the length field of the message header. This caused a panic in the proxy:
pgbench_1 | panic: runtime error: slice bounds out of range [:5] with capacity 4
pgbench_1 |
pgbench_1 | goroutine 13 [running]:
pgbench_1 | github.com/crunchydata/crunchy-proxy/protocol.GetMessageLength(0xc000247ffc, 0x4, 0x4, 0x70)
pgbench_1 | /Users/kbbanman/kdev/crunchy-proxy/protocol/protocol.go:86 +0x100
pgbench_1 | github.com/crunchydata/crunchy-proxy/proxy.(*Proxy).HandleConnection(0xc000134930, 0xbde380, 0xc00000e060)
pgbench_1 | /Users/kbbanman/kdev/crunchy-proxy/proxy/proxy.go:292 +0x7ff
pgbench_1 | created by github.com/crunchydata/crunchy-proxy/server.(*ProxyServer).Serve
pgbench_1 | /Users/kbbanman/kdev/crunchy-proxy/server/proxy.go:60 +0x250
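For illustration, a rough sketch of the guard this edge case needs; nextMessage and pending are hypothetical names, and I'm assuming GetMessageLength slices the four length bytes out of the buffer, which is what the [:5] with capacity 4 panic suggests:

```go
package proxy

import "encoding/binary"

const headerLen = 5 // 1-byte message type + 4-byte big-endian length

// nextMessage returns one complete message plus the remaining bytes, or
// ok=false when even the 5-byte header hasn't fully arrived -- the case
// where slicing the length bytes out of a 4-byte fragment would panic.
func nextMessage(pending []byte) (msg, rest []byte, ok bool) {
	if len(pending) < headerLen {
		return nil, pending, false // header split across reads; keep the fragment
	}
	msgLen := int(binary.BigEndian.Uint32(pending[1:headerLen])) + 1
	if len(pending) < msgLen {
		return nil, pending, false // body still incomplete
	}
	return pending[:msgLen], pending[msgLen:], true
}
```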
So, I've been giving this one some thought. I feel the right course of action is to just eliminate the 4096 limit in connect.Receive (and elsewhere). That way we don't have to worry about offsets or keep track of whether a message is complete.
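To make that concrete, a rough sketch of the shape this could take, again assuming the 1-byte type + 4-byte length framing; receiveMessage is an illustrative stand-in for connect.Receive, not the actual code on the branch:

```go
package proxy

import (
	"encoding/binary"
	"io"
	"net"
)

// receiveMessage reads exactly one complete message, however large: first
// the fixed 5-byte header, then exactly the remaining body bytes. With
// io.ReadFull doing the blocking, there are no offsets or partial-message
// bookkeeping to maintain and no fixed buffer size to outgrow.
func receiveMessage(conn net.Conn) ([]byte, error) {
	header := make([]byte, 5)
	if _, err := io.ReadFull(conn, header); err != nil {
		return nil, err
	}
	bodyLen := int(binary.BigEndian.Uint32(header[1:5])) - 4 // length counts itself
	msg := make([]byte, 5+bodyLen)
	copy(msg, header)
	if _, err := io.ReadFull(conn, msg[5:]); err != nil {
		return nil, err
	}
	return msg, nil
}
```

Since io.ReadFull blocks until the requested byte count arrives, each returned message is complete regardless of size.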
These proposed changes can be found on this branch: https://github.com/CrunchyData/crunchy-proxy/tree/large-response-fix
I've tested on my end with some rather large data sets and it seems to work as expected. I'll likely want to do more testing before merging to master, but let me know if this resolves your issue.
That solution seems cleaner and passes my tests, so 👍