Unexpected window size impact on TLS connections with H2 large uploads
We are using H2 to transfer 16 MB payloads. During our benchmarking, we noticed a significant difference in bandwidth when we use streaming frames.
I have put together a small benchmark repo where I am sending that payload 100 times, and I see an approximately 2x difference locally as well as when I run the client and servers separately on EC2 boxes.
This is not a functionality issue, but I am wondering whether there is a guideline on how frames should be used for best performance.
Could you say a little more? HTTP/2 is defined by sending and receiving frames, so any communication will always use them.
Do you mean using smaller frames? Or something else? What are you comparing?
Thanks for the reply, @seanmonstar
Yes, H2 is always being used. The only difference between the two cases is the size of the frames, and I am perplexed that this alone causes such a big difference.
To give you even more context, we are using H2 as our bulk data transmission protocol and benchmarking on the new ENA-enabled EC2 hosts. What we are seeing there is that using large H2 frames caps the overall bandwidth at 2 Gbps, independent of the amount of parallelism. This is in stark contrast to using smaller frames, where we can transmit up to 15 Gbps. The microbenchmark in the sample repo is my attempt to recreate the issue locally.
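For readers trying to reason about what "frame size" means for a 16 MB payload, here is a rough sketch. It assumes that "frame" refers to HTTP/2 DATA frames bounded by SETTINGS_MAX_FRAME_SIZE (default 16,384 bytes, maximum 2^24 - 1, with a 9-byte frame header, per the HTTP/2 spec). The function name `data_frames` is made up for illustration, and the counts are a lower bound: flow control or the body's own chunking may split the payload further.

```rust
/// Fixed HTTP/2 frame header length in bytes.
const FRAME_HEADER_LEN: u64 = 9;
/// Default SETTINGS_MAX_FRAME_SIZE (2^14).
const DEFAULT_MAX_FRAME_SIZE: u64 = 16_384;
/// Largest value SETTINGS_MAX_FRAME_SIZE may take (2^24 - 1).
const LARGEST_MAX_FRAME_SIZE: u64 = 16_777_215;

/// Hypothetical helper: returns (minimum number of DATA frames, total
/// frame-header bytes) needed to carry `payload_len` bytes of body.
fn data_frames(payload_len: u64, max_frame_size: u64) -> (u64, u64) {
    assert!(
        (DEFAULT_MAX_FRAME_SIZE..=LARGEST_MAX_FRAME_SIZE).contains(&max_frame_size),
        "max_frame_size outside the range HTTP/2 allows"
    );
    // Ceiling division: the last frame may be only partially filled.
    let frames = (payload_len + max_frame_size - 1) / max_frame_size;
    (frames, frames * FRAME_HEADER_LEN)
}

fn main() {
    // Assumes "16 MB" means 16 MiB.
    let payload: u64 = 16 * 1024 * 1024;
    for size in [DEFAULT_MAX_FRAME_SIZE, 1 << 20, LARGEST_MAX_FRAME_SIZE] {
        let (frames, overhead) = data_frames(payload, size);
        println!(
            "max_frame_size={}: at least {} DATA frames, {} header bytes",
            size, frames, overhead
        );
    }
}
```

Even at the default frame size, header overhead is only about 9 KB on a 16 MiB payload, so framing bytes by themselves seem unlikely to explain a 2 Gbps versus 15 Gbps gap.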
An update on this: I modified end_to_end.rs with the attached patch file, 0001-modifiying-end-to-end-benches-to-show-tls-impact.patch
After you apply the patch, please run scripts/local_certs.sh and then the command
cargo bench --features="full" --bench end_to_end http2_parallel_x10_req_100kb
On my machine, I got the following results:
test http2_parallel_x10_req_100kb_100_chunks ... bench: 27,550,408.40 ns/iter (+/- 1,516,658.37) = 3716 MB/s
test http2_parallel_x10_req_100kb_100_chunks_adaptive_window ... bench: 25,833,849.90 ns/iter (+/- 1,676,144.25) = 3963 MB/s
test http2_parallel_x10_req_100kb_100_chunks_adaptive_window_tls ... bench: 88,947,912.50 ns/iter (+/- 27,584,394.02) = 1151 MB/s
test http2_parallel_x10_req_100kb_100_chunks_max_window ... bench: 25,593,699.90 ns/iter (+/- 2,652,152.47) = 4000 MB/s
test http2_parallel_x10_req_100kb_100_chunks_max_window_tls ... bench: 64,975,587.50 ns/iter (+/- 25,294,327.91) = 1575 MB/s
test http2_parallel_x10_req_100kb_100_chunks_tls ... bench: 66,979,683.30 ns/iter (+/- 2,965,548.45) = 1528 MB/s
I think this shows the issue clearly: upload bandwidth drops significantly once TLS is introduced, and TLS seems to hurt the adaptive_window and max_window cases the most.
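To make the comparison easier to read, here is a small sketch that recomputes the MB/s figures from the ns/iter values above and prints the TLS slowdown per configuration. The bytes-per-iteration constant is an assumption inferred from the benchmark name (10 parallel requests x 100 chunks x 100 KiB); it is not taken from the patch, although it does reproduce the reported MB/s numbers.

```rust
/// ASSUMPTION: bytes moved per iteration, inferred from the bench name
/// "parallel_x10_req_100kb_100_chunks" = 10 requests x 100 chunks x 100 KiB.
const BYTES_PER_ITER: u64 = 10 * 100 * 100 * 1024;

/// Throughput in MB/s (10^6 bytes per second), truncated to an integer.
fn mb_per_s(bytes: u64, ns_per_iter: f64) -> u64 {
    // bytes / ns * 1e9 gives bytes/s; dividing by 1e6 gives MB/s.
    (bytes as f64 * 1000.0 / ns_per_iter) as u64
}

fn main() {
    // (configuration, plain ns/iter, TLS ns/iter), copied from the results above.
    let runs = [
        ("default", 27_550_408.40, 66_979_683.30),
        ("adaptive_window", 25_833_849.90, 88_947_912.50),
        ("max_window", 25_593_699.90, 64_975_587.50),
    ];
    for (name, plain_ns, tls_ns) in runs {
        println!(
            "{}: {} MB/s plain, {} MB/s TLS, {:.2}x slower with TLS",
            name,
            mb_per_s(BYTES_PER_ITER, plain_ns),
            mb_per_s(BYTES_PER_ITER, tls_ns),
            tls_ns / plain_ns
        );
    }
}
```

By this calculation the TLS runs are roughly 2.4x (default), 3.4x (adaptive_window), and 2.5x (max_window) slower. Note that the reported +/- spread on the adaptive_window_tls and max_window_tls runs is large, so the gap between max_window and the default may be within noise.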