Fix and optimize IPv4 checksum calculation
Hi @espeer-enfabrica, thanks for your contribution. Could you please describe "[calculate IPv4 checksum 32-bits at a time]" in more detail? I would appreciate it if you could give an example.
When a 32-bit adder sums the data and the result is folded back into 16 bits by adding its two 16-bit halves, the carries from the 16-bit sums are not lost. They are the implicit carry that happens when bit 16 rolls over into bit 17 of the 32-bit number, and the fold adds them back in. The order of the adds doesn't matter: you can sum each 16-bit word of the input in turn, or you can equivalently sum the even 16-bit words and the odd 16-bit words in parallel, which is what the 32-bit add is doing, and then sum the two results. Whether you add the carries incrementally or all at the end also makes no difference.
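To make the equivalence concrete, here is a minimal Python sketch, not the code from this PR. The function names `fold16`, `sum16`, and `sum32` are made up for illustration, and the sample header is a commonly used 20-byte IPv4 example with its checksum field zeroed. It assumes the input length is a multiple of 4 bytes, which always holds for an IPv4 header.

```python
import struct


def fold16(total: int) -> int:
    """Fold a wide sum into 16 bits, adding the carries back in."""
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def sum16(data: bytes) -> int:
    """Add one big-endian 16-bit word at a time, folding after every add."""
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total = fold16(total + word)
    return total


def sum32(data: bytes) -> int:
    """Add one big-endian 32-bit word at a time, folding only once at the end."""
    total = 0
    for (word,) in struct.iter_unpack("!I", data):
        total += word
    return fold16(total)


# Sample header (hypothetical test input), checksum field set to 0000.
header = bytes.fromhex("450000730000400040110000c0a80001c0a800c7")

s16 = sum16(header)
s32 = sum32(header)
checksum = ~s32 & 0xFFFF  # the checksum is the one's complement of the sum

print(hex(s16), hex(s32), hex(checksum))  # 0x479e 0x479e 0xb861
```

Python integers never overflow, so `sum32` can defer all carries to the final fold. An implementation with a fixed-width accumulator would presumably need a wider accumulator or an add-with-carry to keep the carries out of bit 32.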
As an example, consider the same calculation scaled down: we have a 4-bit adder, and the requirement is to perform 2-bit adds with the carries added back in. Summing the 2-bit words 01, 11, 01 (padded with 00 to fill the second 4-bit word) in binary, we get:
Using 2-bit adds: (01 + 11) + 01 -- 2-bit add --> (00 + carry) + 01 -- 2-bit add plus the carry --> 10 = 2
Using one 4-bit add: 0111 + 0100 -- 4-bit add --> 1011 -- split on 2-bit boundary and fold --> 10 + 11 = 01 + carry = 10 = 2
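The two calculations above can be replayed with a short Python sketch. The helper `fold` and the variable names are hypothetical, chosen only for this illustration.

```python
def fold(total: int, width: int) -> int:
    """Fold a sum into `width` bits, adding the carries back in."""
    mask = (1 << width) - 1
    while total >> width:
        total = (total & mask) + (total >> width)
    return total


# 2-bit adds, with the carry added back in after every step.
narrow = 0
for word in (0b01, 0b11, 0b01):
    narrow = fold(narrow + word, 2)

# One 4-bit add over the same data (0111 and 0100), folded once at the end.
wide = fold(0b0111 + 0b0100, 2)

print(format(narrow, "02b"), format(wide, "02b"))  # 10 10
```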
Thanks for your contribution! Merged.