Division/sqrt algorithm
Which algorithm does hardfloat use for division and square root?
There are two. DivSqrtRecF64 uses a quadratically convergent iterative algorithm, using a pre-multiplication to get a 10-ish-bit initial approximation. DivSqrtRecFN_small uses a straightforward one-bit-at-a-time algorithm.
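To make "quadratically convergent" concrete, here is a minimal sketch using the Newton-Raphson reciprocal iteration. This is an assumption chosen purely for illustration: the thread does not say which iteration DivSqrtRecF64 actually uses, and the function names below are hypothetical. The point is only that the number of correct bits roughly doubles per step, so a 10-ish-bit starting approximation needs just a few iterations to exceed double-precision accuracy.

```python
from fractions import Fraction
import math


def newton_reciprocal(d: Fraction, x0: Fraction, iterations: int) -> list[Fraction]:
    """Refine x0 toward 1/d with x <- x * (2 - d * x); returns every iterate.

    Illustrative only: not taken from the hardfloat sources.
    """
    xs = [x0]
    for _ in range(iterations):
        x = xs[-1]
        xs.append(x * (2 - d * x))
    return xs


def correct_bits(d: Fraction, x: Fraction) -> float:
    """Approximate number of correct bits, from the relative error |1 - d*x|."""
    err = abs(1 - d * x)
    return math.inf if err == 0 else -math.log2(err)


if __name__ == "__main__":
    d = Fraction(3, 2)
    # Hypothetical starting guess: 1/d rounded to 10 fractional bits.
    x0 = Fraction(round(Fraction(2, 3) * 1024), 1024)
    for step, x in enumerate(newton_reciprocal(d, x0, 3)):
        print(step, correct_bits(d, x))
```

Each step squares the relative error exactly, since 1 - d*x*(2 - d*x) = (1 - d*x)^2, which is where the doubling of correct bits comes from.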
Then how can I choose which one to use? Which performs better? And does DivSqrtRecFN_small use SRT? If so, which radix? Thank you very much for your reply.
It's too dependent on your application for me to give a general answer. The first option has a latency of 25-ish cycles and a throughput of 1/3 operations per cycle, but its area cost is high and the clock rate might be limited. The second option has a latency of S-ish cycles (where S is the width of the significand) and a throughput of 1/S operations per cycle, but its area cost is low and it can run at a higher clock rate.
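As a rough way to compare the two options for a given workload, the figures above can be plugged into a simple cycle-count model. This is a back-of-the-envelope sketch, not a measurement: it assumes a steady stream of independent operations, takes S = 53 for double precision, treats the "-ish" numbers as exact, and ignores the clock-rate difference mentioned above. The function name is hypothetical.

```python
from fractions import Fraction


def cycles_for(n_ops: int, latency: int, throughput: Fraction) -> Fraction:
    """Cycles to finish n_ops back-to-back operations.

    Idealized model: the first result appears after `latency` cycles, and
    each later result follows 1/throughput cycles after the previous one.
    """
    if n_ops < 1:
        raise ValueError("n_ops must be at least 1")
    return latency + (n_ops - 1) / throughput


if __name__ == "__main__":
    S = 53  # assumed significand width for double precision
    for n in (1, 10, 100):
        fast = cycles_for(n, 25, Fraction(1, 3))   # DivSqrtRecF64 figures
        small = cycles_for(n, S, Fraction(1, S))   # DivSqrtRecFN_small figures
        print(n, fast, small)
```

To compare wall-clock time rather than cycles, divide each result by the clock frequency each design actually achieves in your flow; that is where the smaller unit can recover some of the gap.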
DivSqrtRecFN_small doesn't use SRT; it's just one bit at a time.
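For readers unfamiliar with the term, "one bit at a time" division can be sketched as plain restoring long division on integers, producing one quotient bit per iteration. This is a generic textbook illustration under that assumption, not a transcription of DivSqrtRecFN_small, and the thread does not say whether the hardware uses a restoring or non-restoring variant.

```python
def restoring_divide(dividend: int, divisor: int, bits: int) -> tuple[int, int]:
    """Divide two unsigned integers, one quotient bit per loop iteration.

    Returns (quotient, remainder). `bits` is the width of the dividend, so
    the loop runs exactly `bits` times, mirroring the S-ish cycle latency.
    """
    if divisor <= 0:
        raise ValueError("divisor must be positive")
    if not 0 <= dividend < (1 << bits):
        raise ValueError("dividend does not fit in the given width")

    remainder = 0
    quotient = 0
    for i in range(bits - 1, -1, -1):
        # Shift in the next dividend bit, most significant first.
        remainder = (remainder << 1) | ((dividend >> i) & 1)
        quotient <<= 1
        # Trial subtraction: keep it only if the result stays non-negative.
        if remainder >= divisor:
            remainder -= divisor
            quotient |= 1
    return quotient, remainder


if __name__ == "__main__":
    print(restoring_divide(100, 7, 8))
```

An SRT divider, by contrast, selects quotient digits from a redundant digit set, typically retiring more than one bit per step at higher radices.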
Does DivSqrtRecF64 use the Goldschmidt algorithm or Newton-Raphson? @aswaterman And did you consider using the mainstream SRT algorithm?
Hi, I want to generate Verilog for the division unit. Could you please tell me how to generate Verilog from the hardfloat files? Thank you so much.
The original Verilog code is still available here: http://www.jhauser.us/arithmetic/HardFloat.html
FYI, I have a bare configurable SRT implementation in sequencer/arithmetic. I'll find some time to open a PR against hardfloat this year.