Add f16 support
This is a feature request to add f16 support.
This can be useful for machine learning, including (I believe) using ndarray with f16 on wasm.
Rust has an f16 primitive (nightly-only), but some crates in the ecosystem use half::f16 -- and the half crate also offers half::bf16.
Taking a look at how ndarray supports f32 and f64, it appears f16 could be supported easily, requiring only minor changes (leaving tests and documentation aside).
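For illustration, a minimal sketch of what this could look like from the user side, assuming ndarray gains arithmetic and ScalarOperand support for half::f16 (the code is hypothetical until then):

```rust
use half::f16;
use ndarray::{array, Array2};

fn main() {
    // Array-and-array arithmetic only needs the element type to implement
    // the operator traits, which half::f16 already does.
    let a: Array2<f16> = array![
        [f16::from_f32(1.0), f16::from_f32(2.0)],
        [f16::from_f32(3.0), f16::from_f32(4.0)]
    ];
    let doubled = &a + &a;

    // Array-and-scalar ops like this one additionally require a
    // ScalarOperand impl for f16 -- the gist of this request.
    let halved = &doubled * f16::from_f32(0.5);
    assert_eq!(halved, a);
}
```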
I'll try to put up a draft PR as a showcase if this gets considered for inclusion.
> Rust has an f16 primitive (nightly-only), but some crates in the ecosystem use half::f16 -- and the half crate also offers half::bf16.
So, what's the plan? In the future, will we remove the `half` crate and use only the `f16` primitive?
I saw your PR. Imo, since the effort of adding it, and thus of maintaining it, is minimal, I'd vote for including f16 support. That being said, I'm curious to hear @akern40's opinion on this matter.
> So, what's the plan? In the future, will we remove the `half` crate and use only the `f16` primitive?
Now that you mention it, I'm unsure how other crates would evolve. I believe that in the long run they would switch to the primitive f16.
It's also possible to add a nightly feature and support the nightly f16 primitive.
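As a rough sketch of what that gating could look like inside ndarray (the feature name is hypothetical, and this only compiles on a nightly toolchain):

```rust
// Crate root: enable the unstable primitive only when a (hypothetical)
// `nightly-f16` cargo feature is requested, so stable builds are unaffected.
#![cfg_attr(feature = "nightly-f16", feature(f16))]

// The primitive would then get the same one-line marker impl as f32/f64:
#[cfg(feature = "nightly-f16")]
impl crate::ScalarOperand for f16 {}
```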
PR-wise, if you're OK with it, I'm happy to add that as well. Before moving on to tests and documentation, I'm first trying out the half::f16 feature in downstream crates.
Edit: I'm not sure about this, but alternatively it might be possible to have a generic implementation relying on num-traits, for types that implement the five operations (Add, Sub, Mul, Div, Rem) -- I think the Complex types could also implement the Rem operation.
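As a rough illustration of that idea (the function below is mine, not ndarray's): num-traits' NumOps bound expands to exactly those five operator traits, so a routine can be written once for any such type:

```rust
use num_traits::NumOps;

// Generic over any element type providing Add, Sub, Mul, Div and Rem,
// which is exactly what the NumOps bound covers.
fn fused<T: NumOps + Copy>(a: T, x: T, y: T) -> T {
    (a * x + y) % a
}

fn main() {
    assert_eq!(fused(2.0f32, 3.0, 1.0), 1.0); // (2 * 3 + 1) % 2 == 1
    assert_eq!(fused(2i32, 3, 1), 1);
}
```

Since NumOps is blanket-implemented for any type providing the five operator traits, half::f16 would satisfy the same bound.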
I'd agree with @nilgoyette - seems easy to maintain. I would caution against thinking that just because Rust has something in nightly, it will come to stable anytime soon 😅
(I'm a little iffy on our whole approach with ScalarOperand, but as long as we have it, let's add f16)
I was looking at the code, and I see now why you use it haha: otherwise there's a conflict of implementations between array-and-scalar operations (where the scalar is a generic type) and array-and-array operations. ScalarOperand serves to disambiguate those impls, and a negative trait bound (such as "this is not an array") is unstable and requires nightly.
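To make that concrete, here's a minimal standalone reproduction of the pattern with simplified types (not ndarray's actual definitions): without the marker-trait bound on the scalar impl, the compiler rejects the two impls as overlapping, because it can't rule out the scalar type being an array itself.

```rust
use std::ops::Add;

#[derive(Clone, Debug, PartialEq)]
struct Arr<A>(Vec<A>);

// Marker trait playing the role of ndarray's ScalarOperand.
trait Scalar: Copy {}
impl Scalar for f32 {}

// array + array
impl<A: Add<Output = A> + Copy> Add for Arr<A> {
    type Output = Arr<A>;
    fn add(self, rhs: Arr<A>) -> Arr<A> {
        Arr(self.0.iter().zip(&rhs.0).map(|(&x, &y)| x + y).collect())
    }
}

// array + scalar: the `B: Scalar` bound keeps this from overlapping with
// the impl above, since Arr<_> never implements Scalar and no downstream
// crate can add that impl.
impl<A: Add<B, Output = A> + Copy, B: Scalar> Add<B> for Arr<A> {
    type Output = Arr<A>;
    fn add(self, rhs: B) -> Arr<A> {
        Arr(self.0.iter().map(|&x| x + rhs).collect())
    }
}

fn main() {
    let a = Arr(vec![1.0f32, 2.0]);
    let b = a.clone() + a; // array + array
    let c = b + 1.0f32;    // array + scalar
    assert_eq!(c.0, vec![3.0, 5.0]);
}
```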
I was going to add the nightly f16 as well, but num-traits doesn't support it -- and they prefer not to while it's unstable, as per https://github.com/rust-num/num-traits/pull/333#discussion_r1731538112. So I think I'll limit the request/PR to the half::f16 and half::bf16 types.
I've just marked the PR as ready for review, please let me know if any change is desired.
- I've added tests and benches that seemed reasonable, but if a smaller change is preferred then I suggest skipping the last commit (while keeping the feature mention in the README).
- Just a note: at least without any other features enabled, the precision and speed seem to be quite low for those `half` types. For the precision test, I've run it many times and set a precision requirement that I believe has a good margin (intending not to trigger a failure in future runs). The thresholds were chosen empirically; I haven't made any formal calculation for those values. A sketch of that kind of check follows this list.
  - The empirical method: I ran the precision test 100 times for each case, took the worst case, and roughly doubled the ratio error, rounding up. So I expect there's a <1% chance that the current precision requirements trigger an error during a test run.
- I've removed the `NdFloat` impl for the `half` types because they're not related to linalg (nor BLAS, etc.).
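For illustration only (not the PR's actual test), a standalone sketch of that kind of relative-error check, comparing an f16 accumulation against an f64 reference; the data and the threshold are assumptions:

```rust
use half::f16;

fn main() {
    // Reference sum in f64, then the same accumulation in f16.
    let xs: Vec<f64> = (1..=10).map(|i| i as f64 * 0.1).collect();
    let exact: f64 = xs.iter().sum();
    let approx = xs
        .iter()
        .map(|&x| f16::from_f64(x))
        .fold(f16::from_f64(0.0), |acc, x| acc + x)
        .to_f64();

    // Empirically chosen margin (illustrative): f16 carries only ~3
    // significant decimal digits, so the bound is far looser than an
    // f32/f64 test would use.
    let rel_err = ((approx - exact) / exact).abs();
    assert!(rel_err < 1e-2, "relative error {rel_err} too large");
}
```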