Float8s.jl issues

5 bits for exponent?

13

The new H100 from NVIDIA has 8-bit floats in two flavors: 4 bits for the exponent, like Float8s.jl's Float8_4, and 5 bits. Scroll down to "NVIDIA Hopper FP8 data format"...

bjarthur

enhancement
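
For readers unfamiliar with the 5-bit-exponent flavor: assuming it uses the IEEE-like 1-5-2 layout (sign, 5 exponent bits, 2 significand bits, commonly called E5M2), each byte is the upper half of a `Float16` bit pattern. A minimal sketch under that assumption follows. The helper name `e5m2_to_float16` is hypothetical and is not part of Float8s.jl.

```julia
# Hypothetical helper, not part of Float8s.jl.
# Assumes the 1-5-2 layout shares Float16's sign bit, 5-bit exponent and bias,
# so widening only needs to append eight zero significand bits.
e5m2_to_float16(b::UInt8) = reinterpret(Float16, UInt16(b) << 8)

e5m2_to_float16(0x3c)   # 0 01111 00 -> 1.0
e5m2_to_float16(0x7c)   # 0 11111 00 -> Inf
e5m2_to_float16(0xff)   # 1 11111 11 -> NaN
```

Widening this way is exact, because every 1-5-2 value is also a `Float16` value. Going the other way would need a rounding step, which this sketch does not attempt.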

Float8_4 subnormal bug?

4

The README says there is one. Is that still the case?

bjarthur

bug

0xff NaN converted to 0

Just found this
```julia
julia> Float8_4(0xff)
NaN8_4

julia> Float32(Float8_4(0xff))
0.0f0

julia> Float8(0xff)
NaN8

julia> Float32(Float8(0xff))
0.0f0
```
which obviously shouldn't happen.

milankl

bug
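
The expected behavior in the issue above (a NaN byte must widen to a NaN) can be illustrated with a standalone decoder. This is only a sketch: it assumes an IEEE-like 1-4-3 layout with exponent bias 7, in which an all-ones exponent encodes Inf or NaN. The function `decode_e4m3_ieee` is hypothetical and is not how Float8s.jl implements its conversion.

```julia
# Hypothetical decoder for an IEEE-like 1-4-3 byte (assumed bias 7).
# Not Float8s.jl code; layout and bias are assumptions for illustration.
function decode_e4m3_ieee(b::UInt8)
    sign = (b & 0x80) == 0 ? 1f0 : -1f0
    e = Int((b >> 3) & 0x0f)       # 4 exponent bits
    m = Int(b & 0x07)              # 3 significand bits
    if e == 15                     # all-ones exponent: Inf or NaN
        return m == 0 ? sign * Inf32 : NaN32
    elseif e == 0                  # subnormal: no implicit leading 1
        return sign * ldexp(Float32(m) / 8, 1 - 7)
    else                           # normal: implicit leading 1
        return sign * ldexp(1 + Float32(m) / 8, e - 7)
    end
end

isnan(decode_e4m3_ieee(0xff))      # true: NaN stays NaN
decode_e4m3_ieee(0x38)             # 1.0f0
```

Handling the all-ones exponent before the normal and subnormal branches is what keeps a NaN pattern from being decoded as an ordinary number.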

conversion

10

Just found this
```julia
julia> Float8(0.015)
Float8(0.0625)

julia> Float8(0.02)
Float8(0.015625)
```
although `0.015 < 0.02`:
```julia
julia> bitstring(Float8(0.015))
"00000100"

julia> bitstring(Float8(0.02))
"00000001"
```

milankl

bug

Define `eps`, `exponent`, `precision`

Add sinpi, cospi, tanpi, cis, cispi

Fix warning for `Base.Bool`
