gtensor

Enable gt half precision types for HIP

Open cmpfeil opened this issue 11 months ago • 0 comments

Uses the 16-bit floating-point types from the HIP headers (<hip/hip_fp16.h>, <hip/hip_bf16.h>) when the CUDA headers are not available.
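The header fallback described above could look roughly like the following. This is a minimal sketch, not the code in this change: the `sketch` namespace, the `SKETCH_HALF_BACKEND` macro, the alias names, and the host-only placeholder branch are all hypothetical and chosen for illustration. It assumes that nvcc defines `__CUDACC__`, that clang in HIP mode defines `__HIP__`, and that the vendor headers provide `__half`, `__nv_bfloat16`, and `__hip_bfloat16`.

```cpp
// Sketch only: every SKETCH_* / sketch:: name is hypothetical, not a gtensor identifier.
#include <string>

#if defined(__CUDACC__) && __has_include(<cuda_fp16.h>) && __has_include(<cuda_bf16.h>)
// Preferred path: CUDA headers are available.
#include <cuda_bf16.h>
#include <cuda_fp16.h>
#define SKETCH_HALF_BACKEND "cuda"
namespace sketch {
using fp16_storage = __half;
using bf16_storage = __nv_bfloat16;
} // namespace sketch
#elif defined(__HIP__) && __has_include(<hip/hip_fp16.h>) && __has_include(<hip/hip_bf16.h>)
// Fallback path: no CUDA headers, so take the 16-bit types from HIP.
#include <hip/hip_bf16.h>
#include <hip/hip_fp16.h>
#define SKETCH_HALF_BACKEND "hip"
namespace sketch {
using fp16_storage = __half;
using bf16_storage = __hip_bfloat16;
} // namespace sketch
#else
// Host-only placeholder (an assumption of this sketch): raw 16-bit storage,
// no arithmetic, so the file still compiles without a GPU toolchain.
#include <cstdint>
#define SKETCH_HALF_BACKEND "none"
namespace sketch {
using fp16_storage = std::uint16_t;
using bf16_storage = std::uint16_t;
} // namespace sketch
#endif

namespace sketch {
// Reports which branch the preprocessor selected.
inline std::string half_backend() { return SKETCH_HALF_BACKEND; }

// Whichever branch is taken, both storage types should occupy 16 bits.
static_assert(sizeof(fp16_storage) == 2, "fp16 storage must be 2 bytes");
static_assert(sizeof(bf16_storage) == 2, "bf16 storage must be 2 bytes");
} // namespace sketch
```

Compiled with a plain host compiler, `sketch::half_backend()` reports the placeholder branch; under hipcc with ROCm installed it would be expected to select the HIP headers.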

The BF16 tests pass on an AMD MI300A when built with the rocm/6.3 module loaded, using:

cmake -S . -B build-hip -DCMAKE_INSTALL_PREFIX=build-hip -DGTENSOR_DEVICE=hip -DBUILD_TESTING=ON -DGTENSOR_ENABLE_BF16=ON -DCMAKE_CXX_COMPILER=$(which hipcc)
cmake --build build-hip --target install

(The FP16 tests pass analogously when built with -DGTENSOR_ENABLE_FP16=ON.)

cmpfeil, Feb 12 '25 13:02