jaxlib: building from source fails (Linux, AMD GPU, ROCm)
Description
Compilation log
python3 build/build.py --enable_rocm --rocm_path=/opt/rocm-6.0.2
error: Illegal instruction detected: Invalid dpp_ctrl value: broadcasts are not supported on GFX10+
renamable $vgpr53 = V_MOV_B32_dpp undef $vgpr53(tied-def 0), $vgpr12, 322, 15, 15, 0, implicit $exec
error: Illegal instruction detected: Invalid dpp_ctrl value: broadcasts are not supported on GFX10+
renamable $vgpr4 = V_MOV_B32_dpp undef $vgpr4(tied-def 0), killed $vgpr3, 322, 15, 15, 0, implicit $exec
error: Illegal instruction detected: Invalid dpp_ctrl value: broadcasts are not supported on GFX10+
renamable $vgpr3 = V_MOV_B32_dpp undef $vgpr3(tied-def 0), $vgpr2, 322, 15, 15, 0, implicit $exec
error: Illegal instruction detected: Invalid dpp_ctrl value: broadcasts are not supported on GFX10+
renamable $vgpr66 = V_MOV_B32_dpp undef $vgpr66(tied-def 0), $vgpr44, 322, 15, 15, 0, implicit $exec
error: Illegal instruction detected: Invalid dpp_ctrl value: broadcasts are not supported on GFX10+
renamable $vgpr64 = V_MOV_B32_dpp undef $vgpr64(tied-def 0), $vgpr62, 322, 15, 15, 0, implicit $exec
error: Illegal instruction detected: Invalid dpp_ctrl value: broadcasts are not supported on GFX10+
renamable $vgpr66 = V_MOV_B32_dpp undef $vgpr66(tied-def 0), $vgpr44, 322, 15, 15, 0, implicit $exec
error: Illegal instruction detected: Invalid dpp_ctrl value: broadcasts are not supported on GFX10+
renamable $vgpr64 = V_MOV_B32_dpp undef $vgpr64(tied-def 0), $vgpr62, 322, 15, 15, 0, implicit $exec
error: Illegal instruction detected: Invalid dpp_ctrl value: broadcasts are not supported on GFX10+
renamable $vgpr53 = V_MOV_B32_dpp undef $vgpr53(tied-def 0), $vgpr16, 322, 15, 15, 0, implicit $exec
error: Illegal instruction detected: Invalid dpp_ctrl value: broadcasts are not supported on GFX10+
renamable $vgpr66 = V_MOV_B32_dpp undef $vgpr66(tied-def 0), $vgpr44, 322, 15, 15, 0, implicit $exec
error: Illegal instruction detected: Invalid dpp_ctrl value: broadcasts are not supported on GFX10+
renamable $vgpr64 = V_MOV_B32_dpp undef $vgpr64(tied-def 0), $vgpr62, 322, 15, 15, 0, implicit $exec
error: Illegal instruction detected: Invalid dpp_ctrl value: broadcasts are not supported on GFX10+
renamable $vgpr66 = V_MOV_B32_dpp undef $vgpr66(tied-def 0), $vgpr44, 322, 15, 15, 0, implicit $exec
error: Illegal instruction detected: Invalid dpp_ctrl value: broadcasts are not supported on GFX10+
renamable $vgpr64 = V_MOV_B32_dpp undef $vgpr64(tied-def 0), $vgpr62, 322, 15, 15, 0, implicit $exec
12 errors generated when compiling for gfx1103.
How can I fix it?
Thank you.
System info (python version, jaxlib version, accelerator, etc.)
Commit used
https://github.com/google/jax/tree/21656115847079981e3915f88ab4533790970f53
Environment
uname -m && cat /etc/*release
x86_64
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=23.10
DISTRIB_CODENAME=mantic
DISTRIB_DESCRIPTION="Ubuntu 23.10"
PRETTY_NAME="Ubuntu 23.10"
NAME="Ubuntu"
VERSION_ID="23.10"
VERSION="23.10 (Mantic Minotaur)"
VERSION_CODENAME=mantic
ID=ubuntu
ID_LIKE=debian
gcc --version
(Ubuntu 13.2.0-4ubuntu3) 13.2.0
python3 --version
Python 3.11.6
Same issue here with rocm-6.0.2 and an RX 6400 (gfx1034): the build stops while compiling components from XLA.
There is a variable:
TF_ROCM_AMDGPU_TARGETS=gfx900,gfx906,gfx908,gfx90a,gfx1030
As a result, I am afraid that JAX is currently compatible only with the AMD GPU architectures listed above.
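As a rough illustration of that reasoning, here is a minimal Python sketch that checks whether a given gfx target appears in the list quoted above. The helper names `parse_targets` and `is_target_listed` are hypothetical, the default list is simply copied from this comment, and reading the value from an environment variable of the same name is an assumption about how the build is configured. A target being absent from the list does not by itself prove it is the cause of the failure.

```python
import os

# Copied from the variable quoted in this comment; the actual build may use a different list.
DEFAULT_TARGETS = "gfx900,gfx906,gfx908,gfx90a,gfx1030"


def parse_targets(value):
    """Split a comma-separated target string into a list of names, dropping blanks."""
    return [t.strip() for t in value.split(",") if t.strip()]


def is_target_listed(gpu_target, value=None):
    """Return True if gpu_target appears in the comma-separated target list.

    If no list is passed, fall back to the TF_ROCM_AMDGPU_TARGETS environment
    variable (an assumption), and then to the default quoted above.
    """
    if value is None:
        value = os.environ.get("TF_ROCM_AMDGPU_TARGETS", DEFAULT_TARGETS)
    return gpu_target in parse_targets(value)


if __name__ == "__main__":
    # gfx1034 (RX 6400) and gfx1103 are the targets reported in this thread.
    for gpu in ("gfx1030", "gfx1034", "gfx1103"):
        status = "listed" if is_target_listed(gpu, DEFAULT_TARGETS) else "not listed"
        print(f"{gpu}: {status}")
```

With the list above, neither gfx1034 nor gfx1103 is present, which matches the failures reported in this thread.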
Hi, currently JAX on ROCm is supported only on MI Instinct GPUs. We are working to get support for Navi/Radeon in the near future.
Thanks!
See also #19989.