
Jaxlib : Compiling/Building from source failed (Linux, amdgpu rocm)

Open unoexperto opened this issue 2 years ago • 2 comments

Description

Compilation log

python3 build/build.py --enable_rocm --rocm_path=/opt/rocm-6.0.2

error: Illegal instruction detected: Invalid dpp_ctrl value: broadcasts are not supported on GFX10+
renamable $vgpr53 = V_MOV_B32_dpp undef $vgpr53(tied-def 0), $vgpr12, 322, 15, 15, 0, implicit $exec
error: Illegal instruction detected: Invalid dpp_ctrl value: broadcasts are not supported on GFX10+
renamable $vgpr4 = V_MOV_B32_dpp undef $vgpr4(tied-def 0), killed $vgpr3, 322, 15, 15, 0, implicit $exec
error: Illegal instruction detected: Invalid dpp_ctrl value: broadcasts are not supported on GFX10+
renamable $vgpr3 = V_MOV_B32_dpp undef $vgpr3(tied-def 0), $vgpr2, 322, 15, 15, 0, implicit $exec
error: Illegal instruction detected: Invalid dpp_ctrl value: broadcasts are not supported on GFX10+
renamable $vgpr66 = V_MOV_B32_dpp undef $vgpr66(tied-def 0), $vgpr44, 322, 15, 15, 0, implicit $exec
error: Illegal instruction detected: Invalid dpp_ctrl value: broadcasts are not supported on GFX10+
renamable $vgpr64 = V_MOV_B32_dpp undef $vgpr64(tied-def 0), $vgpr62, 322, 15, 15, 0, implicit $exec
error: Illegal instruction detected: Invalid dpp_ctrl value: broadcasts are not supported on GFX10+
renamable $vgpr66 = V_MOV_B32_dpp undef $vgpr66(tied-def 0), $vgpr44, 322, 15, 15, 0, implicit $exec
error: Illegal instruction detected: Invalid dpp_ctrl value: broadcasts are not supported on GFX10+
renamable $vgpr64 = V_MOV_B32_dpp undef $vgpr64(tied-def 0), $vgpr62, 322, 15, 15, 0, implicit $exec
error: Illegal instruction detected: Invalid dpp_ctrl value: broadcasts are not supported on GFX10+
renamable $vgpr53 = V_MOV_B32_dpp undef $vgpr53(tied-def 0), $vgpr16, 322, 15, 15, 0, implicit $exec
error: Illegal instruction detected: Invalid dpp_ctrl value: broadcasts are not supported on GFX10+
renamable $vgpr66 = V_MOV_B32_dpp undef $vgpr66(tied-def 0), $vgpr44, 322, 15, 15, 0, implicit $exec
error: Illegal instruction detected: Invalid dpp_ctrl value: broadcasts are not supported on GFX10+
renamable $vgpr64 = V_MOV_B32_dpp undef $vgpr64(tied-def 0), $vgpr62, 322, 15, 15, 0, implicit $exec
error: Illegal instruction detected: Invalid dpp_ctrl value: broadcasts are not supported on GFX10+
renamable $vgpr66 = V_MOV_B32_dpp undef $vgpr66(tied-def 0), $vgpr44, 322, 15, 15, 0, implicit $exec
error: Illegal instruction detected: Invalid dpp_ctrl value: broadcasts are not supported on GFX10+
renamable $vgpr64 = V_MOV_B32_dpp undef $vgpr64(tied-def 0), $vgpr62, 322, 15, 15, 0, implicit $exec
12 errors generated when compiling for gfx1103.
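
For readers triaging a similar log, here is a minimal sketch that pulls out the failing target and the rejected `dpp_ctrl` immediate. It assumes the first immediate after the source register in a `V_MOV_B32_dpp` line is the `dpp_ctrl` operand; the `summarize` helper and the `LOG` sample are hypothetical names for illustration, not part of any JAX or ROCm tooling.

```python
import re

# Sample lines copied from the log above (one error pair plus the summary line).
LOG = """\
error: Illegal instruction detected: Invalid dpp_ctrl value: broadcasts are not supported on GFX10+
renamable $vgpr53 = V_MOV_B32_dpp undef $vgpr53(tied-def 0), $vgpr12, 322, 15, 15, 0, implicit $exec
12 errors generated when compiling for gfx1103.
"""


def summarize(log: str) -> dict:
    """Extract the compile target and the first dpp_ctrl immediate from a log."""
    target = re.search(r"errors? generated when compiling for (gfx\w+)", log)
    # Assumption: the immediate right after the source VGPR is dpp_ctrl.
    ctrl = re.search(r"V_MOV_B32_dpp .*?, (?:killed )?\$vgpr\d+, (\d+),", log)
    value = int(ctrl.group(1)) if ctrl else None
    return {
        "target": target.group(1) if target else None,
        "dpp_ctrl": value,
        "dpp_ctrl_hex": hex(value) if value is not None else None,
    }


print(summarize(LOG))
```

In this log the immediate is 322, which is 0x142. If LLVM's AMDGPU DPP encoding is as I recall it, that is a row-broadcast control, which would match the "broadcasts are not supported on GFX10+" message. Treat that mapping as an assumption to verify against the LLVM sources.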

How can I fix it?

Thank you.

System info (python version, jaxlib version, accelerator, etc.)

Commit used

https://github.com/google/jax/tree/21656115847079981e3915f88ab4533790970f53

Environment

uname -m && cat /etc/*release

x86_64
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=23.10
DISTRIB_CODENAME=mantic
DISTRIB_DESCRIPTION="Ubuntu 23.10"
PRETTY_NAME="Ubuntu 23.10"
NAME="Ubuntu"
VERSION_ID="23.10"
VERSION="23.10 (Mantic Minotaur)"
VERSION_CODENAME=mantic
ID=ubuntu
ID_LIKE=debian

gcc --version: (Ubuntu 13.2.0-4ubuntu3) 13.2.0

python3 --version: Python 3.11.6

unoexperto, Feb 20 '24 15:02

Same issue here: with rocm-6.0.2 and an RX6400 (gfx1034), the build stops while compiling components from XLA. There is a variable, TF_ROCM_AMDGPU_TARGETS=gfx900,gfx906,gfx908,gfx90a,gfx1030, so I am afraid JAX is currently compatible only with the AMD GPU targets listed there.
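
To make the comparison concrete, here is a minimal sketch that checks whether a given gfx target appears in that comma-separated list. The `is_listed` helper is a hypothetical name for illustration. It only tests membership in the string quoted above and says nothing about how the build actually consumes the variable.

```python
def is_listed(target: str, targets_value: str) -> bool:
    """Return True if `target` appears in a comma-separated gfx target list."""
    listed = {t.strip().lower() for t in targets_value.split(",") if t.strip()}
    return target.strip().lower() in listed


# Value quoted in the comment above.
TF_ROCM_AMDGPU_TARGETS = "gfx900,gfx906,gfx908,gfx90a,gfx1030"

# gfx1103 is the target from the original report; gfx1034 is the RX6400.
for gpu in ("gfx1103", "gfx1034", "gfx90a"):
    print(gpu, is_listed(gpu, TF_ROCM_AMDGPU_TARGETS))
```

Neither gfx1103 nor gfx1034 is in the list, which is consistent with both builds failing.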

markliuchina, Feb 27 '24 02:02

Hi, currently JAX on ROCm is supported only on MI Instinct GPUs. We are working to add support for Navi/Radeon GPUs in the near future.

Thanks!

rahulbatra85, Mar 11 '24 20:03

See also #19989.

brett-koonce, Mar 20 '24 17:03