Fix mamba installation on AMD GPUs: clang++: error: cannot determine amdgcn architecture: /opt/rocm/lib/llvm/bin/amdgpu-arch: ; consider passing it via '--offload-arch'
Hi folks (especially those training on AMD GPUs),
If you hit the error "cannot determine amdgcn architecture" while installing or building mamba-ssm from the repository, and the only hint you get is "consider passing it via '--offload-arch'", here is a fix.
[NOTE]: This has been tried with the rocm/pytorch-training Docker image.
First, check out a specific commit of the mamba repo to avoid other possible errors such as NameError: name 'bare_metal_version' is not defined. Then export a variable that overrides the default --offload-arch=native set in the mamba repo's setup.py.
Steps:
git clone https://github.com/state-spaces/mamba.git mamba_ssm
cd mamba_ssm
git checkout 014c094
export HIP_ARCHITECTURES="gfx942" # For MI300 only. Replace it with your architecture(s)
pip install --no-cache-dir --verbose .
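If you are not sure which architecture string to export, one way to find it is to look at the gfx targets that rocminfo reports. The snippet below is only a sketch: list_gfx_targets is a hypothetical helper (not part of mamba or ROCm) that pulls the unique gfx names out of rocminfo-style text on stdin. It assumes rocminfo is available in your image.

```shell
# Hypothetical helper (not part of mamba or ROCm): list the unique gfx targets
# found in rocminfo-style text read from stdin, one per line.
list_gfx_targets() {
  grep -oE 'gfx[0-9a-f]+' | sort -u
}

# Usage on a machine with ROCm installed:
#   rocminfo | list_gfx_targets
# Then export the target you want to build for, for example:
#   export HIP_ARCHITECTURES="gfx942"
```

The helper only prints the candidates. Picking which one to export is left to you, since a machine may report more than one target.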
I tried searching for the HIP_ARCHITECTURES variable by loading the Docker image and calling os.getenv(), but couldn't find it set anywhere.
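That would be consistent with the variable being an optional override rather than something the image defines: if it is unset, the build falls back to native. Here is a minimal shell sketch of that kind of lookup. resolve_offload_arch is a hypothetical name used only for illustration, and this is not the actual code from setup.py.

```shell
# Sketch of a "use the variable if set, otherwise fall back to native" lookup.
# resolve_offload_arch is a hypothetical name used only for this illustration.
resolve_offload_arch() {
  printf '%s\n' "${HIP_ARCHITECTURES:-native}"
}

unset HIP_ARCHITECTURES
resolve_offload_arch          # prints "native"

export HIP_ARCHITECTURES="gfx942"
resolve_offload_arch          # prints "gfx942"
```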