gpu docker image "exec format error" on ARM64 with Nvidia Card
Describe the bug
The kokoro-fastapi-gpu images (latest and v0.2.4) both log `exec /opt/nvidia/nvidia_entrypoint.sh: exec format error` on launch and exit. At one point I was able to build this project with torch 2.6.0 and get it working; it looks like newer PyTorch releases are not as friendly to ARM.
Screenshots or console output
The docker logs show:
exec /opt/nvidia/nvidia_entrypoint.sh: exec format error
Branch / Deployment used
I've started trying to build the latest master branch but am having little luck. The latest and v0.2.4 docker images were both tried.
Operating System
K8s, Nvidia device plugin, Nvidia container runtime, Ampere Altra CPU, RTX A6000
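For anyone hitting this: `exec format error` at the entrypoint almost always means the pulled image was built for a different CPU architecture than the host (e.g. an amd64-only image on an aarch64 machine). A quick sanity check, with an illustrative image tag:

```shell
# Compare the host architecture with the image's architecture.
# "exec format error" on the entrypoint usually means they don't match
# (e.g. an amd64 image on an aarch64 host such as Ampere Altra or DGX Spark).
host_arch=$(uname -m)
echo "host architecture: $host_arch"   # expect aarch64 on these machines

# Then inspect the pulled image (the tag below is illustrative):
# docker image inspect --format '{{.Os}}/{{.Architecture}}' \
#   ghcr.io/remsky/kokoro-fastapi-gpu:v0.2.4
# If that prints linux/amd64 on an aarch64 host, the image has no arm64
# variant and will fail exactly as above.
```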
I'm seeing the same issue when running this on a DGX Spark.
Does my image work for you? https://github.com/users/rfhold/packages/container/package/kokoro-fastapi-gpu, built from https://github.com/remsky/Kokoro-FastAPI/pull/403
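For others wanting to try the same route, a cross-build along these lines should produce an arm64 variant locally (the tag and build context are assumptions, not details confirmed from that PR):

```shell
# Sketch: build an arm64 variant of the image with docker buildx.
# On an x86 host this additionally requires QEMU binfmt emulation.
platform="linux/arm64"
tag="kokoro-fastapi-gpu:arm64-local"

# Print the command so it can be reviewed before running:
echo "docker buildx build --platform $platform -t $tag ."

# Uncomment to actually build from a repo checkout:
# docker buildx build --platform "$platform" -t "$tag" .
```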
It got me further, but now seeing this:
ERROR | main:70 | Failed to initialize model: Warmup failed: Failed to load model: Failed to load Kokoro model: CUDA error: no kernel image is available for execution on the device
Search for `cudaErrorNoKernelImageForDevice' in https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__TYPES.html for more information.
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1
Compile with TORCH_USE_CUDA_DSA to enable device-side assert
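The "no kernel image is available" error means the PyTorch build in the image doesn't ship compiled kernels (or forward-compatible PTX) for this GPU's compute capability (an RTX A6000 is sm_86; the DGX Spark's GPU is newer still). In a working container, `torch.cuda.get_arch_list()` should cover the device capability reported by `torch.cuda.get_device_capability()`. A hypothetical helper illustrating that check (the function name and the PTX forward-compatibility logic are mine, not PyTorch API):

```python
# Hypothetical helper: decide whether a torch build supports a GPU, given the
# arch list reported by torch.cuda.get_arch_list() and the device capability
# from torch.cuda.get_device_capability().
def build_supports_gpu(arch_list, capability):
    major, minor = capability
    sm = f"sm_{major}{minor}"
    for entry in arch_list:
        # Exact match: a binary kernel was compiled for this architecture.
        if entry == sm:
            return True
        # "compute_XY" entries are embedded PTX, which the driver can
        # JIT-compile for devices with capability >= XY.
        if entry.startswith("compute_"):
            if int(entry.split("_")[1]) <= major * 10 + minor:
                return True
    return False

# An RTX A6000 is sm_86; a build shipping only sm_70/sm_75 kernels raises
# "no kernel image is available for execution on the device".
print(build_supports_gpu(["sm_70", "sm_75"], (8, 6)))       # False
print(build_supports_gpu(["sm_86", "compute_86"], (8, 6)))  # True
```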
The DGX probably needs a newer version of PyTorch. I hope its existence helps make CUDA on ARM a little more standard.
Also seeing this same issue on a DGX Spark as of 2025-11-24.