feat: update ROCM and use smaller image
Description
This is an attempt to update ROCm to 6.1, which according to the release notes should be compatible with every card that 6.0 supports, and to move away from the very large -complete images in favor of the smaller ROCm "base" images, pulling in only the few individual packages that we need from the -complete image.
This should save a couple of GB on the resulting images and considerably speed up builds, since we would download a ~970MB image instead of a ~4.2GB one.
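As a rough sanity check on the numbers above, here is a minimal sketch of the download savings. It only compares the two approximate image sizes quoted in this description and does not account for the extra packages installed on top of the base image; the function name `download_savings` is made up for illustration.

```python
# Approximate image sizes quoted in the description (1 GB treated as 1000 MB).
COMPLETE_IMAGE_MB = 4200  # ~4.2GB "-complete" image
BASE_IMAGE_MB = 970       # ~970MB "base" image


def download_savings(old_mb: float, new_mb: float) -> tuple[float, float]:
    """Return (megabytes saved, percent saved) when switching images."""
    saved = old_mb - new_mb
    return saved, 100.0 * saved / old_mb


saved_mb, saved_pct = download_savings(COMPLETE_IMAGE_MB, BASE_IMAGE_MB)
print(f"~{saved_mb / 1000:.1f} GB less to download ({saved_pct:.0f}% smaller)")
```

The final image will be somewhat larger than the base image once hipblas-dev, rocblas-dev, and their dependencies are installed, so this is an upper bound on the pull-time saving rather than the final size difference.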
Notes for Reviewers
I did some local testing and was able to build images, but I don't have a ROCm-compatible GPU to do any in-depth testing or validation with.
Below is the complete list of packages that are installed in the -complete image but not in the "base" image.
half
hipblas
hipblas-dev
hipblaslt
hipblaslt-dev
hipcub-dev
hipfft
hipfft-dev
hiprand
hiprand-dev
hipsolver
hipsolver-dev
hipsparse
hipsparse-dev
hipsparselt
hipsparselt-dev
hiptensor
hiptensor-dev
libamd2
libblas3
libcamd2
libccolamd2
libcholmod3
libcolamd2
libgfortran5
liblapack3
libmetis5
libsuitesparseconfig5
miopen-hip
miopen-hip-dev
rccl
rccl-dev
rocalution
rocalution-dev
rocblas
rocblas-dev
rocfft
rocfft-dev
rocm-libs
rocprim-dev
rocrand
rocrand-dev
rocsolver
rocsolver-dev
rocsparse
rocsparse-dev
rocthrust-dev
rocwmma-dev
Of those, the only ones I found to be required for the build to succeed are hipblas-dev and rocblas-dev, which pull in the following package set (including dependencies):
hipblas
hipblas-dev
rocblas
rocblas-dev
rocsolver
rocsolver-dev
rocsparse
rocsparse-dev
We should do some testing and validation on this change in case there are other packages that are somehow needed at runtime but not needed at build time.
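To make the review scope concrete, the two package lists above can be compared directly. This is only a sketch: the package names are copied from this description, and the variable names (`COMPLETE_ONLY`, `INSTALLED`, `omitted`) are made up for illustration. The packages it prints are the ones to look at first if something fails at runtime.

```python
# Packages present in the -complete image but not in the "base" image,
# copied from the list in this description.
COMPLETE_ONLY = set("""
half hipblas hipblas-dev hipblaslt hipblaslt-dev hipcub-dev hipfft hipfft-dev
hiprand hiprand-dev hipsolver hipsolver-dev hipsparse hipsparse-dev
hipsparselt hipsparselt-dev hiptensor hiptensor-dev libamd2 libblas3
libcamd2 libccolamd2 libcholmod3 libcolamd2 libgfortran5 liblapack3
libmetis5 libsuitesparseconfig5 miopen-hip miopen-hip-dev rccl rccl-dev
rocalution rocalution-dev rocblas rocblas-dev rocfft rocfft-dev rocm-libs
rocprim-dev rocrand rocrand-dev rocsolver rocsolver-dev rocsparse
rocsparse-dev rocthrust-dev rocwmma-dev
""".split())

# Packages that end up installed when hipblas-dev and rocblas-dev are requested.
INSTALLED = {
    "hipblas", "hipblas-dev", "rocblas", "rocblas-dev",
    "rocsolver", "rocsolver-dev", "rocsparse", "rocsparse-dev",
}

# Every installed package should come from the -complete-only list.
assert INSTALLED <= COMPLETE_ONLY

# Packages the new image no longer ships.
omitted = sorted(COMPLETE_ONLY - INSTALLED)
print(f"{len(omitted)} of {len(COMPLETE_ONLY)} packages are no longer installed:")
for name in omitted:
    print(" ", name)
```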
Signed commits
- [x] Yes, I signed my commits.
I think it should be safe - however I don't have an AMD card to test this on, so we might need to collect feedback from master. Maybe @jtwolfe can give this a try?
Tested latest commit, and was able to get the Docker image built and running successfully. Tested multiple models on my Radeon RX 7900XT successfully. I used the sample prompts from the Getting Started page. Note that I did restart the local-ai container between tests, as I saw some errors in sequential tests due to RAM limitations.
✅ Text Generation:
{"created":1714752688,"object":"chat.completion","id":"23f73eb1-1f5d-4888-86d2-52562e9d1de9","model":"gpt-4","choices":[{"index":0,"finish_reason":"stop","message":{"role":"assistant","content":"I'm doing well, thank you for asking. How about you?"}}],"usage":{"prompt_tokens":15,"completion_tokens":16,"total_tokens":31}}
✅ Image Preview:
{"created":1714752610,"object":"chat.completion","id":"69adf450-37fe-4e52-ae8f-95c5f479a908","model":"gpt-4-vision-preview","choices":[{"index":0,"finish_reason":"stop","message":{"role":"assistant","content":"The image shows a path leading through a field of tall grass. The sky above is partly cloudy, suggesting a mix of sun and clouds. The path appears to be a wooden boardwalk or pathway, and it's surrounded by the natural environment. The grass is green, indicating it might be spring or summer. The overall scene conveys a sense of tranquility and solitude. \u003c/s\u003e"}}],"usage":{"prompt_tokens":1,"completion_tokens":82,"total_tokens":83}}
✅ Function Calling:
- Runs successfully, but I need to configure proper functions to get a valid result.
✅ Image Generation:
{"created":1714753952,"id":"ed8070f6-a019-4d24-b7f0-a0e81a7a46ae","data":[{"embedding":null,"index":0,"url":"http://localhost:8080/generated-images/b643165858024.png"}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}
✅ Text to Speech:
- Successfully created speech.mp3 with the audio.
❌ Speech to Text:
- Fails with this error, likely because ffmpeg is not available in the image:
ERR Server error error="rpc error: code = Unknown desc = error: exec: \"ffmpeg\": executable file not found in $PATH out: " ip=172.19.0.1 latency=3.46773989s method=POST status=500 url=/v1/audio/transcriptions
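The error above says the `ffmpeg` executable could not be found on `$PATH`. A small check like the following, run inside the container, could confirm that before sending a transcription request. It is a sketch only: `missing_executables` is a hypothetical helper and not part of LocalAI, and it uses nothing beyond the standard library's `shutil.which`.

```python
import shutil
from typing import Callable, Iterable, Optional


def missing_executables(
    names: Iterable[str],
    which: Callable[[str], Optional[str]] = shutil.which,
) -> list[str]:
    """Return the names that cannot be resolved on the current PATH."""
    return [name for name in names if which(name) is None]


if __name__ == "__main__":
    # ffmpeg is the only tool named in the error message above.
    missing = missing_executables(["ffmpeg"])
    print("missing:", ", ".join(missing) if missing else "none")
```

The `which` parameter exists only so the lookup can be replaced in tests; in normal use the default `shutil.which` is what gets called.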
Logged this issue for the hipblas-dev package missing the ldconfig trigger: https://github.com/ROCm/ROCm/issues/3081
nice! thanks @linuxtek-canada for testing, just in time! :)