0.3.76 Windows-Portable AMD with 7900 XTX + APU just bails out
Custom Node Testing
- [x] I have tried disabling custom nodes and the issue persists (see how to disable custom nodes if you need help)
Expected Behavior
After clicking Run, it generates the output image
Actual Behavior
After clicking Run, it bails out during or right after loading the CLIP/text encoder model (see logs)
Steps to Reproduce
1. Extract a fresh ComfyUI_windows_portable_amd.7z.
2. Install amd-software-adrenalin-edition-25.20.01.17-win11-pytorch-combined.exe as described in the README (on Windows 10 22H2).
3. Copy the files needed for SDXL Turbo into ComfyUI (see the sketch below).
4. Open the template for SDXL Turbo and click Run.
Other models such as Flux don't work either (same behaviour).
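For reference, step 3 amounts to placing an SDXL Turbo checkpoint where the bundled template looks for it. A minimal sketch, assuming the usual models layout of the portable package; the file name is only an example:

```bat
rem Hedged sketch of step 3; any SDXL Turbo checkpoint should work,
rem the file name below is only an example.
copy sd_xl_turbo_1.0_fp16.safetensors ComfyUI\models\checkpoints\
```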
Debug Logs
Checkpoint files will always be loaded safely.
Total VRAM 24763 MB, total RAM 64670 MB
pytorch version: 2.9.0+rocmsdk20251116
Set: torch.backends.cudnn.enabled = False for better AMD performance.
AMD arch: gfx1036
ROCm version: (7, 1)
Set vram state to: NORMAL_VRAM
Device: cuda:0 AMD Radeon(TM) Graphics : native
Enabled pinned memory 29101.0
Using sub quadratic optimization for attention, if you have memory or speed issues try using: --use-split-cross-attention
Python version: 3.12.10 (tags/v3.12.10:0cc8128, Apr 8 2025, 12:21:36) [MSC v.1943 64 bit (AMD64)]
ComfyUI version: 0.3.76
ComfyUI frontend version: 1.32.10
[Prompt Server] web root: C:\Users\egon\Desktop\a\ComfyUI\python_embeded\Lib\site-packages\comfyui_frontend_package\static
Total VRAM 24763 MB, total RAM 64670 MB
pytorch version: 2.9.0+rocmsdk20251116
Set: torch.backends.cudnn.enabled = False for better AMD performance.
AMD arch: gfx1036
ROCm version: (7, 1)
Set vram state to: NORMAL_VRAM
Device: cuda:0 AMD Radeon(TM) Graphics : native
Enabled pinned memory 29101.0
Import times for custom nodes:
0.0 seconds: C:\Users\egon\Desktop\a\ComfyUI\ComfyUI\custom_nodes\websocket_image_save.py
Context impl SQLiteImpl.
Will assume non-transactional DDL.
No target revision found.
Starting server
To see the GUI go to: http://127.0.0.1:8188
got prompt
model weight dtype torch.float16, manual cast: None
model_type EPS
Using split attention in VAE
Using split attention in VAE
VAE load device: cuda:0, offload device: cpu, dtype: torch.bfloat16
Requested to load SDXLClipModel
loaded completely; 95367431640625005117571072.00 MB usable, 1560.80 MB loaded, full load: True
CLIP/text encoder model load device: cuda:0, offload device: cpu, current: cuda:0, dtype: torch.float16
Other
I had to set HIP_VISIBLE_DEVICES=1 to get it working (it then picks the gfx1100 dGPU instead of the gfx1036 APU):
Checkpoint files will always be loaded safely.
Total VRAM 24560 MB, total RAM 64670 MB
pytorch version: 2.9.0+rocmsdk20251116
Set: torch.backends.cudnn.enabled = False for better AMD performance.
AMD arch: gfx1100
ROCm version: (7, 1)
Set vram state to: NORMAL_VRAM
Device: cuda:0 AMD Radeon RX 7900 XTX : native
Enabled pinned memory 29101.0
Using sub quadratic optimization for attention, if you have memory or speed issues try using: --use-split-cross-attention
Python version: 3.12.10 (tags/v3.12.10:0cc8128, Apr 8 2025, 12:21:36) [MSC v.1943 64 bit (AMD64)]
ComfyUI version: 0.3.76
ComfyUI frontend version: 1.32.10
[Prompt Server] web root: D:\a\ComfyUI\python_embeded\Lib\site-packages\comfyui_frontend_package\static
Total VRAM 24560 MB, total RAM 64670 MB
pytorch version: 2.9.0+rocmsdk20251116
Set: torch.backends.cudnn.enabled = False for better AMD performance.
AMD arch: gfx1100
ROCm version: (7, 1)
Set vram state to: NORMAL_VRAM
Device: cuda:0 AMD Radeon RX 7900 XTX : native
Enabled pinned memory 29101.0
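For reference, the workaround only needs the variable to be set before the launcher starts Python. A minimal sketch, assuming the AMD portable build ships a .bat launcher like the other portable builds do (the exact launcher name and flags may differ):

```bat
rem Hedged sketch: expose only HIP device 1 (the 7900 XTX) to PyTorch,
rem hiding the gfx1036 APU; adjust the index if your device order differs.
set HIP_VISIBLE_DEVICES=1
.\python_embeded\python.exe -s ComfyUI\main.py --windows-standalone-build
pause
```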
This should be mentioned in the README, because an AMD GPU combined with an AMD APU is not a rare setup.
Another point: I had to use D:\a\ComfyUI as the install path, because somehow that path is compiled in:
C:\Users\egon\Desktop\a\ComfyUI\python_embeded\Scripts>amdgpu-arch.exe
Fatal error in launcher: Unable to create process using '"D:\a\ComfyUI\python_embeded\python.exe" "C:\Users\egon\Desktop\a\ComfyUI\python_embeded\Scripts\amdgpu-arch.exe" ': The system cannot find the specified file.
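The path appears to be baked into the pip-generated .exe wrappers at build time (D:\a\ is the default workspace on GitHub-hosted Windows runners, so this is likely the CI build path). A hedged check, searching the wrapper binary for the embedded path:

```bat
rem Hedged check: pip-style script wrappers store the absolute Python
rem path chosen at build time; findstr can show it inside the binary.
findstr /C:"D:\a\ComfyUI" python_embeded\Scripts\amdgpu-arch.exe
```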
Not very portable ;)