CPU cores are not fully utilized when using standalone build for Windows (torch)
Feature Idea
When running the ComfyUI standalone build for Windows in CPU mode, torch is not configured to use all CPU cores. Adding a few lines of code to ComfyUI/comfy/utils.py or main.py would help a lot of users who don't have a GPU get full CPU utilization out of ComfyUI.
First option: just add the lines below, without checking whether a GPU is in use:

```python
import os

cpuCount = os.cpu_count()
torch.set_num_threads(cpuCount)
torch.set_num_interop_threads(cpuCount)
```
Or add the lines below to main.py around line 82, or any other location you think is better:

```python
if args.windows_standalone_build:
    try:
        import fix_torch
        if args.cpu:
            import torch
            cpuCount = os.cpu_count()
            torch.set_num_threads(cpuCount)
            torch.set_num_interop_threads(cpuCount)
    except ImportError:
        pass
```
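As an alternative to patching the source, thread counts can also be capped via environment variables set before torch is first imported; `OMP_NUM_THREADS` and `MKL_NUM_THREADS` are standard knobs honored by torch's OpenMP/MKL CPU backends. A minimal sketch (not part of the proposal above, just an assumption-level alternative):

```python
import os

# These must be set BEFORE torch is first imported,
# otherwise the values are ignored by the backend.
cpuCount = os.cpu_count() or 1
os.environ["OMP_NUM_THREADS"] = str(cpuCount)
os.environ["MKL_NUM_THREADS"] = str(cpuCount)

print("thread env configured for", cpuCount, "logical CPUs")
```

This has the advantage of needing no code changes inside ComfyUI itself; the same variables can be set in run_cpu.bat before launching Python.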
Also, add the --use-split-cross-attention option to run_cpu.bat:
```bat
.\python_embeded\python.exe -s ComfyUI\main.py --cpu --windows-standalone-build --use-split-cross-attention
pause
```
Existing Solutions
No response
Other
No response
I tried applying this code, but there was absolutely no performance improvement, at least on Ryzen + Linux. For an SD1.5 512x512, 20-step image, it took exactly the same 1m 7s.
Interestingly, with the code applied, the 'top' command showed that the CPU was being used more intensively than before.
Is there a performance improvement on Windows?
Without increasing the thread count in the code, my machine (Windows 11) only utilizes one logical processor out of 20 whenever torch is used in the process, and it takes a long time to finish even a small workflow. With the change, the CPU is fully utilized at 100% and the same workflow finishes in very little time. My CPU is a 12th Gen Intel Core i7-12700: 12 cores and 20 logical processors.
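Worth noting for the snippets above: `os.cpu_count()` returns the number of *logical* processors, so on this i7-12700 it would report 20 (counting hyperthreads), not the 12 physical cores. A quick check:

```python
import os

# os.cpu_count() counts logical processors, including SMT/hyperthreads.
logical = os.cpu_count()
print(f"os.cpu_count() reports {logical} logical processors")
```

Whether all-logical-processors is the optimal thread count for torch's CPU backend is workload-dependent; some setups do better pinned to physical cores only.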
Not sure about Linux, but on Windows 11 it's a Python issue. If you run Python as admin, it will use all cores without needing any extra code or launch flags. If you're using a shortcut to launch ComfyUI, you'll probably need to modify it, since launching as admin changes a few things about how Windows opens command prompts/terminals/etc.
I had to edit mine to look like:
```bat
g:
cd ComfyUI
.\python_embeded\python.exe -s .\ComfyUI\main.py --windows-standalone-build
pause
```
Where I manually changed to my ComfyUI folder before launching it, even though the shortcut's working directory is set correctly. (I think the command prompt that gets used when launching as admin is probably launched in the windows\system32 folder, so that's why you need to manually shift the working directory for ComfyUI to launch correctly.)
I had this issue with my CPU since I have a 13600KF. Python was always exclusively using only my E-cores, and doing the above allowed it to use all 20 threads of my CPU when needed.
One more thing for Windows 11 users: CPU utilization will also drop when the command prompt used to start ComfyUI is not in focus. To avoid this problem, run the command below as admin. Make sure to change the path ("D:\ai\python_embeded\python.exe") to wherever your embedded Python is located:
```
C:\Windows\System32>powercfg /powerthrottling disable /path "D:\ai\python_embeded\python.exe"
```
This seems more like a Win11 issue, tbh. On all my Win10 servers running various AIs, nothing caps the CPU cores in use, to the point where I can enable (at cost) +50 more cores and it will happily use 99% of them. I've never encountered this on any Linux or Win10 platform since last August.
But I've also never touched Win11 to even know/see this issue, and I'm constantly monitoring Task Manager and nvidia-smi output and such.
Hope this helps narrow down a few things. [Note: I'm running the bleeding-edge diffusers/torch + Python 3.10.0/3.12.0 side-by-side install for versions.]
I tried it on another machine with Windows 10. I started ComfyUI with the run_cpu.bat file and just used the default workflow. Without any modification to the code, CPU utilization was about 64%, and after adjusting the code it was about 100%.
Well not sure what to say, was just throwing in my 2 cents that I've never seen this before across my machines that's all.
Clearly your env has something different, or the setup for whatever AI you are using is different. It's possible that what [RandomGitUser321] said above about Python being the culprit is what's happening here in the end, if Windows 11 is doing the same thing.