
CPU cores are not fully utilized when using standalone build for Windows (torch)

Open Moheeb77 opened this issue 1 year ago • 7 comments

Feature Idea

When running the ComfyUI standalone build for Windows in CPU mode, torch does not use all CPU cores. Adding a few lines of code to ComfyUI/comfy/utils.py or main.py would help the many users without a GPU get full use out of ComfyUI.

First option: just add the lines below, without checking whether a GPU is in use:

import os  # added
cpuCount = os.cpu_count()  # added
torch.set_num_threads(cpuCount)  # added
torch.set_num_interop_threads(cpuCount)  # added

Second option: add the lines below to main.py at line 82, or any other location you think is better:

if args.windows_standalone_build:
    try:
        import fix_torch
        if args.cpu:  # added
            import torch  # added
            cpuCount = os.cpu_count()  # added
            torch.set_num_threads(cpuCount)  # added
            torch.set_num_interop_threads(cpuCount)  # added

Also add the --use-split-cross-attention option to run_cpu.bat:

.\python_embeded\python.exe -s ComfyUI\main.py --cpu --windows-standalone-build --use-split-cross-attention

pause
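The thread settings proposed above could be wrapped in a small helper so they only apply in CPU mode. This is a minimal sketch, not ComfyUI's actual code: the function names `choose_thread_count` and `configure_torch_threads` and the `use_cpu` parameter (standing in for `args.cpu`) are hypothetical. The only torch calls used are `torch.set_num_threads` and `torch.set_num_interop_threads`, the same two as in the proposal.

```python
import os


def choose_thread_count(logical_cpus=None):
    """Return one thread per logical CPU, and never less than 1."""
    if logical_cpus is None:
        # os.cpu_count() can return None when the count is undetermined
        logical_cpus = os.cpu_count()
    return max(1, logical_cpus or 1)


def configure_torch_threads(use_cpu, logical_cpus=None):
    """Apply the proposed thread settings only when running on CPU.

    `use_cpu` is a hypothetical stand-in for `args.cpu`.
    Returns the thread count applied, or None if nothing was changed.
    """
    if not use_cpu:
        return None
    try:
        import torch  # third-party; imported lazily so the helper loads without it
    except ImportError:
        return None
    n = choose_thread_count(logical_cpus)
    torch.set_num_threads(n)
    try:
        torch.set_num_interop_threads(n)
    except RuntimeError:
        # PyTorch only accepts this setting once, and only before
        # inter-op parallel work has started; otherwise it raises.
        pass
    return n
```

Calling `configure_torch_threads(True)` early in startup, before any model code runs, gives the inter-op setting the best chance of taking effect.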

Existing Solutions

No response

Other

No response

Moheeb77 • Oct 07 '24 06:10

I tried applying this code, but there was absolutely no performance improvement, at least on Ryzen + Linux. An SD1.5 512x512 20-step image took exactly the same 1m 7s.

Interestingly, with the code applied, the 'top' command showed the CPU being used more intensively than before.

Is there a performance improvement on Windows?
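One way to check whether the thread count changes wall-clock time on a given machine is a small timing harness like the sketch below. It is an assumption-laden stand-in, not the SD1.5 workflow timed above: `time_call` and `compare_thread_counts` are hypothetical names, and the workload is a plain matrix multiply, so the numbers only indicate whether torch scales with threads at all.

```python
import time


def time_call(fn, repeats=3):
    """Return the best wall-clock time in seconds over `repeats` calls of fn()."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best


def compare_thread_counts(counts, size=512, repeats=3):
    """Time a square matrix multiply under each torch thread count.

    Requires torch (third-party). The matmul is a stand-in workload,
    not a diffusion model.
    """
    import torch

    a = torch.rand(size, size)
    b = torch.rand(size, size)
    results = {}
    for n in counts:
        torch.set_num_threads(n)  # intra-op threads; may be changed repeatedly
        results[n] = time_call(lambda: torch.mm(a, b), repeats)
    return results
```

For example, `compare_thread_counts([1, 4, 8])` returns a dict mapping each thread count to its best time. Higher CPU usage in 'top' with an unchanged time would be consistent with the result reported above.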

ltdrdata • Oct 07 '24 07:10

Without setting the thread count in the code, my machine (Windows 11) uses only one logical processor out of 20 whenever torch is running, and even a small workflow takes a long time to finish. With the change, the CPU is fully utilized at 100% and the workflow finishes very quickly. My CPU is a 12th Gen Intel Core i7-12700, which has 12 cores and 20 logical processors.

Moheeb77 • Oct 07 '24 08:10

Not sure about Linux, but on Windows 11 it's a Python issue. If you run Python as admin, it will use all cores without any extra code or launch flags. If you're using a shortcut to launch ComfyUI, you'll probably need to modify it, since launching as admin changes a few things about how Windows opens command prompts, terminals, etc.

I had to edit mine to look like:

g:
cd ComfyUI
.\python_embeded\python.exe -s .\ComfyUI\main.py --windows-standalone-build
pause

Where I manually pathed to my comfy folder before launching it, even though the shortcut's working directory is set correctly. (I think the command prompt that gets used when launching with admin is probably just launched in the windows\system32 folder, so that's why you need to manually shift the working directory to launch comfy correctly)

I had this issue with my CPU, a 13600KF. Python was always using only my E-cores, and doing the above allowed it to use all 20 threads of my CPU when needed.
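To rule out an explicit affinity restriction as the cause, the set of logical CPUs the Python process is allowed to run on can be listed with a sketch like this one. The function name `allowed_cpus` is hypothetical. It uses `psutil` (third-party) when installed, since `os.sched_getaffinity` is not available on Windows. Note the limitation: an affinity mask shows where the process may run, not which cores the Windows scheduler actually prefers, so a full list here does not disprove the E-core behaviour described above.

```python
import os


def allowed_cpus():
    """Return a sorted list of logical CPU indices this process may run on."""
    try:
        import psutil  # third-party, optional

        return sorted(psutil.Process().cpu_affinity())
    except (ImportError, AttributeError):
        # psutil is missing, or cpu_affinity is unsupported on this platform
        pass
    if hasattr(os, "sched_getaffinity"):
        # Available on Linux, not on Windows
        return sorted(os.sched_getaffinity(0))
    # Fallback: assume every logical CPU is usable
    return list(range(os.cpu_count() or 1))
```

Comparing `len(allowed_cpus())` between a normal and an elevated launch would show whether the two differ in affinity at all.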

RandomGitUser321 • Oct 07 '24 12:10

One more thing for Windows 11 users: CPU utilization also drops when the command prompt used to start ComfyUI is not in focus. To avoid this, run the command below as admin. Make sure to change the path ("D:\ai\python_embeded\python.exe") to wherever your embedded Python is located.

C:\Windows\System32>powercfg /powerthrottling disable /path "D:\ai\python_embeded\python.exe"
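Since the path differs per install, the command above can be assembled from the interpreter that is actually running. The sketch below only builds the argument list and uses exactly the `powercfg` arguments shown above; the function name `power_throttling_command` is hypothetical, and nothing is executed.

```python
import sys


def power_throttling_command(python_path=None):
    """Build the powercfg argument list for the given Python executable.

    Defaults to the interpreter running this script, so when run with
    ComfyUI's embedded Python it points at that python.exe.
    """
    if python_path is None:
        python_path = sys.executable
    return ["powercfg", "/powerthrottling", "disable", "/path", python_path]


# Print the pieces for review before running them from an admin prompt.
print(power_throttling_command())
```

Passing the list to `subprocess.run(..., check=True)` from an elevated prompt on Windows would apply it. Keeping the path as a separate list element avoids quoting problems when it contains spaces.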

Moheeb77 • Oct 10 '24 05:10

This seems more like a Win11 issue, to be honest. On all my Win10 servers running various AI workloads, the CPU cores get maxed out when in use, to the point where I can enable 50+ more cores (at cost) and it will happily use 99% of them. I haven't encountered this on any Linux or Win10 platform since last August.

But I've also never touched Win11, so I wouldn't have seen this issue there, and I'm constantly monitoring Task Manager, nvidia-smi info and so on.

Hope this helps narrow down a few things. [note I'm running the bleeding edge diffuser/torch + python 3.10.0/3.12.0 side by side install for versions.]

KrakeyMTL • Oct 15 '24 16:10

I tried it on another machine with Windows 10. I started ComfyUI with the "run_cpu.bat" file and just used the default workflow. Without any modification to the code, CPU utilization was about 64%; after adjusting the code, it was about 100%. (Screenshots: CPU2, CPU1)

Moheeb77 • Oct 15 '24 17:10

Well, not sure what to say. I was just throwing in my 2 cents that I've never seen this before across my machines, that's all.

Clearly your env has something different, or the setup for whatever AI you are using is different. It's possible that what RandomGitUser321 said above is right and Python, rather than Windows, is the culprit here in the end, if it is doing the same thing.

KrakeyMTL • Oct 15 '24 17:10