
CUDA error: operation not supported, in KSampler

Open phalexo opened this issue 2 years ago • 8 comments

Running on Ubuntu 20.04

Error occurred when executing KSampler:

CUDA error: operation not supported CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect. For debugging consider passing CUDA_LAUNCH_BLOCKING=1. Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.

  File "/home/developer/ComfyUI/execution.py", line 151, in recursive_execute
    output_data, output_ui = get_output_data(obj, input_data_all)
  File "/home/developer/ComfyUI/execution.py", line 81, in get_output_data
    return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
  File "/home/developer/ComfyUI/execution.py", line 74, in map_node_over_list
    results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
  File "/home/developer/ComfyUI/nodes.py", line 1206, in sample
    return common_ksampler(model, seed, steps, cfg, sampler_name, scheduler, positive, negative, latent_image, denoise=denoise)
  File "/home/developer/ComfyUI/nodes.py", line 1176, in common_ksampler
    samples = comfy.sample.sample(model, noise, steps, cfg, sampler_name, scheduler, positive, negative, latent_image,
  File "/home/developer/ComfyUI/comfy/sample.py", line 75, in sample
    comfy.model_management.load_model_gpu(model)
  File "/home/developer/ComfyUI/comfy/model_management.py", line 299, in load_model_gpu
    real_model.to(torch_dev)
  File "/home/developer/mambaforge/envs/SDWebUI/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1145, in to
    return self._apply(convert)
  File "/home/developer/mambaforge/envs/SDWebUI/lib/python3.10/site-packages/torch/nn/modules/module.py", line 797, in _apply
    module._apply(fn)
  File "/home/developer/mambaforge/envs/SDWebUI/lib/python3.10/site-packages/torch/nn/modules/module.py", line 797, in _apply
    module._apply(fn)
  File "/home/developer/mambaforge/envs/SDWebUI/lib/python3.10/site-packages/torch/nn/modules/module.py", line 797, in _apply
    module._apply(fn)
  File "/home/developer/mambaforge/envs/SDWebUI/lib/python3.10/site-packages/torch/nn/modules/module.py", line 820, in _apply
    param_applied = fn(param)
  File "/home/developer/mambaforge/envs/SDWebUI/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1143, in convert
    return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
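As a side note for anyone debugging this: the suggestion in the error text can be applied from the shell before launching ComfyUI. A minimal sketch (debugging only; synchronous launches are slow, so unset the variable afterwards):

```shell
# Make CUDA kernel launches synchronous so the Python stack trace
# points at the call that actually failed, then start ComfyUI.
export CUDA_LAUNCH_BLOCKING=1
echo "CUDA_LAUNCH_BLOCKING=$CUDA_LAUNCH_BLOCKING"
```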

phalexo avatar Aug 08 '23 01:08 phalexo

Refer to https://github.com/comfyanonymous/ComfyUI/issues/940 to use --disable-cuda-malloc
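For reference, on a Linux install the flag is passed straight on the launch command (the path to the ComfyUI checkout is assumed; adjust to your install):

```shell
# Use PyTorch's native caching allocator instead of cudaMallocAsync.
cd ~/ComfyUI
python main.py --disable-cuda-malloc
```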

haoqiangyu avatar Aug 08 '23 07:08 haoqiangyu

Refer to #940 to use --disable-cuda-malloc

I also encountered the same problem. With this parameter added there is no error, but generating an image takes more than an hour. I think this is definitely abnormal.

qiufu2000 avatar Oct 18 '23 07:10 qiufu2000

Error occurred when executing KSampler:

CUDA error: operation not supported CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect. For debugging consider passing CUDA_LAUNCH_BLOCKING=1. Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.

  File "/home/jk/ComfyUI/execution.py", line 155, in recursive_execute
    output_data, output_ui = get_output_data(obj, input_data_all)
  File "/home/jk/ComfyUI/execution.py", line 85, in get_output_data
    return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
  File "/home/jk/ComfyUI/execution.py", line 78, in map_node_over_list
    results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
  File "/home/jk/ComfyUI/nodes.py", line 1355, in sample
    return common_ksampler(model, seed, steps, cfg, sampler_name, scheduler, positive, negative, latent_image, denoise=denoise)
  File "/home/jk/ComfyUI/nodes.py", line 1325, in common_ksampler
    samples = comfy.sample.sample(model, noise, steps, cfg, sampler_name, scheduler, positive, negative, latent_image,
  File "/home/jk/ComfyUI/comfy/sample.py", line 93, in sample
    real_model, positive_copy, negative_copy, noise_mask, models = prepare_sampling(model, noise.shape, positive, negative, noise_mask)
  File "/home/jk/ComfyUI/comfy/sample.py", line 86, in prepare_sampling
    comfy.model_management.load_models_gpu([model] + models, model.memory_required([noise_shape[0] * 2] + list(noise_shape[1:])) + inference_memory)
  File "/home/jk/ComfyUI/comfy/model_management.py", line 434, in load_models_gpu
    cur_loaded_model = loaded_model.model_load(lowvram_model_memory)
  File "/home/jk/ComfyUI/comfy/model_management.py", line 301, in model_load
    raise e
  File "/home/jk/ComfyUI/comfy/model_management.py", line 297, in model_load
    self.real_model = self.model.patch_model(device_to=patch_model_to) #TODO: do something with loras and offloading to CPU
  File "/home/jk/ComfyUI/comfy/model_patcher.py", line 210, in patch_model
    self.model.to(device_to)
  File "/home/jk/miniconda3/envs/tk/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1160, in to
    return self._apply(convert)
  File "/home/jk/miniconda3/envs/tk/lib/python3.10/site-packages/torch/nn/modules/module.py", line 810, in _apply
    module._apply(fn)
  File "/home/jk/miniconda3/envs/tk/lib/python3.10/site-packages/torch/nn/modules/module.py", line 810, in _apply
    module._apply(fn)
  File "/home/jk/miniconda3/envs/tk/lib/python3.10/site-packages/torch/nn/modules/module.py", line 810, in _apply
    module._apply(fn)
  File "/home/jk/miniconda3/envs/tk/lib/python3.10/site-packages/torch/nn/modules/module.py", line 833, in _apply
    param_applied = fn(param)
  File "/home/jk/miniconda3/envs/tk/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1158, in convert
    return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)

jakeytan avatar Jan 18 '24 14:01 jakeytan

Go to "run_nvidia_gpu.bat", edit it, and add the argument --disable-cuda-malloc before --windows-standalone-build. Save it and run it; the GPU will then run with the native allocator. I hope this solves your problem.
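For the standalone Windows build, the edited run_nvidia_gpu.bat would look roughly like this (a sketch; the exact original line may differ between releases, so only add the new flag to whatever is already there):

```bat
.\python_embeded\python.exe -s ComfyUI\main.py --disable-cuda-malloc --windows-standalone-build
pause
```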

Gsirawan avatar Feb 02 '24 15:02 Gsirawan

Go to "run_nvidia_gpu.bat", edit it, and add the argument --disable-cuda-malloc before --windows-standalone-build. Save it and run it; the GPU will then run with the native allocator. I hope this solves your problem.

Hello! Thank you for your message; I had a similar problem and this helped. But it left me wondering: what does "the GPU card will run on native" mean? I'm curious what --disable-cuda-malloc does and whether it reduces performance or something.

This is my error from before I used --disable-cuda-malloc, on a clean new setup of ComfyUI on a new machine.

Error occurred when executing CLIPTextEncode:

CUDA error: operation not supported Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.

  File "C:\Users\AMT-WIN10\Comfyui\ComfyUI_windows_portable\ComfyUI\execution.py", line 152, in recursive_execute
    output_data, output_ui = get_output_data(obj, input_data_all)
                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\AMT-WIN10\Comfyui\ComfyUI_windows_portable\ComfyUI\execution.py", line 82, in get_output_data
    return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\AMT-WIN10\Comfyui\ComfyUI_windows_portable\ComfyUI\execution.py", line 75, in map_node_over_list
    results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\AMT-WIN10\Comfyui\ComfyUI_windows_portable\ComfyUI\nodes.py", line 56, in encode
    cond, pooled = clip.encode_from_tokens(tokens, return_pooled=True)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\AMT-WIN10\Comfyui\ComfyUI_windows_portable\ComfyUI\comfy\sd.py", line 127, in encode_from_tokens
    self.load_model()
  File "C:\Users\AMT-WIN10\Comfyui\ComfyUI_windows_portable\ComfyUI\comfy\sd.py", line 144, in load_model
    model_management.load_model_gpu(self.patcher)
  File "C:\Users\AMT-WIN10\Comfyui\ComfyUI_windows_portable\ComfyUI\comfy\model_management.py", line 440, in load_model_gpu
    return load_models_gpu([model])
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\AMT-WIN10\Comfyui\ComfyUI_windows_portable\ComfyUI\comfy\model_management.py", line 434, in load_models_gpu
    cur_loaded_model = loaded_model.model_load(lowvram_model_memory)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\AMT-WIN10\Comfyui\ComfyUI_windows_portable\ComfyUI\comfy\model_management.py", line 301, in model_load
    raise e
  File "C:\Users\AMT-WIN10\Comfyui\ComfyUI_windows_portable\ComfyUI\comfy\model_management.py", line 297, in model_load
    self.real_model = self.model.patch_model(device_to=patch_model_to) #TODO: do something with loras and offloading to CPU
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\AMT-WIN10\Comfyui\ComfyUI_windows_portable\ComfyUI\comfy\model_patcher.py", line 210, in patch_model
    self.model.to(device_to)
  File "C:\Users\AMT-WIN10\Comfyui\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1152, in to
    return self._apply(convert)
           ^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\AMT-WIN10\Comfyui\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 802, in _apply
    module._apply(fn)
  File "C:\Users\AMT-WIN10\Comfyui\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 802, in _apply
    module._apply(fn)
  File "C:\Users\AMT-WIN10\Comfyui\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 802, in _apply
    module._apply(fn)
  [Previous line repeated 2 more times]
  File "C:\Users\AMT-WIN10\Comfyui\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 825, in _apply
    param_applied = fn(param)
                    ^^^^^^^^^
  File "C:\Users\AMT-WIN10\Comfyui\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1150, in convert
    return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

olegpars avatar Feb 05 '24 09:02 olegpars

Help! Error occurred when executing CLIPTextEncode:

CUDA error: operation not supported CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect. For debugging consider passing CUDA_LAUNCH_BLOCKING=1. Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.

  File "F:\Program Files\COMFYUI1.3G\ComfyUI\execution.py", line 151, in recursive_execute
    output_data, output_ui = get_output_data(obj, input_data_all)
                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "F:\Program Files\COMFYUI1.3G\ComfyUI\execution.py", line 81, in get_output_data
    return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "F:\Program Files\COMFYUI1.3G\ComfyUI\execution.py", line 74, in map_node_over_list
    results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "F:\Program Files\COMFYUI1.3G\ComfyUI\nodes.py", line 58, in encode
    cond, pooled = clip.encode_from_tokens(tokens, return_pooled=True)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "F:\Program Files\COMFYUI1.3G\ComfyUI\comfy\sd.py", line 135, in encode_from_tokens
    self.load_model()
  File "F:\Program Files\COMFYUI1.3G\ComfyUI\comfy\sd.py", line 155, in load_model
    model_management.load_model_gpu(self.patcher)
  File "F:\Program Files\COMFYUI1.3G\ComfyUI\comfy\model_management.py", line 453, in load_model_gpu
    return load_models_gpu([model])
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "F:\Program Files\COMFYUI1.3G\ComfyUI\comfy\model_management.py", line 447, in load_models_gpu
    cur_loaded_model = loaded_model.model_load(lowvram_model_memory)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "F:\Program Files\COMFYUI1.3G\ComfyUI\comfy\model_management.py", line 304, in model_load
    raise e
  File "F:\Program Files\COMFYUI1.3G\ComfyUI\comfy\model_management.py", line 300, in model_load
    self.real_model = self.model.patch_model(device_to=patch_model_to, patch_weights=load_weights)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "F:\Program Files\COMFYUI1.3G\ComfyUI\comfy\model_patcher.py", line 259, in patch_model
    self.model.to(device_to)
  File "F:\Program Files\COMFYUI1.3G\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1152, in to
    return self._apply(convert)
           ^^^^^^^^^^^^^^^^^^^^
  File "F:\Program Files\COMFYUI1.3G\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 802, in _apply
    module._apply(fn)
  File "F:\Program Files\COMFYUI1.3G\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 802, in _apply
    module._apply(fn)
  File "F:\Program Files\COMFYUI1.3G\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 802, in _apply
    module._apply(fn)
  [Previous line repeated 2 more times]
  File "F:\Program Files\COMFYUI1.3G\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 825, in _apply
    param_applied = fn(param)
                    ^^^^^^^^^
  File "F:\Program Files\COMFYUI1.3G\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1150, in convert
    return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)

paoxiaolang avatar Apr 21 '24 08:04 paoxiaolang

Help! I have reinstalled Windows 11 and ComfyUI several times without resolving the problem below; it gets stuck at the prompt node with a red warning. How do I fix this? Error occurred when executing CLIPTextEncode:

CUDA error: operation not supported CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect. For debugging consider passing CUDA_LAUNCH_BLOCKING=1. Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.

  File "F:\Program Files\COMFYUI1.3G\ComfyUI\execution.py", line 151, in recursive_execute
    output_data, output_ui = get_output_data(obj, input_data_all)
                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "F:\Program Files\COMFYUI1.3G\ComfyUI\execution.py", line 81, in get_output_data
    return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "F:\Program Files\COMFYUI1.3G\ComfyUI\execution.py", line 74, in map_node_over_list
    results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "F:\Program Files\COMFYUI1.3G\ComfyUI\nodes.py", line 58, in encode
    cond, pooled = clip.encode_from_tokens(tokens, return_pooled=True)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "F:\Program Files\COMFYUI1.3G\ComfyUI\comfy\sd.py", line 135, in encode_from_tokens
    self.load_model()
  File "F:\Program Files\COMFYUI1.3G\ComfyUI\comfy\sd.py", line 155, in load_model
    model_management.load_model_gpu(self.patcher)
  File "F:\Program Files\COMFYUI1.3G\ComfyUI\comfy\model_management.py", line 453, in load_model_gpu
    return load_models_gpu([model])
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "F:\Program Files\COMFYUI1.3G\ComfyUI\comfy\model_management.py", line 447, in load_models_gpu
    cur_loaded_model = loaded_model.model_load(lowvram_model_memory)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "F:\Program Files\COMFYUI1.3G\ComfyUI\comfy\model_management.py", line 304, in model_load
    raise e
  File "F:\Program Files\COMFYUI1.3G\ComfyUI\comfy\model_management.py", line 300, in model_load
    self.real_model = self.model.patch_model(device_to=patch_model_to, patch_weights=load_weights)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "F:\Program Files\COMFYUI1.3G\ComfyUI\comfy\model_patcher.py", line 259, in patch_model
    self.model.to(device_to)
  File "F:\Program Files\COMFYUI1.3G\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1152, in to
    return self._apply(convert)
           ^^^^^^^^^^^^^^^^^^^^
  File "F:\Program Files\COMFYUI1.3G\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 802, in _apply
    module._apply(fn)
  File "F:\Program Files\COMFYUI1.3G\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 802, in _apply
    module._apply(fn)
  File "F:\Program Files\COMFYUI1.3G\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 802, in _apply
    module._apply(fn)
  [Previous line repeated 2 more times]
  File "F:\Program Files\COMFYUI1.3G\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 825, in _apply
    param_applied = fn(param)
                    ^^^^^^^^^
  File "F:\Program Files\COMFYUI1.3G\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1150, in convert
    return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)

paoxiaolang avatar Apr 21 '24 08:04 paoxiaolang

I am using a Tesla P40 card and getting the same error. I am deploying with this docker-compose.yaml:

# stable diffusion

  stable-diffusion-download:
    build: ./stable-diffusion-webui-docker/services/download/
    image: comfy-download
    environment:
      - PUID=${PUID:-1000}
      - PGID=${PGID:-1000}
    volumes:
      - /etc/localtime:/etc/localtime:ro
      - /etc/timezone:/etc/timezone:ro
      - ./stable-diffusion-webui-docker/data:/data

  stable-diffusion-webui:
    build: ./stable-diffusion-webui-docker/services/comfy/
    image: comfy-ui
    environment:
      - PUID=${PUID:-1000}
      - PGID=${PGID:-1000}
      - CLI_ARGS=
    volumes:
      - /etc/localtime:/etc/localtime:ro
      - /etc/timezone:/etc/timezone:ro
      - ./stable-diffusion-webui-docker/data:/data
      - ./stable-diffusion-webui-docker/output:/output

    stop_signal: SIGKILL
    tty: true
    deploy:
      resources:
        reservations:
          devices:
              - driver: nvidia
                device_ids: ['0']
                capabilities: [compute, utility]
    restart: unless-stopped
    networks:
      - traefik
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.stable-diffusion.rule=Host(`stable-diffusion.local.example.com`)"
      - "traefik.http.routers.stable-diffusion.entrypoints=https"
      - "traefik.http.routers.stable-diffusion.tls=true"
      - "traefik.http.routers.stable-diffusion.tls.certresolver=cloudflare"
      - "traefik.http.services.stable-diffusion.loadbalancer.server.port=7860"
      - "traefik.http.routers.stable-diffusion.middlewares=default-headers@file"

Where do I set that parameter? I cloned your repo before running the build command. P.S. I am using Ubuntu Server.

fahadshery avatar Aug 15 '24 17:08 fahadshery

Is this resolved? I am getting the same error.

fahadshery avatar Aug 23 '24 14:08 fahadshery

Is this resolved? I am getting the same error.

The options mentioned above are CLI options passed when executing `python ComfyUI/main.py`; they are not set at the docker compose level.
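With the compose file quoted above, the natural place is the `CLI_ARGS` environment variable, which the stable-diffusion-webui-docker images forward to `main.py` (an assumption based on that project's conventions; verify against the image's entrypoint). A sketch of the relevant fragment:

```yaml
  stable-diffusion-webui:
    environment:
      - PUID=${PUID:-1000}
      - PGID=${PGID:-1000}
      - CLI_ARGS=--disable-cuda-malloc
```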

ltdrdata avatar Aug 23 '24 22:08 ltdrdata