
OOM when torch compile is added to the workflow

Open slmonker opened this issue 3 months ago • 14 comments

Custom Node Testing

Expected Behavior

Torch compile can be added to the Wan2.2 native workflow without causing OOM.

Actual Behavior

[screenshots]

Steps to Reproduce

After updating to version 0.3.65, the native ComfyUI workflow for Wan2.2 encounters OOM (out of memory) whenever torch compile is added. This occurs both with the torch compile node from kjnodes and with ComfyUI's native torch compile.
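
For context, "adding torch compile" here means wrapping the diffusion model in torch.compile, roughly like this minimal sketch (the `patcher` / `diffusion_model` names are illustrative, not ComfyUI's exact internals):

```python
import torch

# Minimal sketch of what a torch compile node conceptually does.
# `patcher` and its `diffusion_model` attribute are illustrative names.
def add_torch_compile(patcher):
    patcher.diffusion_model = torch.compile(
        patcher.diffusion_model,  # the Wan2.2 DiT in this workflow
        backend="inductor",
        dynamic=False,
    )
    return patcher
```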

Debug Logs

oom

Other

No response

slmonker avatar Oct 14 '25 16:10 slmonker

GPU info

  • RTX 5090D
  • 32 GB VRAM

[screenshot]

ComfyUI 0.3.65

[screenshot]

ComfyUI 0.3.64

[screenshots]

comfyui-wiki avatar Oct 14 '25 16:10 comfyui-wiki

After checking out commit 27ffd12c45d4237338fe8789779313db9bab59f1 (one commit before the "WAN2.2: Fix cache VRAM leak on error" commit):

[screenshot]

The OOM issue no longer occurs, so it seems that change is what's causing the issue.

[screenshot]

comfyui-wiki avatar Oct 14 '25 16:10 comfyui-wiki

Has this problem been resolved? I had the same problem around the same time and still get OOM in v0.3.67.

Using gguf and the native ComfyUI workflow. When using torch compile, it either OOMs or uses almost twice the VRAM by spilling into shared GPU memory.

pytorch 2.7 and 2.9 cu128
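
(A quick way to confirm the near-doubling, as a sketch using only standard PyTorch APIs, is to log CUDA memory around the sampling call:)

```python
import torch

def log_vram(tag: str) -> None:
    # Diagnostic sketch: print allocated vs. reserved CUDA memory in MB.
    alloc = torch.cuda.memory_allocated() / 2**20
    reserved = torch.cuda.memory_reserved() / 2**20
    print(f"[{tag}] allocated={alloc:.0f} MB, reserved={reserved:.0f} MB")

# Call log_vram("before run") / log_vram("after run") around the
# KSampler execution to see where the extra VRAM goes.
```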

m8rr avatar Nov 05 '25 01:11 m8rr

In v0.3.68 I got the following error instead of OOM. Next I lowered the length from 81 to 61 and ran it, but got the same error (it succeeded in previous versions). I pressed Run again and it completed without a problem. I then increased the length back to 81 and it also ran without a problem. Overall execution time was significantly faster than in previous versions.

pytorch 2.7 cu128

```
Requested to load WAN21
0 models unloaded.
loaded partially; 128.00 MB usable, 123.96 MB loaded, 8349.94 MB offloaded, lowvram patches: 0
  0%|          | 0/2 [00:03<?, ?it/s]
!!! Exception during processing !!! AttributeError: 'UserDefinedObjectVariable' object has no attribute 'proxy'

from user code:
  File "D:\AI\ComfyUI_windows_portable\ComfyUI\comfy\ops.py", line 121, in torch_dynamo_resume_in_cast_bias_weight_at_113
    return weight, bias, offload_stream

Set TORCHDYNAMO_VERBOSE=1 for the internal stack trace (please do this especially if you're reporting a bug to PyTorch). For even more developer context, set TORCH_LOGS="+dynamo"

Traceback (most recent call last):
  File "D:\AI\ComfyUI_windows_portable\ComfyUI\execution.py", line 510, in execute
    output_data, output_ui, has_subgraph, has_pending_tasks = await get_output_data(prompt_id, unique_id, obj, input_data_all, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb, hidden_inputs=hidden_inputs)
  File "D:\AI\ComfyUI_windows_portable\ComfyUI\execution.py", line 324, in get_output_data
    return_values = await _async_map_node_over_list(prompt_id, unique_id, obj, input_data_all, obj.FUNCTION, allow_interrupt=True, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb, hidden_inputs=hidden_inputs)
  File "D:\AI\ComfyUI_windows_portable\ComfyUI\execution.py", line 298, in _async_map_node_over_list
    await process_inputs(input_dict, i)
  File "D:\AI\ComfyUI_windows_portable\ComfyUI\execution.py", line 286, in process_inputs
    result = f(**inputs)
  File "D:\AI\ComfyUI_windows_portable\ComfyUI\nodes.py", line 1525, in sample
    return common_ksampler(model, seed, steps, cfg, sampler_name, scheduler, positive, negative, latent_image, denoise=denoise)
  File "D:\AI\ComfyUI_windows_portable\ComfyUI\nodes.py", line 1492, in common_ksampler
    samples = comfy.sample.sample(model, noise, steps, cfg, sampler_name, scheduler, positive, negative, latent_image,
  File "D:\AI\ComfyUI_windows_portable\ComfyUI\comfy\sample.py", line 60, in sample
    samples = sampler.sample(noise, positive, negative, cfg=cfg, latent_image=latent_image, start_step=start_step, last_step=last_step, force_full_denoise=force_full_denoise, denoise_mask=noise_mask, sigmas=sigmas, callback=callback, disable_pbar=disable_pbar, seed=seed)
  File "D:\AI\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-TiledDiffusion\utils.py", line 51, in KSampler_sample
    return orig_fn(*args, **kwargs)
  File "D:\AI\ComfyUI_windows_portable\ComfyUI\comfy\samplers.py", line 1163, in sample
    return sample(self.model, noise, positive, negative, cfg, self.device, sampler, sigmas, self.model_options, latent_image=latent_image, denoise_mask=denoise_mask, callback=callback, disable_pbar=disable_pbar, seed=seed)
  File "D:\AI\ComfyUI_windows_portable\ComfyUI\comfy\samplers.py", line 1053, in sample
    return cfg_guider.sample(noise, latent_image, sampler, sigmas, denoise_mask, callback, disable_pbar, seed)
  File "D:\AI\ComfyUI_windows_portable\ComfyUI\comfy\samplers.py", line 1035, in sample
    output = executor.execute(noise, latent_image, sampler, sigmas, denoise_mask, callback, disable_pbar, seed, latent_shapes=latent_shapes)
  File "D:\AI\ComfyUI_windows_portable\ComfyUI\comfy\patcher_extension.py", line 112, in execute
    return self.original(*args, **kwargs)
  File "D:\AI\ComfyUI_windows_portable\ComfyUI\comfy\samplers.py", line 997, in outer_sample
    output = self.inner_sample(noise, latent_image, device, sampler, sigmas, denoise_mask, callback, disable_pbar, seed, latent_shapes=latent_shapes)
  File "D:\AI\ComfyUI_windows_portable\ComfyUI\comfy\samplers.py", line 980, in inner_sample
    samples = executor.execute(self, sigmas, extra_args, callback, noise, latent_image, denoise_mask, disable_pbar)
  File "D:\AI\ComfyUI_windows_portable\ComfyUI\comfy\patcher_extension.py", line 112, in execute
    return self.original(*args, **kwargs)
  File "D:\AI\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-TiledDiffusion\utils.py", line 34, in KSAMPLER_sample
    return orig_fn(*args, **kwargs)
  File "D:\AI\ComfyUI_windows_portable\ComfyUI\comfy\samplers.py", line 752, in sample
    samples = self.sampler_function(model_k, noise, sigmas, extra_args=extra_args, callback=k_callback, disable=disable_pbar, **self.extra_options)
  File "D:\AI\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\utils\_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "D:\AI\ComfyUI_windows_portable\ComfyUI\comfy\k_diffusion\sampling.py", line 939, in sample_dpmpp_3m_sde_gpu
    return sample_dpmpp_3m_sde(model, x, sigmas, extra_args=extra_args, callback=callback, disable=disable, eta=eta, s_noise=s_noise, noise_sampler=noise_sampler)
  File "D:\AI\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\utils\_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "D:\AI\ComfyUI_windows_portable\ComfyUI\comfy\k_diffusion\sampling.py", line 891, in sample_dpmpp_3m_sde
    denoised = model(x, sigmas[i] * s_in, **extra_args)
  File "D:\AI\ComfyUI_windows_portable\ComfyUI\comfy\samplers.py", line 401, in __call__
    out = self.inner_model(x, sigma, model_options=model_options, seed=seed)
  File "D:\AI\ComfyUI_windows_portable\ComfyUI\comfy\samplers.py", line 953, in __call__
    return self.outer_predict_noise(*args, **kwargs)
  File "D:\AI\ComfyUI_windows_portable\ComfyUI\comfy\samplers.py", line 960, in outer_predict_noise
    ).execute(x, timestep, model_options, seed)
  File "D:\AI\ComfyUI_windows_portable\ComfyUI\comfy\patcher_extension.py", line 112, in execute
    return self.original(*args, **kwargs)
  File "D:\AI\ComfyUI_windows_portable\ComfyUI\comfy\samplers.py", line 963, in predict_noise
    return sampling_function(self.inner_model, x, timestep, self.conds.get("negative", None), self.conds.get("positive", None), self.cfg, model_options=model_options, seed=seed)
  File "D:\AI\ComfyUI_windows_portable\ComfyUI\comfy\samplers.py", line 381, in sampling_function
    out = calc_cond_batch(model, conds, x, timestep, model_options)
  File "D:\AI\ComfyUI_windows_portable\ComfyUI\comfy\samplers.py", line 206, in calc_cond_batch
    return _calc_cond_batch_outer(model, conds, x_in, timestep, model_options)
  File "D:\AI\ComfyUI_windows_portable\ComfyUI\comfy\samplers.py", line 214, in _calc_cond_batch_outer
    return executor.execute(model, conds, x_in, timestep, model_options)
  File "D:\AI\ComfyUI_windows_portable\ComfyUI\comfy\patcher_extension.py", line 112, in execute
    return self.original(*args, **kwargs)
  File "D:\AI\ComfyUI_windows_portable\ComfyUI\comfy\samplers.py", line 326, in calc_cond_batch
    output = model.apply_model(input_x, timestep, **c).chunk(batch_chunks)
  File "D:\AI\ComfyUI_windows_portable\ComfyUI\comfy\model_base.py", line 161, in apply_model
    return comfy.patcher_extension.WrapperExecutor.new_class_executor(
  File "D:\AI\ComfyUI_windows_portable\ComfyUI\comfy\patcher_extension.py", line 113, in execute
    return self.wrappers[self.idx](self, *args, **kwargs)
  File "D:\AI\ComfyUI_windows_portable\ComfyUI\comfy_api\torch_helpers\torch_compile.py", line 26, in apply_torch_compile_wrapper
    return executor(*args, **kwargs)
  File "D:\AI\ComfyUI_windows_portable\ComfyUI\comfy\patcher_extension.py", line 105, in __call__
    return new_executor.execute(*args, **kwargs)
  File "D:\AI\ComfyUI_windows_portable\ComfyUI\comfy\patcher_extension.py", line 112, in execute
    return self.original(*args, **kwargs)
  File "D:\AI\ComfyUI_windows_portable\ComfyUI\comfy\model_base.py", line 203, in _apply_model
    model_output = self.diffusion_model(xc, t, context=context, control=control, transformer_options=transformer_options, **extra_conds)
  File "D:\AI\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1751, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "D:\AI\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1762, in _call_impl
    return forward_call(*args, **kwargs)
  File "D:\AI\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\_dynamo\eval_frame.py", line 655, in _fn
    return fn(*args, **kwargs)
  File "D:\AI\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1751, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "D:\AI\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1762, in _call_impl
    return forward_call(*args, **kwargs)
  File "D:\AI\ComfyUI_windows_portable\ComfyUI\comfy\ldm\wan\model.py", line 626, in forward
    return comfy.patcher_extension.WrapperExecutor.new_class_executor(
  File "D:\AI\ComfyUI_windows_portable\ComfyUI\comfy\patcher_extension.py", line 112, in execute
    return self.original(*args, **kwargs)
  File "D:\AI\ComfyUI_windows_portable\ComfyUI\comfy\ldm\wan\model.py", line 646, in _forward
    return self.forward_orig(x, timestep, context, clip_fea=clip_fea, freqs=freqs, transformer_options=transformer_options, **kwargs)[:, :, :t, :h, :w]
  File "D:\AI\ComfyUI_windows_portable\ComfyUI\comfy\ldm\wan\model.py", line 546, in forward_orig
    e = self.time_embedding(
  File "D:\AI\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1751, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "D:\AI\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1762, in _call_impl
    return forward_call(*args, **kwargs)
  File "D:\AI\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\container.py", line 240, in forward
    input = module(input)
  File "D:\AI\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1751, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "D:\AI\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1762, in _call_impl
    return forward_call(*args, **kwargs)
  File "D:\AI\ComfyUI_windows_portable\ComfyUI\comfy\ops.py", line 158, in forward
    return self.forward_comfy_cast_weights(*args, **kwargs)
  File "D:\AI\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-GGUF\ops.py", line 217, in forward_comfy_cast_weights
    out = super().forward_comfy_cast_weights(input, *args, **kwargs)
  File "D:\AI\ComfyUI_windows_portable\ComfyUI\comfy\ops.py", line 150, in forward_comfy_cast_weights
    weight, bias, offload_stream = cast_bias_weight(self, input, offloadable=True)
  File "D:\AI\ComfyUI_windows_portable\ComfyUI\comfy\ops.py", line 113, in cast_bias_weight
    weight = weight.to(dtype=dtype)
  File "D:\AI\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\_dynamo\convert_frame.py", line 1432, in __call__
    return self._torchdynamo_orig_callable(
  File "D:\AI\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\_dynamo\convert_frame.py", line 1213, in __call__
    result = self._inner_convert(
  File "D:\AI\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\_dynamo\convert_frame.py", line 598, in __call__
    return _compile(
  File "D:\AI\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\_dynamo\convert_frame.py", line 1110, in _compile
    raise InternalTorchDynamoError(
  File "D:\AI\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\_dynamo\convert_frame.py", line 1059, in _compile
    guarded_code = compile_inner(code, one_graph, hooks, transform)
  File "D:\AI\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\_utils_internal.py", line 97, in wrapper_function
    return function(*args, **kwargs)
  File "D:\AI\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\_dynamo\convert_frame.py", line 761, in compile_inner
    return _compile_inner(code, one_graph, hooks, transform)
  File "D:\AI\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\_dynamo\convert_frame.py", line 797, in _compile_inner
    out_code = transform_code_object(code, transform)
  File "D:\AI\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\_dynamo\bytecode_transformation.py", line 1422, in transform_code_object
    transformations(instructions, code_options)
  File "D:\AI\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\_dynamo\convert_frame.py", line 257, in _fn
    return fn(*args, **kwargs)
  File "D:\AI\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\_dynamo\convert_frame.py", line 715, in transform
    tracer.run()
  File "D:\AI\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\_dynamo\symbolic_convert.py", line 3498, in run
    super().run()
  File "D:\AI\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\_dynamo\symbolic_convert.py", line 1337, in run
    while self.step():
  File "D:\AI\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\_dynamo\symbolic_convert.py", line 1246, in step
    self.dispatch_table[inst.opcode](self, inst)
  File "D:\AI\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\_dynamo\symbolic_convert.py", line 3699, in RETURN_VALUE
    self._return(inst)
  File "D:\AI\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\_dynamo\symbolic_convert.py", line 3672, in _return
    and not self.symbolic_locals_contain_module_class()
  File "D:\AI\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\_dynamo\symbolic_convert.py", line 3643, in symbolic_locals_contain_module_class
    if isinstance(v, UserDefinedClassVariable) and issubclass(
  File "D:\AI\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\_dynamo\variables\base.py", line 218, in __instancecheck__
    instance = instance.realize()
  File "D:\AI\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\_dynamo\variables\lazy.py", line 67, in realize
    self._cache.realize()
  File "D:\AI\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\_dynamo\variables\lazy.py", line 33, in realize
    self.vt = VariableTracker.build(tx, self.value, source)
  File "D:\AI\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\_dynamo\variables\base.py", line 540, in build
    return builder.VariableBuilder(tx, source)(value)
  File "D:\AI\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\_dynamo\variables\builder.py", line 417, in __call__
    vt = self._wrap(value)
  File "D:\AI\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\_dynamo\variables\builder.py", line 693, in _wrap
    return self.wrap_module(value)
  File "D:\AI\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\_dynamo\variables\builder.py", line 1586, in wrap_module
    self.mark_static_input(p, guard=freezing)
  File "D:\AI\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\_dynamo\variables\builder.py", line 1512, in mark_static_input
    var.proxy.node.meta["tensor_dict"]["_dynamo_static_input_type"] = (
torch._dynamo.exc.InternalTorchDynamoError: AttributeError: 'UserDefinedObjectVariable' object has no attribute 'proxy'

from user code:
  File "D:\AI\ComfyUI_windows_portable\ComfyUI\comfy\ops.py", line 121, in torch_dynamo_resume_in_cast_bias_weight_at_113
    return weight, bias, offload_stream

Set TORCHDYNAMO_VERBOSE=1 for the internal stack trace (please do this especially if you're reporting a bug to PyTorch). For even more developer context, set TORCH_LOGS="+dynamo"

Prompt executed in 87.47 seconds
```
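
(The crash happens while Dynamo traces `cast_bias_weight`'s offload path, not in a CUDA kernel. As a generic illustration only, not what ComfyUI actually ships, one way to sidestep this kind of failure is to exclude the problematic helper from tracing with `torch.compiler.disable`, forcing a graph break around it:)

```python
import torch

# Sketch: keep Dynamo from tracing a helper it cannot handle.
# `cast_weights_eagerly` is a hypothetical stand-in, not ComfyUI's real code.
@torch.compiler.disable
def cast_weights_eagerly(layer: torch.nn.Module, x: torch.Tensor):
    weight = layer.weight.to(dtype=x.dtype, device=x.device)
    bias = None if layer.bias is None else layer.bias.to(dtype=x.dtype, device=x.device)
    return weight, bias
```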

m8rr avatar Nov 05 '25 02:11 m8rr

For torch compile, only the latest PyTorch version is supported.

comfyanonymous avatar Nov 05 '25 05:11 comfyanonymous

This is definitely still an issue, and for me it can only be resolved by reverting to v0.3.63 (as soon as I move to >= 0.3.64, the issue happens). I have tested this several times, so I'm 100% sure this is the behavior. My env looks like below:

Total VRAM 32607 MB, total RAM 128616 MB
pytorch version: 2.8.0+cu129
Enabled fp16 accumulation.
Set vram state to: NORMAL_VRAM
Device: cuda:0 NVIDIA GeForce RTX 5090 : cudaMallocAsync
Using sage attention
Python version: 3.12.10 (tags/v3.12.10:0cc8128, Apr 8 2025, 12:21:36) [MSC v.1943 64 bit (AMD64)]
ComfyUI version: 0.3.63
ComfyUI frontend version: 1.27.7

With the same workflow and the same generation parameters, 0.3.63 handles it well with both the KJ torch compile node and the native one, and uses only up to 80% of VRAM. However, 0.3.64 uses up all VRAM and immediately throws OOM.

Some mechanism must have broken. Waiting for a fix; for now I can only stay on 0.3.63.

AsadaKintoki avatar Nov 09 '25 01:11 AsadaKintoki

The portable version now includes PyTorch 2.9, so torch compile is probably only guaranteed to work with 2.9. So let's use 2.9; it works fine.
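
(If you have to stay on an older PyTorch for other reasons, a defensive guard like this sketch avoids enabling compile on versions this thread shows failing:)

```python
import torch

def maybe_compile(module: torch.nn.Module) -> torch.nn.Module:
    # Defensive sketch: only use torch.compile on PyTorch 2.9+,
    # where this workflow is reported to work; fall back to eager otherwise.
    version = torch.__version__.split("+")[0]  # e.g. "2.9.0+cu128" -> "2.9.0"
    major, minor = (int(v) for v in version.split(".")[:2])
    if (major, minor) >= (2, 9):
        return torch.compile(module)
    print(f"PyTorch {torch.__version__}: skipping torch.compile, running eager.")
    return module
```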

m8rr avatar Nov 09 '25 01:11 m8rr

The portable version now includes PyTorch 2.9, so torch compile is probably only guaranteed to work with 2.9. So let's use 2.9; it works fine.

Thanks for the reply. I tried updating everything to the latest versions but am getting some weird error now. Below are my current setup and the error.

Total VRAM 32607 MB, total RAM 128616 MB
pytorch version: 2.9.0+cu130
WARNING[XFORMERS]: xFormers can't load C++/CUDA extensions. xFormers was built for:
    PyTorch 2.7.0+cu128 with CUDA 1208 (you have 2.9.0+cu130)
    Python 3.12.10 (you have 3.12.10)
  Please reinstall xformers (see https://github.com/facebookresearch/xformers#installing-xformers)
  Memory-efficient attention, SwiGLU, sparse and more won't be available.
  Set XFORMERS_MORE_DETAILS=1 for more details
Enabled fp16 accumulation.
Set vram state to: NORMAL_VRAM
Device: cuda:0 NVIDIA GeForce RTX 5090 : cudaMallocAsync
working around nvidia conv3d memory bug.
Using sage attention
Python version: 3.12.10 (tags/v3.12.10:0cc8128, Apr 8 2025, 12:21:36) [MSC v.1943 64 bit (AMD64)]
ComfyUI version: 0.3.68
ComfyUI frontend version: 1.28.8

ComfyUI Error Report

Error Details
Node ID: 86
Node Type: KSamplerAdvanced
Exception Type: torch._dynamo.exc.InternalTorchDynamoError
Exception Message: CppCompileError: C++ compile error

Command:
cl /I E:/ComfyUI/python/Include /I E:/ComfyUI/python/Lib/site-packages/torch/include /I E:/ComfyUI/python/Lib/site-packages/torch/include/torch/csrc/api/include /D NOMINMAX /D TORCH_INDUCTOR_CPP_WRAPPER /D STANDALONE_TORCH_HEADER /D C10_USING_CUSTOM_GENERATED_MACROS /O2 /DLL /MD /std:c++20 /wd4819 /wd4251 /wd4244 /wd4267 /wd4275 /wd4018 /wd4190 /wd4624 /wd4067 /wd4068 /EHsc /Zc:__cplusplus /permissive- /openmp /openmp:experimental E:/ComfyUI/.cache/torchinductor/xi/cxihrirjih4koywoz5qlq44iqy4zqczg3xc7gudkvgmiugmmdigr.main.cpp /FeE:/ComfyUI/.cache/torchinductor/xi/cxihrirjih4koywoz5qlq44iqy4zqczg3xc7gudkvgmiugmmdigr.main.pyd /LD /link /LIBPATH:E:/ComfyUI/python/libs /LIBPATH:E:/ComfyUI/python/Lib/site-packages/torch/lib torch.lib torch_cpu.lib torch_python.lib sleef.lib c10.lib

Output:
Microsoft (R) C/C++ Optimizing Compiler Version 19.43.34810 for x64
Copyright (C) Microsoft Corporation. All rights reserved.

cl: command line warning D9025: overriding "/openmp" with "/openmp:experimental"
cxihrirjih4koywoz5qlq44iqy4zqczg3xc7gudkvgmiugmmdigr.main.cpp
E:/ComfyUI/.cache/torchinductor/xi/cxihrirjih4koywoz5qlq44iqy4zqczg3xc7gudkvgmiugmmdigr.main.cpp(2): fatal error C1083: Cannot open include file: "algorithm": No such file or directory

I guess the pytorch + triton + sage combination I picked isn't working with ComfyUI? Could you kindly share the combination in your current env?

AsadaKintoki avatar Nov 09 '25 03:11 AsadaKintoki

ComfyUI portable 0.3.68 (cu128 version), triton-windows v3.5.0-windows.post21, sageattention-2.2.0+cu128torch2.9.0andhigher.post4-cp39, and a recently updated GGUF custom node.

Run it through MSVC's x64 Native Tools Command Prompt for VS 2022, to avoid errors where files such as cl.exe or "algorithm" cannot be found. (There is an explanation in section "5. C compiler" at https://github.com/woct0rdho/triton-windows.)

**********************************************************************
** Visual Studio 2022 Developer Command Prompt v17.14.19
** Copyright (c) 2025 Microsoft Corporation
**********************************************************************
[vcvarsall.bat] Environment initialized for: 'x64'
Setting output directory to: E:\output
Checkpoint files will always be loaded safely.
Total VRAM 12282 MB, total RAM 32085 MB
pytorch version: 2.9.0+cu128
Enabled fp16 accumulation.
Set vram state to: NORMAL_VRAM
Device: cuda:0 NVIDIA GeForce RTX 4070 SUPER : cudaMallocAsync
working around nvidia conv3d memory bug.
Using sage attention
Python version: 3.12.10 (tags/v3.12.10:0cc8128, Apr  8 2025, 12:21:36) [MSC v.1943 64 bit (AMD64)]
ComfyUI version: 0.3.68
Setting temp directory to: E:\output\temp
ComfyUI frontend version: 1.28.8
[Prompt Server] web root: D:\AI\ComfyUI_windows_portable\python_embeded\Lib\site-packages\comfyui_frontend_package\static
ComfyUI-GGUF: Allowing full torch compile

m8rr avatar Nov 09 '25 04:11 m8rr

ComfyUI portable 0.3.68 (cu128 version), triton-windows v3.5.0-windows.post21, sageattention-2.2.0+cu128torch2.9.0andhigher.post4-cp39, and a recently updated GGUF custom node. Run it through MSVC's x64 Native Tools Command Prompt for VS 2022. […]

Thanks a lot! I can confirm that with this setup my workflow executes without OOM on the latest ComfyUI. Also, I'm wondering: after changing to pytorch 2.9.0, did you see better performance? I got almost the same total time.

AsadaKintoki avatar Nov 09 '25 14:11 AsadaKintoki

ComfyUI portable 0.3.68 (cu128 version), triton-windows v3.5.0-windows.post21, sageattention-2.2.0+cu128torch2.9.0andhigher.post4-cp39, and a recently updated GGUF custom node. Run it through MSVC's x64 Native Tools Command Prompt for VS 2022. […]

And / or maybe set the CC environment variable?

If you need to override the C compiler, you can set the environment variable CC. MSVC, GCC, and Clang are supported for the JIT compilation in Triton
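
(For example, a sketch of overriding the compiler before Triton is imported; the cl.exe path below is a placeholder for your own VS 2022 install:)

```python
import os

# Must be set before triton is imported by any node.
# Placeholder path: substitute the cl.exe from your VS 2022 installation.
os.environ["CC"] = (
    r"C:\Program Files\Microsoft Visual Studio\2022\Community"
    r"\VC\Tools\MSVC\14.43.34808\bin\Hostx64\x64\cl.exe"
)
```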

jovan2009 avatar Nov 09 '25 15:11 jovan2009

Also, I'm wondering: after changing to pytorch 2.9.0, did you see better performance? I got almost the same total time.

Previously I used 2.7, but torch compilation itself became much slower in 2.9 (with ComfyUI's native torch compile node). Is it because gguf and sage support full torch compile? I'm not sure.

Anyway, I think the speed has improved a little bit overall? At least it didn't get worse.

m8rr avatar Nov 09 '25 15:11 m8rr

Also, I'm wondering: after changing to pytorch 2.9.0, did you see better performance? I got almost the same total time.

Previously I used 2.7, but torch compilation itself became much slower in 2.9 (with ComfyUI's native torch compile node). Is it because gguf and sage support full torch compile? I'm not sure.

Anyway, I think the speed has improved a little bit overall? At least it didn't get worse.

Agreed. I also noticed the initial compile seems a bit slower, but since it's just warm-up it's not that critical. For overall performance, trying 1080x1080 with 129 frames gave 50 sec/step vs 47 sec/step (2.8.0 vs 2.9.0), so it seems our observations match. I think my issue is solved here; thanks again for your help! :)

AsadaKintoki avatar Nov 09 '25 16:11 AsadaKintoki

ComfyUI portable 0.3.68 (cu128 version), triton-windows v3.5.0-windows.post21, sageattention-2.2.0+cu128torch2.9.0andhigher.post4-cp39, and a recently updated GGUF custom node. Run it through MSVC's x64 Native Tools Command Prompt for VS 2022. […]

And / or maybe set the CC environment variable?

If you need to override the C compiler, you can set the environment variable CC. MSVC, GCC, and Clang are supported for the JIT compilation in Triton

Yes, there were some setup issues with my VS 2022 related kits; I redeployed them and corrected the env variables.

AsadaKintoki avatar Nov 09 '25 16:11 AsadaKintoki