
Weird variable leak issue

Open thomasw21 opened this issue 3 years ago • 8 comments

Prior to the following commit: https://github.com/mitsuba-renderer/drjit/commit/6cf418c0920201fb241c41a232e1697fdf334ac5

I'd get

drjit-autodiff: variable leak detected (2 variables remain in use)!
 - variable a8760 (1 references)
 - variable a8761 (1 references)

Now it has become a scarier error:

jit_shutdown(): detected variable leaks:
 - variable r57096 is still being referenced! (int_ref=0, ext_ref=1, se_ref=0, type=uint32, size=1, stmt="<null>", dep=[0, 0, 0, 0])
 - variable r52 is still being referenced! (int_ref=0, ext_ref=1, se_ref=0, type=uint32, size=1, stmt="<null>", dep=[0, 0, 0, 0])
 - variable r21 is still being referenced! (int_ref=0, ext_ref=1, se_ref=0, type=float32, size=95, stmt="$r0 = call <$w x $t0> @llvm.fma.v$w$a1(<$w x $t1> $r1, <$w x $t2> $r2, <$w x $t3> $r3)$[declare <$w x $t0> @llvm.fma.v$w$a1(<$w x $t1>, <$w x $t2>, <$w x $t3>)$]", dep=[18, 3, 15, 0])
 - variable r17838 is still being referenced! (int_ref=0, ext_ref=1, se_ref=0, type=float32, size=1, stmt="$r0 = fneg <$w x $t0> $r1", dep=[17426, 0, 0, 0])
 - variable r40 is still being referenced! (int_ref=0, ext_ref=9, se_ref=0, type=float32, size=1, stmt="<literal>", dep=[0, 0, 0, 0])
 - variable r2 is still being referenced! (int_ref=3, ext_ref=1, se_ref=0, type=float32, size=95, stmt="<null>", dep=[0, 0, 0, 0])
 - variable r20633 is still being referenced! (int_ref=0, ext_ref=2, se_ref=0, type=float32, size=110592, stmt="<null>", dep=[0, 0, 0, 0])
 - variable r57067 is still being referenced! (int_ref=0, ext_ref=1, se_ref=0, type=float32, size=4096, stmt="<null>", dep=[0, 0, 0, 0])
 - variable r57095 is still being referenced! (int_ref=0, ext_ref=1, se_ref=0, type=uint32, size=1, stmt="<null>", dep=[0, 0, 0, 0])
 - variable r10 is still being referenced! (int_ref=1, ext_ref=0, se_ref=0, type=float32, size=95, stmt="$r0 = fmul <$w x $t0> $r1, $r2", dep=[7, 1, 0, 0])
 - (skipping remainder)
jit_shutdown(): 68 variables are still referenced!
jit_registry_shutdown(): LLVM registry leaked 2 forward and 1 reverse mappings!
jit_registry_shutdown(): LLVM registry leaked 2 attributes!
jit_malloc_shutdown(): leaked
 - host-async memory: 4.097 MiB in 31 allocations
libc++abi: terminating with uncaught exception of type std::runtime_error: jit_init_thread_state(): the LLVM backend is inactive because the LLVM shared library ("libLLVM.dylib") could not be found! Set the DRJIT_LIBLLVM_PATH environment variable to specify its path.
[1]    18938 abort      python -m text_to_3d_mistuba.train --save-model $(pwd)/models/dummy_test  1

Should I be worried?

Unfortunately, since I don't know why I had a leak even before the commit, it's quite hard to put together a small reproducible example.

Update

I reran the NeRF tutorial (concatenating everything into a single script) and got the same error.

thomasw21 avatar Oct 08 '22 13:10 thomasw21

Yes, this is a known issue. I am looking into it right now; it should be fixed soon.

Speierers avatar Oct 10 '22 06:10 Speierers

Did you find out why you were getting the error before the commit? I'm getting the same thing. I enabled gradients on a single scene parameter (which I narrowed down using params.keep(r'...')) and called mi.render(scene, params). The output has gradients enabled, but if I call dr.backward on the output of mi.render, I get the variable leak error.

aditsharma-projects avatar Dec 03 '22 23:12 aditsharma-projects

Unfortunately no :S

thomasw21 avatar Dec 05 '22 18:12 thomasw21

I have the same problem. I worked step by step through the PyTorch and Mitsuba interoperability example. When I run it as an .ipynb notebook there is no problem, but when I save it as a .py script the problem appears. Why does this happen? It has been bothering me for a long time. Can anyone give me some hints?

The error is below:

drjit-autodiff: variable leak detected (3 variables remain in use)!
 - variable a1 (1 references)
 - variable a2981 (1 references)
 - variable a5 (1 references)

xdobetter avatar Apr 01 '23 03:04 xdobetter

Does this happen when your script finishes?

This is nothing to worry about. The Python shutdown sequence is rather complex and some Dr.Jit variables get freed in an undesired order, hence the warning message.

(It only happens in the Python script and not in the notebook as the notebook's kernel keeps running even once you've executed all cells.)
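As a rough illustration of the ordering point above, here is a stdlib-only sketch of dropping references and running the garbage collector before a script ends. `TrackedArray`, `leak_report` and `train` are hypothetical stand-ins, not Dr.Jit APIs, and nothing in this thread confirms that doing this silences the Dr.Jit warning in a real script.

```python
import gc
import weakref


class TrackedArray:
    """Hypothetical stand-in for a JIT-backed array (not part of Dr.Jit)."""

    # Weak references only: membership here does not keep objects alive.
    live = weakref.WeakSet()

    def __init__(self, name):
        self.name = name
        TrackedArray.live.add(self)


def leak_report():
    """Names of the tracked objects that are still alive."""
    return sorted(v.name for v in TrackedArray.live)


def train():
    params = TrackedArray("params")
    grad = TrackedArray("grad")
    # A reference cycle: plain reference counting cannot free these two,
    # so they outlive the function until the cyclic collector runs.
    params.grad = grad
    grad.owner = params
    return TrackedArray("loss")


loss = train()
before = leak_report()  # objects still alive while the script is running

# Explicit cleanup before the interpreter starts shutting down:
del loss      # drop the last direct reference held by the script
gc.collect()  # break the params <-> grad cycle
after = leak_report()

print("before cleanup:", before)
print("after cleanup:", after)
```

The idea is only that objects freed before shutdown begins cannot be reported at shutdown; objects held by module globals or by other libraries are outside what this sketch covers.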

njroussel avatar Apr 03 '23 06:04 njroussel

Thank you again for your quick reply; it made me less worried about it.

xdobetter avatar Apr 03 '23 08:04 xdobetter

Is there a particular module (or modules) that would be easy enough to manually del to make sure this error doesn't pop up? It causes a hang on shutdown that interferes with how I want to script and forces me to kill the terminal to resolve it.

skyler14 avatar Sep 20 '23 01:09 skyler14

Hi @skyler14

This is the main module that you would want to isolate, I think: drjit.drjit_ext
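If the concern is the hang at shutdown rather than the message itself, a different technique from deleting modules is to isolate the whole job in a child process with a timeout, so a stuck exit cannot block the calling script. This is a minimal sketch under that assumption; `run_isolated` and the `CHILD` snippet are hypothetical placeholders, and the child code here does not import Dr.Jit.

```python
import subprocess
import sys


def run_isolated(code, timeout_s=60.0):
    """Run `code` in a child interpreter; give up if it does not exit in time.

    Returns (returncode, stdout), or (None, "") if the timeout expired.
    """
    try:
        proc = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,
            text=True,
            timeout=timeout_s,  # subprocess.run kills the child on timeout
        )
    except subprocess.TimeoutExpired:
        return None, ""
    return proc.returncode, proc.stdout


# Placeholder for the real workload (for example, a training script).
CHILD = "print('render done')"

returncode, output = run_isolated(CHILD, timeout_s=30.0)
print(returncode, output.strip())
```

Anything the job needs to hand back should be written to disk or stdout before the child exits, since the parent never shares its Python objects.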

njroussel avatar Sep 25 '23 11:09 njroussel