multinerf

Original error: UNIMPLEMENTED: DNN library is not found.

Open Cerf-Volant425 opened this issue 3 years ago • 2 comments

After configuring the environment, I always get the errors below when training and testing on the flower scene of the LLFF dataset.

My setting is:

  • cuda: 11.1.74
  • tensorflow: 2.7.0
  • jaxlib: 0.4.1+cuda11.cudnn86

Can you give me some suggestions to avoid these errors? Thanks in advance.

Cerf-Volant425, Dec 20 '22 00:12

First, try setting the following environment variables:

    export XLA_PYTHON_CLIENT_PREALLOCATE=false
    export XLA_FLAGS="--xla_gpu_strict_conv_algorithm_picker=false --xla_gpu_force_compilation_parallelism=1"

If that does not work, try reducing batch_size or upgrading cuDNN to version 8.6.0 or higher. Hope this helps.
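If the training script is launched from Python rather than from a shell, the same settings can be applied in-process. The sketch below is only an illustration: `configure_xla_env` is a hypothetical helper name, and it assumes the variables are set before `jax` is imported for the first time. Note that it overwrites any existing `XLA_FLAGS` value.

```python
import os


def configure_xla_env(env=os.environ):
    """Apply the settings suggested above to a dict-like environment.

    Hypothetical helper: call it before the first `import jax`, because
    these variables are assumed to be read when the backend initializes.
    """
    # Do not let the XLA client preallocate GPU memory up front.
    env["XLA_PYTHON_CLIENT_PREALLOCATE"] = "false"
    # Same flags as in the export command; this replaces any existing value.
    env["XLA_FLAGS"] = (
        "--xla_gpu_strict_conv_algorithm_picker=false "
        "--xla_gpu_force_compilation_parallelism=1"
    )
    return env


configure_xla_env()
# import jax  # import only after the environment has been configured
```

Passing a plain dict instead of `os.environ` makes it easy to inspect the values without touching the real environment.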

AuthorityWang, Dec 22 '22 07:12

In addition to the comments above: your error log shows a mismatch between the cuDNN version your pre-compiled jaxlib was built against (8.6.0, per the cudnn86 suffix) and the cuDNN version you have installed (8.1.0). See here for details on how to align the cuDNN versions.
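A rough way to spot this kind of mismatch from the version strings alone is sketched below. Both function names are hypothetical, and the sketch rests on two assumptions: that the jaxlib local version tag has the form `cudnn<major><minor>` with a single-digit major version (as in `0.4.1+cuda11.cudnn86`), and that an installed cuDNN is acceptable when it has the same major version and at least the minor version jaxlib was built against. The installed version has to be supplied by hand here.

```python
import re


def cudnn_from_jaxlib_version(version):
    """Extract (major, minor) from a tag such as '0.4.1+cuda11.cudnn86'.

    Assumes a single-digit major version; returns None if no tag is found.
    """
    match = re.search(r"cudnn(\d)(\d+)", version)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def cudnn_is_compatible(installed, built_against):
    """Assumed rule: same major version, installed minor >= built minor."""
    return (installed[0] == built_against[0]
            and installed[1] >= built_against[1])


built = cudnn_from_jaxlib_version("0.4.1+cuda11.cudnn86")
print(built)                               # (8, 6)
print(cudnn_is_compatible((8, 1), built))  # False: the setup in this issue
print(cudnn_is_compatible((8, 6), built))  # True
```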

Nevertheless, after doing all of this, I still end up with: INTERNAL: Failed to load in-memory CUBIN: CUDA_ERROR_OUT_OF_MEMORY: out of memory.

deeepwin, Apr 10 '23 08:04