nnDetection [Question] Code adaptation help

:question: Question

Hello, unfortunately, when running nndet_train, I found that I didn't have enough memory, as shown in figure 1.

So I tried deleting some files from the directory ${det_data}/Task000D3_Example/preprocessed/D3V001_3d/imagesTr. When I ran nndet_train again, I got the error in figure 2, so I realized that nnDetection needs the complete dataset.

However, as I have limited memory, I would like to know whether it is possible to change the number of files needed to run the training and, if so, where.

Summary:

Question 1: Are the files consumed by the training step actually in this directory: ${det_data}/Task000D3_Example/preprocessed/D3V001_3d/imagesTr?
Question 2: Where can I change the code so that the training step finishes without errors, given the amount of memory available? Any suggestions for code changes would be welcome. I imagine a possible solution involves lines 103 to 108 here.

Figure 1

Figure 2

Aug 05 '22 13:08 aldemirfilho

Dear @aldemirfilho ,

the error indicates insufficient memory on the GPU, so deleting case files won't solve the Issue.

The error you posted occurs because the split was generated automatically when the first training was started, and some of the cases it references are now missing after you deleted them.
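
One way to confirm this is to compare the case IDs a split expects with the files still on disk. The sketch below is only illustrative: `find_missing_cases` is a hypothetical helper, not part of nnDetection, and it assumes each case is stored as one or more files whose name up to the first dot is the case ID. How nnDetection stores the split itself is not covered here, so the list of IDs is passed in directly.

```python
from pathlib import Path


def find_missing_cases(split_case_ids, data_dir):
    """Return the case IDs from a split that have no file in data_dir.

    Assumption: a case "case_0" is stored as files such as "case_0.npz",
    i.e. the part of the file name before the first dot is the case ID.
    """
    # Collect the case IDs that still have at least one file on disk.
    on_disk = {p.name.split(".")[0] for p in Path(data_dir).iterdir() if p.is_file()}
    # Anything the split references but the directory lacks is missing.
    return sorted(set(split_case_ids) - on_disk)
```

Called with the IDs from the split and the imagesTr directory from the question, it would list the cases that were deleted. Restoring those files should make the existing split valid again.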

Best, Michael

Aug 09 '22 09:08 mibaumgartner

Thank you, Michael!

Even knowing that this can lead to a drop in the quality of the results, what adaptation can be made so that I can finish the training and visualize the results?

I'm working in college with limited resources and would need to get some results.

So, is it possible to adapt the code to reduce the memory needed?

Aug 09 '22 10:08 aldemirfilho

Dear @aldemirfilho ,

I think the best way to scale nnDetection down when only a limited amount of memory is available is to reduce the number of channels. This can be done by changing some lines in the config file https://github.com/MIC-DKFZ/nnDetection/blob/main/nndet/conf/train/v001.yaml, for example:

plan_arch_overwrites: # overwrite arguments of architecture
  start_channels: 16
  fpn_channels: 64
  head_channels: 64
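
To see why fewer channels help, a rough back-of-the-envelope estimate can be useful. The sketch below is not nnDetection code: the helper names, the 3x3x3 kernel, the patch size and the comparison of 32 versus 16 channels are illustrative assumptions. It only shows that the activations of a 3D convolution layer scale linearly with the channel count, while its weights scale with the product of input and output channels.

```python
def conv3d_weight_count(in_channels, out_channels, kernel_size=3):
    """Number of weights plus biases in one 3D convolution layer."""
    return kernel_size ** 3 * in_channels * out_channels + out_channels


def activation_bytes(channels, patch_shape, batch_size=1, bytes_per_value=4):
    """Memory of one float32 feature map with the given channel count."""
    depth, height, width = patch_shape
    return batch_size * channels * depth * height * width * bytes_per_value


# Illustrative values only, not nnDetection defaults.
patch = (64, 128, 128)
for channels in (32, 16):
    weights = conv3d_weight_count(channels, channels)
    act_mib = activation_bytes(channels, patch) / 2 ** 20
    print(f"{channels} channels: {weights} weights, {act_mib:.0f} MiB per feature map")
```

Under these assumptions, halving the channels halves the activation memory of that layer and cuts its weights to roughly a quarter. The actual savings in nnDetection depend on the full architecture and patch size.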

Best, Michael

Aug 11 '22 08:08 mibaumgartner

Since there was no update for some time I’ll close this Issue for now. Please feel free to reopen this Issue if the problem persists or open a new one.

Jan 12 '23 12:01 mibaumgartner