How many cores/threads do you use per task/PID/nproc?

Open ANdong-star opened this issue 4 years ago • 1 comments

Hi, I use 8 Tesla V100 GPUs (32 GB each) and want to train your model, but my FPS is lower than yours. How many CPU cores/threads do you use per task/PID/nproc? Thanks!
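For anyone comparing setups, here is a minimal sketch for reporting how many logical CPU cores each training process would get if the machine's cores were split evenly. The helper name `cores_per_process` is hypothetical and not part of the Stark code base. It also assumes `os.cpu_count()` reflects the cores actually usable by the job, which may not hold inside containers or under a scheduler.

```python
import os
from typing import Optional


def cores_per_process(nproc_per_node: int, total_cores: Optional[int] = None) -> int:
    """Evenly split logical CPU cores across training processes.

    Hypothetical helper for comparing setups; not part of Stark.
    """
    if nproc_per_node < 1:
        raise ValueError("nproc_per_node must be at least 1")
    if total_cores is None:
        # os.cpu_count() returns the machine's logical core count (or None).
        total_cores = os.cpu_count() or 1
    # Never report fewer than one core per process.
    return max(1, total_cores // nproc_per_node)


if __name__ == "__main__":
    # 8 matches the --nproc_per_node value used in this thread.
    print("logical cores per process:", cores_per_process(8))
```

Posting this number along with the measured FPS makes it easier to tell whether data loading on the CPU side is the bottleneck.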

ANdong-star, Oct 08 '21 04:10

Did you solve the problem? I am running into the same issue.

I have 8 Tesla V100 GPUs (16 GB each). I tried to train Stark-ST101 on GOT-10K using the default setting, i.e. baseline_R101_got10k_only.

python tracking/train.py --script stark_st1 --config baseline_R101_got10k_only --save_dir . --mode multiple --nproc_per_node 8  
python tracking/train.py --script stark_st2 --config baseline_R101_got10k_only --save_dir . --mode multiple --nproc_per_node 8 --script_prv stark_st1 --config_prv baseline_R101_got10k_only

One epoch takes about 2 hours, which is quite slow. ST1 has 500 epochs, so the whole training process would take more than 30 days, which is unacceptable.
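As a rough check of that estimate, the arithmetic can be sketched as below. It assumes every epoch takes the same 2 hours and counts only the ST1 stage; the function name `training_days` is just for illustration.

```python
def training_days(epochs: int, hours_per_epoch: float) -> float:
    """Total wall-clock training time in days, assuming a constant epoch time."""
    return epochs * hours_per_epoch / 24.0


if __name__ == "__main__":
    # 500 epochs at about 2 hours each, the figures reported above.
    days = training_days(500, 2.0)
    print(f"{days:.1f} days")  # prints "41.7 days"
```

So at the reported speed ST1 alone comes to about 1000 hours, roughly 42 days.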

iminfine, Nov 19 '21 08:11