
Different results in the disjoint 15-5s setting.

Open kona419 opened this issue 2 years ago • 4 comments

Hello, I am trying to reproduce the disjoint 15-5s setting, but my results are very different from yours.

My commands are:

For step 0:
/home/nayoung/nayoung/MiB/run.py --data_root '/home/nayoung/nayoung/' --batch_size 10 --dataset voc --name MIB --task 15-5s --step 0 --lr 0.01 --epochs 30 --method MiB

For steps 1~5 (here with --step 5):
/home/nayoung/nayoung/MiB/run.py --data_root '/home/nayoung/nayoung/' --batch_size 10 --dataset voc --name MIB --task 15-5s --step 5 --lr 0.001 --epochs 30 --method MiB

I used batch size 10 because of CUDA memory, and I didn't use the pretrained model. I also set loss_kd=100.
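
For reference, running the six training steps one after another with those flags looks roughly like the minimal sketch below (not the exact script used; how loss_kd=100 is passed to run.py is not shown, since that depends on the repository's own options):

```python
# Minimal sketch: launch step 0 and then the five incremental steps of the
# 15-5s task, reusing the flags from the commands above.
import subprocess

COMMON = [
    "python", "/home/nayoung/nayoung/MiB/run.py",
    "--data_root", "/home/nayoung/nayoung/",
    "--batch_size", "10",
    "--dataset", "voc",
    "--name", "MIB",
    "--task", "15-5s",
    "--epochs", "30",
    "--method", "MiB",
]

for step in range(6):
    lr = "0.01" if step == 0 else "0.001"  # lr 0.01 for step 0, 0.001 afterwards
    subprocess.run(COMMON + ["--step", str(step), "--lr", lr], check=True)
```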

Per-class IoU after each step (rows: classes, columns: incremental steps; the last column is the "class mIoU" row from my log):

| Class | step 0 | step 1 | step 2 | step 3 | step 4 | step 5 | class mIoU |
|---|---|---|---|---|---|---|---|
| background | 0.857241 | 0.822046 | 0.812532 | 0.523217 | 0.424163 | 0.303423 | |
| aeroplane | 0.596404 | 0.571647 | 0.53895 | 0.503853 | 0.464728 | 0.404756 | 0.51339 |
| bicycle | 0.249615 | 0.246822 | 0.238265 | 0.216371 | 0.215501 | 0.196714 | 0.227215 |
| bird | 0.489829 | 0.475479 | 0.418296 | 0.287688 | 0.285088 | 0.210973 | 0.361226 |
| boat | 0.336007 | 0.322084 | 0.236745 | 0.198159 | 0.162308 | 0.101944 | 0.226208 |
| bottle | 0.254114 | 0.202237 | 0.17652 | 0.151194 | 0.139302 | 0.115709 | 0.173179 |
| bus | 0.694971 | 0.607911 | 0.540683 | 0.494373 | 0.465628 | 0.366374 | 0.528324 |
| car | 0.631736 | 0.599167 | 0.536196 | 0.503627 | 0.475487 | 0.38747 | 0.522281 |
| cat | 0.539938 | 0.515914 | 0.477998 | 0.455402 | 0.407798 | 0.39362 | 0.465112 |
| chair | 0.124421 | 0.123029 | 0.089279 | 0.093359 | 0.062629 | 0.044943 | 0.08961 |
| cow | 0.380302 | 0.300344 | 0.28096 | 0.119011 | 0.131808 | 0.073729 | 0.214359 |
| diningtable | 0.230107 | 0.230606 | 0.100062 | 0.123516 | 0.035045 | 0.031481 | 0.125136 |
| dog | 0.470491 | 0.447299 | 0.383524 | 0.33748 | 0.331196 | 0.310951 | 0.380157 |
| horse | 0.438303 | 0.413728 | 0.36603 | 0.289346 | 0.272611 | 0.23618 | 0.336033 |
| motorbike | 0.5446194 | 0.5464546 | 0.5146568 | 0.5154741 | 0.4768657 | 0.4594278 | 0.5095831 |
| person | 0.615698 | 0.603189 | 0.589685 | 0.565243 | 0.551702 | 0.545644 | 0.578527 |
| pottedplant | | 0.06315 | 0.056601 | 0.049754 | 0.04413 | 0.04088 | 0.050903 |
| sheep | | | 0.065537 | 0.061803 | 0.06458 | 0.061531 | 0.063363 |
| sofa | | | | 0.035291 | 0.030589 | 0.026771 | 0.030884 |
| train | | | | | 0.110248 | 0.092094 | 0.101171 |
| tvmonitor | | | | | | 0.020551 | 0.020551 |

1-15 : 0.350022 16-20 : 0.053374 all : 0.27586
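
A note on how these summary numbers appear to be derived from the table: each "class mIoU" entry is the class IoU averaged over the steps in which that class is evaluated, and the 1-15 / 16-20 / all figures are plain means of those per-class values over the old classes, the new classes, and all 20 foreground classes. A small sketch (only two classes filled in for brevity):

```python
# Sketch of the summary computation as I read it: average each class's IoU
# over the steps where it is evaluated, then average those per-class values
# over the old (1-15) and new (16-20) class groups.
per_step_iou = {
    # class name -> IoU at each step in which the class is present
    "aeroplane": [0.596404, 0.571647, 0.53895, 0.503853, 0.464728, 0.404756],
    "pottedplant": [0.06315, 0.056601, 0.049754, 0.04413, 0.04088],
    # ... remaining classes omitted for brevity
}

class_miou = {c: sum(v) / len(v) for c, v in per_step_iou.items()}
print(class_miou["aeroplane"])    # ~0.51339, matching the "class mIoU" column
print(class_miou["pottedplant"])  # ~0.050903
# Averaging the per-class values over classes 1-15, 16-20, and all 20 classes
# reproduces the 0.350022 / 0.053374 / 0.27586 figures above.
```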

kona419 · Oct 04 '23 02:10

Hey! You probably get different results due to the different batch size... This setting is particularly challenging because of the non-i.i.d. data, so decreasing the batch size probably hurts performance.

fcdl94 · Oct 16 '23 13:10

> Hey! You probably get different results due to the different batch size... This setting is particularly challenging because of the non-i.i.d. data, so decreasing the batch size probably hurts performance.

Thank you for the reply. My batch size is limited by GPU memory, so could you recommend other hyperparameters (e.g. learning rate, weight decay) for a low batch size that would let me match the paper's results?

kona419 · Oct 31 '23 03:10

I actually never tried a lower batch size. The main issue is that using a low batch size in 15-1 increases the non-i.i.d.-ness of the data (you may try Batch Renormalization in place of BN, as in my https://arxiv.org/abs/2012.01415, but it may alter the results, possibly for the better).

As a rule of thumb, you may double the iterations and halve the learning rate, but I can't guarantee it will work.

fcdl94 · Oct 31 '23 08:10
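
For readers who want to try the Batch Renormalization suggestion, below is a minimal, self-contained sketch of a BatchRenorm2d layer in PyTorch. It is only an illustration of the idea (Ioffe, 2017), not code from the MiB repository or from the linked paper: r_max and d_max are kept fixed instead of being ramped up during training, and MiB's backbone may use a different normalization layer, so dropping this in would take some adaptation.

```python
import torch
import torch.nn as nn


class BatchRenorm2d(nn.Module):
    """Sketch of Batch Renormalization as a drop-in for nn.BatchNorm2d.

    Training normalizes with batch statistics corrected by the (detached)
    factors r and d, which tie the result to the running statistics;
    evaluation uses the running statistics directly, as in standard BN.
    """

    def __init__(self, num_features, eps=1e-5, momentum=0.1, r_max=3.0, d_max=5.0):
        super().__init__()
        self.eps, self.momentum = eps, momentum
        self.r_max, self.d_max = r_max, d_max
        self.weight = nn.Parameter(torch.ones(num_features))
        self.bias = nn.Parameter(torch.zeros(num_features))
        self.register_buffer("running_mean", torch.zeros(num_features))
        self.register_buffer("running_std", torch.ones(num_features))

    def forward(self, x):
        if self.training:
            # Per-channel statistics over the batch and spatial dimensions.
            mean = x.mean(dim=(0, 2, 3))
            std = x.std(dim=(0, 2, 3), unbiased=False) + self.eps
            # Correction factors are clipped and treated as constants.
            r = (std.detach() / self.running_std).clamp(1.0 / self.r_max, self.r_max)
            d = ((mean.detach() - self.running_mean) / self.running_std).clamp(-self.d_max, self.d_max)
            x_hat = (x - mean[None, :, None, None]) / std[None, :, None, None]
            x_hat = x_hat * r[None, :, None, None] + d[None, :, None, None]
            # Update running statistics with a simple exponential moving average.
            with torch.no_grad():
                self.running_mean += self.momentum * (mean - self.running_mean)
                self.running_std += self.momentum * (std - self.running_std)
        else:
            x_hat = (x - self.running_mean[None, :, None, None]) / self.running_std[None, :, None, None]
        return self.weight[None, :, None, None] * x_hat + self.bias[None, :, None, None]
```

As for the rule of thumb above, applied to the commands earlier in this thread it would mean something like 60 epochs at lr 0.0005 for steps 1~5 instead of 30 epochs at lr 0.001, with no guarantee that this recovers the paper's numbers.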

> I actually never tried a lower batch size. The main issue is that using a low batch size in 15-1 increases the non-i.i.d.-ness of the data (you may try Batch Renormalization in place of BN, as in my https://arxiv.org/abs/2012.01415, but it may alter the results, possibly for the better).
>
> As a rule of thumb, you may double the iterations and halve the learning rate, but I can't guarantee it will work.

Thank you for sharing! I will try your recommendations.

kona419 · Nov 01 '23 12:11