
cifar100

Open Onsterm327 opened this issue 1 year ago • 3 comments

Can you provide the optimal parameter settings for the CIFAR-100 dataset? I used the following parameters but did not achieve the best results:

python train_te.py --dataset cifar100 --reg-weight 30

Onsterm327 avatar Dec 19 '24 08:12 Onsterm327

Sorry, I forgot the parameter for reg-weight. After a quick glance at the code, I think it should be 300 instead of 30, since there are 100 classes in CIFAR-100. Please try it and check whether the problem is solved.

dongyp13 avatar Dec 19 '24 11:12 dongyp13


Thank you very much for your reply. We conducted another experiment with the following settings:

python train_te.py --dataset cifar100

All namespaces are displayed as follows:

Namespace(batch_size=128, beta=6.0, data_augmentation=True, dataset='cifar100', depth=34, end_es=150, epochs=200, epsilon=0.03137254901960784, log_interval=100, loss_type='cross_entropy', lr=0.1, lr_milestones=[100, 150], lr_policy='step', model='ResNet18', model_dir='./model-cifar100-resnet', momentum=0.9, no_cuda=False, norm='linf', num_steps=10, reg_weight=300, save_freq=50, seed=1, start_es=90, step_size=0.00784313725490196, te_alpha=0.9, test_batch_size=200, weight_decay=0.0005, widen_factor=10)

The final experimental results on CIFAR-100 are:

================================================================
Train Epoch: 200	reg_weight: 300.0000
Train Epoch: 200 [0/50000 (0%)]	Loss: 0.857647
Train Epoch: 200 [12800/50000 (26%)]	Loss: 0.840502
Train Epoch: 200 [25600/50000 (51%)]	Loss: 1.001966
Train Epoch: 200 [38400/50000 (77%)]	Loss: 1.236528
================================================================
Test: Average loss: 1.7340, Accuracy: 5779/10000 (58%)
Test PGD: Average loss: 4.6706, Accuracy: 2245/10000 (22%)
================================================================

Although our final test results still show a gap compared to those reported in the original paper, our results on the CIFAR-10 dataset are consistent with the paper. We therefore suspect that these default settings are not optimal for CIFAR-100. We look forward to your reply.

Onsterm327 avatar Dec 19 '24 14:12 Onsterm327

Thank you again for your reply. I have found the final solution.

The issue arose because I mistakenly assumed that reg-weight in the code corresponds directly to the w parameter in the paper (whose value is 30). However, the code takes the mean of the regularization term over all elements, including the class dimension, rather than only over the batch, so reg-weight should equal w * num_classes. The correct setting is therefore 30 * 10 = 300 for CIFAR-10, and 30 * 100 = 3000 for CIFAR-100.
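
For anyone hitting the same confusion, here is a minimal sketch of the scaling argument. It is my own illustration, not the repository's actual code, and the squared-difference form of the temporal-ensembling regularizer is an assumption on my part:

import torch

# Hypothetical shapes matching the CIFAR-100 run above.
batch_size, num_classes = 128, 100
probs = torch.randn(batch_size, num_classes).softmax(dim=1)
ensemble_targets = torch.randn(batch_size, num_classes).softmax(dim=1)

sq_diff = (probs - ensemble_targets) ** 2

# Paper-style weighting: sum over classes, average over the batch,
# scaled by w = 30.
reg_paper = 30 * sq_diff.sum(dim=1).mean()

# Code-style weighting: mean over *all* elements (batch and classes),
# which is smaller by a factor of num_classes, so the weight must
# grow by the same factor.
reg_code = (30 * num_classes) * sq_diff.mean()

assert torch.allclose(reg_paper, reg_code)

With that in mind, the corrected CIFAR-100 run would be launched as:

python train_te.py --dataset cifar100 --reg-weight 3000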

Wishing you success in your work!

Onsterm327 avatar Dec 20 '24 07:12 Onsterm327