Training Compact CNNs for Image Classification using Dynamic-coded Filter Fusion
Running Code
Requirements
- PyTorch >= 1.0.1
- CUDA = 10.0.0
- thop = 0.0.31
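To compare an existing environment against the list above, a small check like the following can help. It is only a sketch: the distribution names `torch` and `thop` are assumptions about how the packages were installed, `importlib.metadata` needs Python 3.8 or newer, and the CUDA version is not checked.

```python
from importlib import metadata

# Versions listed in the requirements above; the distribution names
# ("torch", "thop") are assumptions about how the packages are installed.
REQUIRED = {"torch": "1.0.1", "thop": "0.0.31"}


def parse_version(text):
    """Turn '1.13.1+cu117' into (1, 13, 1); non-numeric parts are skipped."""
    core = text.split("+")[0]
    return tuple(int(p) for p in core.split(".") if p.isdigit())


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    for package, wanted in REQUIRED.items():
        found = installed_version(package)
        if found is None:
            print(f"{package}: not installed (listed version: {wanted})")
        else:
            ok = parse_version(found) >= parse_version(wanted)
            print(f"{package}: {found} (listed: {wanted}, at least listed: {ok})")
```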
Run Our Results
CIFAR-10
#* vgg16 step FLOPs_PR=76.8% Params_PR=92.8%
python main_cifar.py \
--data_set 'cifar10' \
--data_path 'DATASET' \
--train_batch_size 256 \
--test_batch_size 256 \
--num_epochs 300 \
--lr_decay_step 150 225 \
--lr 0.1 \
--weight_decay 5e-4 \
--lr_type 'step' \
--momentum 0.9 \
--arch vgg16 \
--cprate '[0.5]*2+[0.4]*2+[0.35]*3+[0.85]*6' \
--job_dir 'EXP' \
--gpus 0
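The `--cprate` argument is a Python-style list expression. The sketch below shows how such a string expands into a flat list of rates. The helper `expand_cprate` is hypothetical and not part of this repository, whose own parsing may differ. It accepts only list literals, `+` and `*`, so the string is never passed to `eval()`.

```python
import ast


def expand_cprate(expr):
    """Expand a cprate expression such as '[0.5]*2+[0.4]' into a flat list.

    Only list literals of numbers, '+' (concatenation) and '*' (repetition)
    are accepted.
    """
    def walk(node):
        if isinstance(node, ast.List):
            return [float(ast.literal_eval(elt)) for elt in node.elts]
        if isinstance(node, ast.BinOp) and isinstance(node.op, ast.Add):
            return walk(node.left) + walk(node.right)
        if isinstance(node, ast.BinOp) and isinstance(node.op, ast.Mult):
            left, right = node.left, node.right
            if isinstance(right, ast.Constant) and isinstance(right.value, int):
                return walk(left) * right.value
            if isinstance(left, ast.Constant) and isinstance(left.value, int):
                return walk(right) * left.value
        raise ValueError("unsupported cprate expression")

    return walk(ast.parse(expr.strip(), mode="eval").body)


vgg16 = expand_cprate("[0.5]*2+[0.4]*2+[0.35]*3+[0.85]*6")
print(len(vgg16))  # 13
print(vgg16[:4])   # [0.5, 0.5, 0.4, 0.4]
```

The VGG-16 string expands to 13 entries, which matches the 13 convolutional layers of VGG-16 if one rate is assigned per layer (an assumption about how the rates are consumed).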
#* googlenet step FLOPs_PR=70.1% Params_PR=66.3%
python main_cifar.py \
--data_set 'cifar10' \
--data_path 'DATASET' \
--train_batch_size 256 \
--test_batch_size 256 \
--num_epochs 300 \
--lr_decay_step 150 225 \
--lr 0.1 \
--weight_decay 5e-4 \
--lr_type 'step' \
--momentum 0.9 \
--arch googlenet \
--cprate '[0.0]*2+[0.8]+[0.0]+[0.8]*2+[0.0]*2+ ([0.0]+[0.9]+[0.0]+[0.9]*2+[0.0]*2)*3+ ([0.0]+[0.8]+[0.0]+[0.8]*2+[0.0]*2)*3+ ([0.0]+[0.9]+[0.0]+[0.9]*2+[0.0]*2)*2' \
--job_dir 'EXP' \
--gpus 0
#* resnet56 step FLOPs_PR=55.9% Params_PR=55.0%
python main_cifar.py \
--data_set 'cifar10' \
--data_path 'DATASET' \
--train_batch_size 256 \
--test_batch_size 256 \
--num_epochs 300 \
--lr_decay_step 150 225 \
--lr 0.1 \
--weight_decay 5e-4 \
--lr_type 'step' \
--momentum 0.9 \
--arch resnet56 \
--cprate '[0.7]*2+[0.5]*3+[0.3]*2+[0.4]+[0.8]+ [0.7]*2+[0.8]*4+[0.4]+[0.2]*2+[0.7]+[0.3]+[0.8]+[0.4]*2+[0.7]+[0.3]+[0.4]+[0.8]+ [0.0]*3' \
--job_dir 'EXP' \
--gpus 0
#* resnet110 step FLOPs_PR=66.6% Params_PR=67.9%
python main_cifar.py \
--data_set 'cifar10' \
--data_path 'DATASET' \
--train_batch_size 256 \
--test_batch_size 256 \
--num_epochs 300 \
--lr_decay_step 150 225 \
--lr 0.1 \
--weight_decay 5e-4 \
--lr_type 'step' \
--momentum 0.9 \
--arch resnet110 \
--cprate '[0.2]+[0.0]+[0.2]+[0.3]+[0.7]*2+[0.1]+[0.3]*2+[0.4]+[0.7]*2+[0.5]+[0.1]+[0.3]+[0.0]+[0.6]+[0.0]+[0.2]+[0.5]+[0.0]+[0.7]*2+[0.5]+[0.7]*2+[0.4]*2+[0.0]+[0.3]+[0.1]+[0.5]+[0.1]*3+[0.7]+ [0.1]*2+[0.3]*5+[0.5]+[0.7]+[0.2]+[0.4]+[0.7]*5+[0.5]+[0.1]+ [0.6]+[0.2]+[0.5]' \
--job_dir 'EXP' \
--gpus 0
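The CIFAR-10 commands above use `--lr_type 'step'` with `--lr 0.1` and `--lr_decay_step 150 225`. As a rough illustration of the schedule this implies, the sketch below assumes the learning rate is multiplied by 0.1 at each milestone. That decay factor is an assumption and is not stated in the commands.

```python
def step_lr(epoch, base_lr=0.1, milestones=(150, 225), gamma=0.1):
    """Learning rate for a given epoch under milestone-based step decay.

    gamma is an assumed decay factor; the commands above do not specify it.
    """
    passed = sum(1 for m in milestones if epoch >= m)
    return base_lr * gamma ** passed


for epoch in (0, 149, 150, 224, 225, 299):
    print(epoch, round(step_lr(epoch), 6))
```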
ImageNet
#* resnet50 step FLOPs_PR=76.7% Params_PR=71.0%
python main_imagenet.py \
--data_set 'imagenet' \
--data_path 'DATASET' \
--train_batch_size 256 \
--test_batch_size 256 \
--num_epochs 90 \
--lr_type 'step' \
--lr 0.1 \
--weight_decay 1e-4 \
--momentum 0.9 \
--arch resnet50 \
--cprate '[0.0]+[0.8]*10+[0.7]*6+[0.6]*10+[0.4]*6+[0.3]*4' \
--job_dir 'EXP' \
--gpus 0
#* resnet50 step FLOPs_PR=63.8% Params_PR=58.6%
python main_imagenet.py \
--data_set 'imagenet' \
--data_path 'DATASET' \
--train_batch_size 256 \
--test_batch_size 256 \
--num_epochs 90 \
--lr_type 'step' \
--lr 0.1 \
--weight_decay 1e-4 \
--momentum 0.9 \
--arch resnet50 \
--cprate '[0.0]+[0.6]*10+[0.5]*6+[0.5]*10+[0.4]*6+[0.2]*4' \
--job_dir 'EXP' \
--gpus 0
#* resnet50 step FLOPs_PR=45.3% Params_PR=40.7%
python main_imagenet.py \
--data_set 'imagenet' \
--data_path 'DATASET' \
--train_batch_size 256 \
--test_batch_size 256 \
--num_epochs 90 \
--lr_type 'step' \
--lr 0.1 \
--weight_decay 1e-4 \
--momentum 0.9 \
--arch resnet50 \
--cprate '[0.0]+[0.35]*10+[0.3]*6+[0.4]*10+[0.3]*6+[0.1]*4' \
--job_dir 'EXP' \
--gpus 0
#* resnet50 cos FLOPs_PR=63.0% Params_PR=56.8%
python main_imagenet.py \
--data_set 'imagenet' \
--data_path 'DATASET' \
--train_batch_size 256 \
--test_batch_size 256 \
--num_epochs 180 \
--lr_type 'cos' \
--lr 0.01 \
--weight_decay 1e-4 \
--momentum 0.99 \
--arch resnet50 \
--cprate '[0.0]+([0.5]*10+[0.5]*6)*2+[0.25]*3+[0.0]' \
--job_dir 'EXP' \
--gpus 0
#* resnet50 cos FLOPs_PR=66.4% Params_PR=53.8%
python main_imagenet.py \
--data_set 'imagenet' \
--data_path 'DATASET' \
--train_batch_size 256 \
--test_batch_size 256 \
--num_epochs 180 \
--lr_type 'cos' \
--lr 0.01 \
--weight_decay 1e-4 \
--momentum 0.99 \
--arch resnet50 \
--cprate '[0.0]+([0.7]*7+[0.45]*9)*2+[0.24]*3+[0.0]' \
--job_dir 'EXP' \
--gpus 0
#* resnet50 cos FLOPs_PR=75.1% Params_PR=74.3%
python main_imagenet.py \
--data_set 'imagenet' \
--data_path 'DATASET' \
--train_batch_size 256 \
--test_batch_size 256 \
--num_epochs 180 \
--lr_type 'cos' \
--lr 0.01 \
--weight_decay 1e-4 \
--momentum 0.99 \
--arch resnet50 \
--cprate '[0.0]+([0.43]*7+[0.73]*9)*2+[0.45]*3+[0.0]' \
--job_dir 'EXP' \
--gpus 0
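For the `--lr_type 'cos'` runs above (`--lr 0.01`, `--num_epochs 180`), the schedule can be pictured with the standard cosine-annealing formula. This is a sketch only: it assumes a single cosine cycle with no warm-up that decays to zero, which may differ from what `main_imagenet.py` does.

```python
import math


def cosine_lr(epoch, base_lr=0.01, num_epochs=180, min_lr=0.0):
    """Learning rate for a given epoch under one cosine cycle without restarts.

    min_lr and the absence of warm-up are assumptions for illustration.
    """
    progress = epoch / num_epochs
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


for epoch in (0, 45, 90, 135, 180):
    print(epoch, round(cosine_lr(epoch), 6))
```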
Test Our Results
CIFAR-10
# --data_path: your CIFAR-10 data path
# --arch: the corresponding network architecture (vgg16/resnet56/resnet110/googlenet)
# --cprate: can be found from the links in the following table
# --resume_compact_model: the pruned model path; the model can be downloaded from the links in the following table
python test_compact_model.py \
--data_set 'cifar10' \
--data_path 'DATASET' \
--test_batch_size 256 \
--arch [arch_name] \
--cprate [cprate] \
--resume_compact_model model_best_compact.pt \
--gpus 0
| Full Model | FLOPs (PR) | Parameters (PR) | lr_type | Accuracy | Model |
|---|---|---|---|---|---|
| VGG-16 | 72.77M (76.83%) | 1.06M (92.80%) | step | 93.47% | pruned |
| ResNet-56 | 55.84M (55.88%) | 0.38M (54.95%) | step | 93.26% | pruned |
| ResNet-110 | 85.30M (66.55%) | 0.56M (67.86%) | step | 93.80% | pruned |
| GoogLeNet | 457.22M (70.11%) | 2.08M (66.28%) | step | 94.92% | pruned |
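The PR columns can be read as the fraction of the full model's cost that was removed. Assuming the usual definition PR = 1 - compact / baseline (this README does not state it explicitly), the sketch below recovers the baseline FLOPs implied by each CIFAR-10 row. The helper names are illustrative only.

```python
def pruning_rate(baseline, compact):
    """Fraction of the baseline cost removed: PR = 1 - compact / baseline."""
    return 1.0 - compact / baseline


def implied_baseline(compact, pr_percent):
    """Baseline cost implied by a compact cost and its pruning rate in percent."""
    return compact / (1.0 - pr_percent / 100.0)


# (compact FLOPs in millions, FLOPs pruning rate in percent) from the table above
rows = {
    "VGG-16": (72.77, 76.83),
    "ResNet-56": (55.84, 55.88),
    "ResNet-110": (85.30, 66.55),
    "GoogLeNet": (457.22, 70.11),
}
for name, (flops_m, pr) in rows.items():
    print(f"{name}: implied baseline of about {implied_baseline(flops_m, pr):.1f}M FLOPs")
```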
ImageNet
# --data_path: your ImageNet data path
# --arch: the corresponding network architecture (resnet50)
# --cprate: can be found from the links in the following table
# --resume_compact_model: the pruned model path; the model can be downloaded from the links in the following table
python test_compact_model.py \
--data_set 'imagenet' \
--data_path 'DATASET' \
--test_batch_size 256 \
--arch [arch_name] \
--cprate [cprate] \
--resume_compact_model model_best_compact.pt \
--gpus 0
| Full Model | FLOPs (PR) | Parameters (PR) | lr_type | Top1-Accuracy | Top5-Accuracy | Model |
|---|---|---|---|---|---|---|
| ResNet-50 | 0.96B (76.70%) | 7.40M (71.03%) | step | 71.54% | 90.57% | pruned |
| ResNet-50 | 1.49B (63.75%) | 10.58M (58.60%) | step | 74.21% | 91.93% | pruned |
| ResNet-50 | 2.25B (45.30%) | 15.16M (40.67%) | step | 75.18% | 92.56% | pruned |
| ResNet-50 | 1.52B (62.96%) | 11.05M (56.77%) | cos | 75.60% | 92.55% | pruned |
| ResNet-50 | 1.38B (66.41%) | 11.81M (53.77%) | cos | 74.85% | 92.41% | pruned |
| ResNet-50 | 1.02B (75.11%) | 6.56M (74.33%) | cos | 73.81% | 91.59% | pruned |
| MobileNetV2 | 140M (53.3%) | 2.62M (25.1%) | cos | 68.60% | 88.13% | pruned |
| Model | Top1-Accuracy | FLOPs | Parameters | Link |
|---|---|---|---|---|
| DCFF $_{manually}$ | 71.54% | 960M | 7.40M | pruned |
| DCFF $_{ABCPruner}$ | 72.19% | 945M | 7.35M | pruned |
| DCFF $_{manually}$ | 74.21% | 1490M | 10.58M | pruned |
| DCFF $_{ABCPruner}$ | 73.78% | 1295M | 9.10M | pruned |
| DCFF $_{manually}$ | 74.21% | 1490M | 10.58M | pruned |
| DCFF $_{ABCPruner}$ | 74.73% | 1794M | 11.24M | pruned |
| DCFF $_{manually}$ | 75.18% | 2250M | 15.16M | pruned |
| DCFF $_{ABCPruner}$ | 74.83% | 1891M | 11.75M | pruned |
| DCFF $_{manually}$ | 75.18% | 2250M | 15.16M | pruned |
| DCFF $_{ABCPruner}$ | 75.79% | 2556M | 18.02M | pruned |
| DCFF $_{manually}$ | 71.54% | 960M | 7.40M | pruned |
| DCFF $_{EagleEye}$ | 73.18% | 1040M | 6.99M | pruned |
| DCFF $_{manually}$ | 75.18% | 2250M | 15.16M | pruned |
| DCFF $_{EagleEye}$ | 75.60% | 2070M | 14.41M | pruned |
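The ImageNet results above are reported as Top-1 and Top-5 accuracy. For readers unfamiliar with the metric, the following is a minimal pure-Python sketch of top-k accuracy on made-up scores. It is not the evaluation code of `test_compact_model.py`, and the function name is illustrative.

```python
def topk_accuracy(scores, labels, k=1):
    """Percentage of samples whose true label is among the k highest scores."""
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices sorted from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return 100.0 * hits / len(labels)


# Toy scores for four samples over three classes (illustrative data only).
scores = [
    [0.1, 0.7, 0.2],    # highest score: class 1
    [0.5, 0.3, 0.2],    # highest score: class 0
    [0.2, 0.3, 0.5],    # highest score: class 2
    [0.4, 0.35, 0.25],  # highest score: class 0
]
labels = [1, 1, 2, 2]
print(topk_accuracy(scores, labels, k=1))  # 50.0
print(topk_accuracy(scores, labels, k=2))  # 75.0
```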