I have a question about the parameter count.
After pruning, I counted the parameters of the ResNet-34 network and got about 4.2e7, but I know a ResNet-34 has roughly 2e7 parameters. I think the difference (roughly double) comes from the mask layers, right? If so, how do I remove the mask layers after pruning? Thank you.
Sorry for the slow response.
The easiest way to do this would be to create a normal resnet34 (without masks), and then copy across the weights.
You'll want to do something like:
```python
net = NormalResNet34()
masked_net = MaskedResNet34()

for normal_layer, masked_layer in zip(
    [net.layer1, net.layer2, net.layer3, net.layer4],
    [masked_net.layer1, masked_net.layer2, masked_net.layer3, masked_net.layer4],
):
    # each layerN is a Sequential of residual blocks, so walk the blocks in step
    for normal, masked in zip(normal_layer, masked_layer):
        normal.conv1.weight = masked.conv1.weight
        normal.bn1.weight = masked.bn1.weight
        normal.conv2.weight = masked.conv2.weight
        # etc.
```
Note that you'll probably have to copy some biases as well (e.g. `normal.bn1.bias = masked.bn1.bias`), along with the batch-norm running statistics (`running_mean` / `running_var`).
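A more general alternative is to copy matching `state_dict` entries instead of listing every layer by hand. This is just a sketch, assuming the two models share parameter names apart from the mask tensors, and assuming the mask tensors have `"mask"` in their names (adjust the filter to whatever naming your masked model actually uses):

```python
import torch
import torch.nn as nn

def copy_unmasked_weights(masked_net: nn.Module, net: nn.Module) -> None:
    """Copy every parameter/buffer present in both models, skipping masks."""
    target_state = net.state_dict()
    with torch.no_grad():  # state_dict entries may be leaf params that require grad
        for name, tensor in masked_net.state_dict().items():
            if "mask" in name:  # assumed naming convention for mask tensors
                continue
            if name in target_state and target_state[name].shape == tensor.shape:
                target_state[name].copy_(tensor)
    net.load_state_dict(target_state)
```

Since `state_dict` covers biases and batch-norm running statistics too, this avoids having to enumerate them manually.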
It would be cool to have this as a feature in the code. If you figure it out, please feel free to open a pull request 😊
If not I'll get to implementing it soon.
Good idea, thank you. @jack-willturner
Using masks during pruning training is a very common approach in pruning algorithms. I hadn't previously tried to extract the effective parameters after mask training. Your description gives me a concrete implementation path (although this scheme requires reimplementing the model definition so that the number of parameters in each layer can be customized).
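For what it's worth, the doubling in the count above is easy to check directly. A quick sketch (assuming the mask tensors are registered as parameters whose names contain `"mask"`) that counts parameters with and without masks:

```python
import torch.nn as nn

def count_params(model: nn.Module, skip_masks: bool = True) -> int:
    # Sum element counts over named parameters, optionally skipping mask tensors.
    return sum(
        p.numel()
        for name, p in model.named_parameters()
        if not (skip_masks and "mask" in name)
    )
```

On a masked model, comparing `count_params(model, skip_masks=False)` against `count_params(model)` confirms whether the extra ~2e7 parameters really are the masks.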