Tongcheng Li

14 comments by Tongcheng Li

Hello @shamangary, regarding the memory cost of feature maps, we currently have a Caffe implementation which tries to address the memory-hungry problem (listed under much more spatially efficient...
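
A minimal NumPy sketch of the general idea behind a memory-efficient dense block (illustrative only, not the actual Caffe code): instead of materializing a fresh concatenated copy of all previous feature maps at every layer, pre-allocate one shared buffer for the whole block and write each layer's output into its own slice. The `layers` callables and shapes below are assumptions for the example.

```python
import numpy as np

def naive_block(x, layers, growth_rate):
    """Each step re-concatenates all previous outputs into a fresh buffer."""
    features = x
    for layer in layers:
        out = layer(features)                                # (N, growth_rate, H, W)
        features = np.concatenate([features, out], axis=1)   # new copy every time
    return features

def shared_buffer_block(x, layers, growth_rate):
    """Write each layer's output into a slice of one pre-allocated buffer."""
    n, c0, h, w = x.shape
    total_c = c0 + growth_rate * len(layers)
    buf = np.empty((n, total_c, h, w), dtype=x.dtype)
    buf[:, :c0] = x
    offset = c0
    for layer in layers:
        out = layer(buf[:, :offset])                 # read the slice filled so far
        buf[:, offset:offset + growth_rate] = out    # fill the next slice in place
        offset += growth_rate
    return buf
```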

@liuzhuang13 @shicai I think there might be differences in the EMA procedure of BN between Torch and Caffe. The default cudnn-torch BN has momentum parameter = 0.1 for EMA of...
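
A minimal sketch of the running-statistics update being discussed, assuming the torch/cuDNN convention in which momentum = 0.1 is the weight given to the current batch statistics; Caffe parameterizes this decay differently, which is one way the two frameworks can drift apart at test time.

```python
import numpy as np

def update_running_stats(running_mean, running_var, batch, momentum=0.1):
    """Torch-style EMA: new = (1 - momentum) * old + momentum * batch_statistic."""
    batch_mean = batch.mean(axis=0)
    batch_var = batch.var(axis=0)
    running_mean = (1.0 - momentum) * running_mean + momentum * batch_mean
    running_var = (1.0 - momentum) * running_var + momentum * batch_var
    return running_mean, running_var

# usage (activations_batch: array of shape (N, 64)):
# running_mean, running_var = update_running_stats(
#     np.zeros(64), np.ones(64), activations_batch)
```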

@jiangxuehan Hi! Actually, because in my implementation of the model I can specify an entire DenseBlock (tens of transitions) as one layer, the entire DenseBlock was manually created by...

@jiangxuehan Thanks for pointing that out! I currently have the same result, which is about 0.8% lower than the Torch counterpart. This is actually a known issue: https://github.com/liuzhuang13/DenseNet/issues/10 . In my Caffe,...

@jiangxuehan It turns out Caffe's DataLayer was feeding data without permutation; I have now added a flag to permute the data, which brings the accuracy to 95.2%.

@jiangxuehan Currently I have no definitive conclusion about the remaining 0.3% divergence, but there are several hypotheses: (1) Source of randomness: besides the different random seeds, one additional source of randomness...

@jiangxuehan Also, I think my DataLayer with the random option should be superior to the default ImageDataLayer implementation, because ImageDataLayer does its shuffling on a vector of Datum, which are quite...
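
A hypothetical sketch of the design choice described here: re-permute a lightweight index array each epoch rather than shuffling the heavy image records themselves. The names (`records`, `iterate_epoch`) are illustrative, not part of the actual DataLayer code.

```python
import numpy as np

def iterate_epoch(records, batch_size, rng):
    """Yield batches in a fresh random order; only indices are shuffled."""
    order = rng.permutation(len(records))          # cheap: permutes integers, not records
    for start in range(0, len(records), batch_size):
        batch_idx = order[start:start + batch_size]
        yield [records[i] for i in batch_idx]

# usage:
# rng = np.random.default_rng(0)
# for batch in iterate_epoch(dataset, 64, rng):
#     train_step(batch)
```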

Hi @John1231983, the Torch version uses the cuDNN version of BatchNormalization, which already includes the scale layer in the function, so in my modified Caffe BatchNorm there is...
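
An illustrative sketch of the difference being described (not the actual layer code): plain Caffe BatchNorm only normalizes, and the learned scale/shift lives in a separate Scale layer, whereas the cuDNN/torch-style BN applies the affine scale (gamma) and shift (beta) inside the same operation.

```python
import numpy as np

def batchnorm_normalize_only(x, mean, var, eps=1e-5):
    """Caffe-style BatchNorm: normalization only; Scale is a separate layer."""
    return (x - mean) / np.sqrt(var + eps)

def batchnorm_with_scale(x, mean, var, gamma, beta, eps=1e-5):
    """cuDNN/torch-style BN: normalization plus the affine scale/shift."""
    return gamma * (x - mean) / np.sqrt(var + eps) + beta
```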

Hello @WenzhMicrosoft, I am not sure I understand the question, but my understanding is that during layer initialization, it only reads the configuration from NeuralNetwork's proto...

Hello @GuohongWu, good question: for DenseNet-C, it is coded as a ConvolutionLayer in the .prototxt whose numOutput is smaller. For DenseNet-B, we implicitly assume that the bottleneck channel count = 4*growthRate,...
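
A hedged PyTorch-style sketch of the convention described above, assuming the standard DenseNet-B composite layer: the 1x1 bottleneck convolution outputs 4 * growth_rate channels before the 3x3 convolution produces growth_rate channels. This is only an illustration of the channel arithmetic, not the Caffe prototxt itself.

```python
import torch.nn as nn

def bottleneck_layer(in_channels, growth_rate):
    """DenseNet-B composite layer with the implicit 4 * growth_rate bottleneck width."""
    inter_channels = 4 * growth_rate
    return nn.Sequential(
        nn.BatchNorm2d(in_channels),
        nn.ReLU(inplace=True),
        nn.Conv2d(in_channels, inter_channels, kernel_size=1, bias=False),
        nn.BatchNorm2d(inter_channels),
        nn.ReLU(inplace=True),
        nn.Conv2d(inter_channels, growth_rate, kernel_size=3, padding=1, bias=False),
    )
```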