deJQK
I encountered the same problem when using a custom module with a parameter, say `self.paramA`, whose forward function includes `input = torch.where(cond, self.paramA, input)`, and I definitely included `model.train()`...
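To make the setup concrete, here is a minimal sketch of what I mean; `MyModule`, `cond`, and the shapes are hypothetical placeholders, and only `self.paramA`, `torch.where`, and `model.train()` come from the description above:

```python
import torch
import torch.nn as nn

class MyModule(nn.Module):
    def __init__(self, dim):
        super().__init__()
        # learnable parameter used as the replacement value inside torch.where
        self.paramA = nn.Parameter(torch.zeros(dim))

    def forward(self, input):
        cond = input < 0  # hypothetical condition
        # where cond is True, take values from self.paramA (broadcast);
        # elsewhere keep the original input values
        input = torch.where(cond, self.paramA, input)
        return input

model = MyModule(8)
model.train()  # train mode is enabled, as noted above
out = model(torch.randn(4, 8))
```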
Hi @youjinChung, thanks for your interest in our work. You could try checking [this function](https://github.com/snap-research/CAT/blob/1dbd048cc91e3cc2c59d4e4f0434e79ac260e7ed/distillers/base_inception_distiller.py#L342-L353). I am not sure how you specified the `restore_student_G_path`, which is the student...
Thanks a lot. Sorry, I am not very familiar with git or md, so I might not be able to help. Sorry for this.
Thanks for your interest in our work. Could you please try using [3, 4, 5] to see whether this issue still occurs? Also, what is the performance of...
Hi @Ahmad-Jarrar, sorry about this; the quantization scheme proposed in the paper does not converge for low bit-widths, and some modification is necessary. I remember I posted this... For...
Hi @Ahmad-Jarrar, I have updated the [readme](https://github.com/deJQK/AdaBits#centered-weight-quantization-for-low-precision). I hope it is clear. Thanks again for your interest in our work.
> If I'm not wrong, the code given does not apply the outermost 2x-1. https://github.com/deJQK/AdaBits/blob/master/models/quant_ops.py#L142-L143
Hi @haiduo, you could check these papers: https://arxiv.org/pdf/1502.01852.pdf, https://arxiv.org/pdf/1606.05340.pdf, https://arxiv.org/pdf/1611.01232.pdf, all of which analyze training dynamics for centered weights. I am not sure how to analyze weights with nonzero...
Hi @haiduo, thanks again for your interest. For b=4, it maps [-1, 1] to [0, 1], then to {0, 1, ..., 15}, then to {0.5, 1.5, ..., 15.5}, then to {1/32, 3/32, ...,...
@haiduo, yes for both.
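For reference, a minimal sketch of that mapping chain for b=4, written independently of the repo's `quant_ops.py` (function and variable names here are my own, and how the final `2x - 1` and the straight-through estimator are handled in the released code should still be checked against the linked lines):

```python
import torch

def centered_quantize(w, bits=4):
    """Sketch of the centered quantization chain described above:
    [-1, 1] -> [0, 1] -> {0, ..., 2^b - 1} -> {0.5, ..., 2^b - 0.5}
    -> {1/2^(b+1), 3/2^(b+1), ...} -> back to [-1, 1] via 2x - 1."""
    n = 2 ** bits
    x = (w.clamp(-1, 1) + 1) / 2               # [-1, 1] -> [0, 1]
    x = torch.floor(x * n).clamp(max=n - 1)    # -> {0, 1, ..., n - 1}
    x = x + 0.5                                # -> {0.5, 1.5, ..., n - 0.5}
    x = x / n                                  # -> {1/(2n), 3/(2n), ..., (2n-1)/(2n)}
    return 2 * x - 1                           # outermost 2x - 1, back to [-1, 1]

w = torch.linspace(-1, 1, steps=9)
print(centered_quantize(w, bits=4))  # levels are odd multiples of 1/16, i.e. -15/16, ..., 15/16
```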