deJQK

24 comments by deJQK

I encountered the same problem when using a custom module with a parameter, say `self.paramA`, whose forward function includes `input = torch.where(cond, self.paramA, input)`, and I definitely included `model.train()`...
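For concreteness, a minimal sketch of the kind of module I mean (the `MaskedShift` name and the condition are only placeholders, not the actual model):

```python
import torch
import torch.nn as nn

class MaskedShift(nn.Module):
    """Toy module: replaces masked entries of the input with a learnable parameter."""
    def __init__(self, num_features):
        super().__init__()
        # learnable parameter that should receive gradients during training
        self.paramA = nn.Parameter(torch.zeros(num_features))

    def forward(self, input):
        cond = input < 0  # arbitrary condition, just for illustration
        # broadcast self.paramA over the batch wherever cond is True
        input = torch.where(cond, self.paramA, input)
        return input

model = MaskedShift(8)
model.train()  # training mode is set, yet the problem still appears
out = model(torch.randn(4, 8))
out.sum().backward()
print(model.paramA.grad)
```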

Hi @youjinChung, thanks for your interest in our work. You could try checking [this function](https://github.com/snap-research/CAT/blob/1dbd048cc91e3cc2c59d4e4f0434e79ac260e7ed/distillers/base_inception_distiller.py#L342-L353). I am not sure how you specified the `restore_student_G_path`, which is the student...
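As a rough illustration only (the actual restore logic lives in the linked function; the path and the placeholder module below are hypothetical), restoring the student generator essentially comes down to something like:

```python
import torch
import torch.nn as nn

# netG_student stands in for the actual student generator; a trivial
# placeholder is used here only so the snippet runs end to end
netG_student = nn.Sequential(nn.Conv2d(3, 3, 3, padding=1))

# hypothetical path: it should point to the checkpoint saved for the student
# generator alone, not the teacher weights or the full distiller state
restore_student_G_path = 'checkpoints/student_G.pth'

state_dict = torch.load(restore_student_G_path, map_location='cpu')
missing, unexpected = netG_student.load_state_dict(state_dict, strict=False)
print('missing keys:', missing)
print('unexpected keys:', unexpected)
```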

Thanks a lot. Sorry, I am not very familiar with git or md, so I might not be able to help with this.

Thanks for your interest in our work. Could you please try using [3, 4, 5] to see if the issue persists? Also, what is the performance of...

Hi @Ahmad-Jarrar, sorry about this: the quantization scheme proposed in the paper does not converge for low bits, and some modification is necessary. I remember I posted this... For...

Hi @Ahmad-Jarrar, I have updated the [readme](https://github.com/deJQK/AdaBits#centered-weight-quantization-for-low-precision). Hope it is clear. Thanks again for your interest in our work.

> If I'm not wrong, the code given does not apply the outermost 2x-1. https://github.com/deJQK/AdaBits/blob/master/models/quant_ops.py#L142-L143

Hi @haiduo, you could check these papers: https://arxiv.org/pdf/1502.01852.pdf, https://arxiv.org/pdf/1606.05340.pdf, https://arxiv.org/pdf/1611.01232.pdf, all of which analyze training dynamics for centered weights. I am not sure how to analyze weights with nonzero...

Hi @haiduo, thanks again for your interest. For b=4, it maps [-1, 1] to [0, 1], then to {0, 1, ..., 15}, then to {0.5, 1.5, ..., 15.5}, then to {1/32, 3/32, ...,...
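A small numerical sketch of this chain for b=4, with the outermost 2x-1 from the question above included (this is my own reading of the mapping, not the exact code in `quant_ops.py`):

```python
import torch

def centered_quantize(w, bits=4):
    """Sketch of the mapping described above.

    [-1, 1] -> [0, 1] -> {0, ..., 2^b - 1} -> {0.5, ..., 2^b - 0.5}
    -> {1/2^(b+1), 3/2^(b+1), ...} -> symmetric levels via the outermost 2x - 1.
    """
    n = 2 ** bits
    x = (w.clamp(-1, 1) + 1) / 2             # [-1, 1] -> [0, 1]
    x = torch.floor(x * n).clamp(max=n - 1)  # -> {0, 1, ..., 15} for b=4
    x = (x + 0.5) / n                        # -> {1/32, 3/32, ..., 31/32}
    return 2 * x - 1                         # -> {-15/16, -13/16, ..., 15/16}

print(centered_quantize(torch.tensor([-1.0, -0.3, 0.0, 0.3, 1.0])))
```

Note that none of the resulting levels is exactly zero, which is the point of the centered scheme.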