BinaryNet in TensorFlow?
Hi Itay,
I read the BinaryNet paper and it seems very promising, suggesting huge speed/power gains. I'm currently using the TensorFlow framework and I'm not familiar with Torch at all. I was wondering whether this great work could be ported to TensorFlow. Is it possible to implement it in TF? Have you perhaps done this already?
Many thanks! Yonathan
Hi Yonathan,
Sorry for the delay.
Sure, just follow the training algorithm as it appears in the paper. You need to create a weight binarization function, replace the classic activation function (e.g. ReLU) in your model with hard-tanh and sign, and don't forget to use batch normalization in your model. I am currently working on some extensions to this work, but if I have some free time this week I'll create a TF version of BNN and publish it. In the meantime you can try running our Theano code: https://github.com/MatthieuCourbariaux/BinaryNet
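Roughly, a binarized layer then looks something like the sketch below (a TF 1.x-style sketch of the idea, not our Theano code; using tf.stop_gradient for the straight-through gradient is just one possible way to do it):

import tensorflow as tf

def binarize(w):
    # Forward pass: sign(w) in {-1, +1}; backward pass: identity, so the
    # real-valued weights keep receiving gradients (straight-through estimator).
    return w + tf.stop_gradient(tf.sign(w) - w)

def binary_dense(x, w, training):
    # Binarize the weights, apply the linear layer, batch-normalize, then use
    # hard-tanh followed by sign as the binary activation.
    y = tf.matmul(x, binarize(w))
    y = tf.layers.batch_normalization(y, training=training)
    y = tf.clip_by_value(y, -1.0, 1.0)  # hard-tanh
    return binarize(y)                  # sign (binary output)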
Best, Itay
Hi Itay,
Thanks for your response!
I have already tried implementing it in TF:
I initialized all my variables with the following function:
import tensorflow as tf

def weight_variable(shape):
    # Real-valued weights initialized uniformly in [-1, 1], then binarized.
    initial = tf.random_uniform(shape, -1.0, 1.0)
    return fw(tf.Variable(initial))
where fw() is my binarization function:
G = tf.get_default_graph()

def fw(x):
    with G.gradient_override_map({"Floor": "Identity"}):
        # from range [-1, 1] -> to range [0, 1]
        x = tf.clip_by_value((x + 1.) / 2., 0, 1)
        # from range [0, 1] -> to hard [0, 1]
        x = tf.round(x)
        # from hard [0, 1] -> to hard [-1, 1]
        x = 2 * x - 1
    return x
and instead of using ReLU, I used my activation function fa():
def fa(x):
    with G.gradient_override_map({"Sign": "Identity"}):
        # from x -> to hard [-1, 0, 1]
        x = tf.sign(x)
        x = fw(x)
    return x
What do you think?... Did I miss anything in the weights/activation implementation? Would you do it differently?
Many Thanks! Yonathan
Hi, Jony101K, did you succeed in implementing this model in TF? I think you at least need to modify the gradient computation for the "tf.round" function. I'm very interested in implementing this model in TF, and I've encountered some questions about batch normalization in TF. Could you discuss this question with me, or contact [email protected]?
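For example, something along these lines should target the right op in graph-mode TF 1.x (just a sketch, not tested):

import tensorflow as tf

G = tf.get_default_graph()

def binary_round(x):
    # tf.round creates a "Round" op, so that is the op whose gradient has to be
    # overridden; mapping it to "Identity" gives a straight-through gradient.
    with G.gradient_override_map({"Round": "Identity"}):
        return tf.round(x)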
Hi @Jony101K, did you successfully implement BNNs in TF?
Regards, Alexandre
I have been trying to implement it myself. Is there any progress on this topic? @Jony101K or @itayhubara or @Alexivia
Hi @abhishek42 Not that I know of... I had to implement some "pseudo-binarisation" functions myself, based on https://arxiv.org/abs/1606.06160
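For what it's worth, the 1-bit weight scheme from that paper can be sketched roughly like this (a simplified sketch of the idea, not my exact functions):

import tensorflow as tf

def binarize_weights_dorefa(w):
    # DoReFa-style 1-bit weights: sign(w) scaled by the mean absolute value,
    # with a straight-through gradient so the latent real-valued weights still train.
    scale = tf.reduce_mean(tf.abs(w))
    binary = tf.sign(w) * scale
    return w + tf.stop_gradient(binary - w)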
Hi @Alexivia, the link is broken
Just copy and paste it into your address bar.
Got it. Also, on the same topic, I saw an issue you started that was closed because someone suggested asking it on Stack Overflow, but I think it is really helpful for BNNs. Can you share the question from Stack Overflow (if you went there for help), or did you use the method pointed out in that thread?
I ended up using something similar to what gaohuazuo answered, but without the Defun class, just with a normal Python function.
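Roughly like this, a plain function where the clipping carries the gradient and the sign is wrapped in tf.stop_gradient (a sketch of the idea, not my exact code):

import tensorflow as tf

def binarize(x):
    # Forward value: sign(x) in {-1, +1}.
    # Backward value: the gradient of hard-tanh, i.e. 1 inside [-1, 1] and 0
    # outside, which is the clipped straight-through estimator from the paper.
    clipped = tf.clip_by_value(x, -1.0, 1.0)
    return clipped + tf.stop_gradient(tf.sign(x) - clipped)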
Is it possible for you to publish your implementation? I am working on something similar and have been stuck on it for some time.
I published my implementation in TF, named https://github.com/itayhubara/BinaryNet.tf/. Note that this is an incomplete implementation, as I didn't use shift-based BN/AdaMax or the square hinge loss. I'll add them, as well as support for additional datasets, soon. Please let me know if you find any bugs.
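For reference, the square hinge loss from the paper looks roughly like this in TF (a sketch, assuming one-hot targets remapped to ±1):

import tensorflow as tf

def square_hinge_loss(targets_pm1, outputs):
    # targets_pm1: one-hot labels remapped to {-1, +1}; outputs: raw network outputs.
    margins = tf.maximum(0.0, 1.0 - targets_pm1 * outputs)
    return tf.reduce_mean(tf.square(margins))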
I've reimplemented it in TF2 (using tf.keras) for myself, but maybe somebody would also find a use for it. It also doesn't include shift-based BN/AdaMax, but is otherwise similar to the Theano version and supports all 3 datasets from the paper.
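For anyone curious, the core of a binarized tf.keras layer can be sketched like this (a simplified sketch of the idea, not the actual code mentioned above):

import tensorflow as tf

def binarize(w):
    # Forward: sign(w); backward: identity (straight-through estimator).
    return w + tf.stop_gradient(tf.sign(w) - w)

class BinaryDense(tf.keras.layers.Layer):
    def __init__(self, units, **kwargs):
        super().__init__(**kwargs)
        self.units = units

    def build(self, input_shape):
        # Keep real-valued latent weights; only the forward pass binarizes them.
        self.w = self.add_weight(shape=(int(input_shape[-1]), self.units),
                                 initializer="glorot_uniform", trainable=True)

    def call(self, inputs):
        return tf.matmul(inputs, binarize(self.w))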