BinaryNet in TensorFlow?
Hi Itay,
I read the BinaryNet paper and it seems very promising, suggesting huge speed/power gains. I'm currently using the TensorFlow framework and I'm not familiar with Torch at all. I was wondering whether this great work could be ported to TensorFlow. Is it possible to implement it in TF? Have you perhaps done this already?
Many thanks! Yonathan
Hi Yonathan,
Sorry for the delay.
Sure, just follow the training algorithm as it appears in the paper. You need to create a weight binarization function, replace the classic activation function (e.g. ReLU) in your model with hard-tanh and sign, and don't forget to use batch normalization in your model. I am currently working on some extensions to this work, but if I have some free time this week I'll create a TF version of BNN and publish it. In the meantime you can try running our Theano code: https://github.com/MatthieuCourbariaux/BinaryNet
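Roughly, a binarized layer then looks something like the sketch below (a TF 1.x-style sketch of the idea, not our Theano code; using tf.stop_gradient for the straight-through gradient is just one possible way to do it):

import tensorflow as tf

def binarize(w):
    # Forward pass: sign(w) in {-1, +1}; backward pass: identity, so the
    # real-valued weights keep receiving gradients (straight-through estimator).
    return w + tf.stop_gradient(tf.sign(w) - w)

def binary_dense(x, w, training):
    # Binarize the weights, apply the linear layer, batch-normalize, then use
    # hard-tanh followed by sign as the binary activation.
    y = tf.matmul(x, binarize(w))
    y = tf.layers.batch_normalization(y, training=training)
    y = tf.clip_by_value(y, -1.0, 1.0)  # hard-tanh
    return binarize(y)                  # sign (binary output)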
Best, Itay
Hi Itay,
Thanks for your response!
I have already tried implementing it in TF:
I initialized all my variables with the following function:
import tensorflow as tf

def weight_variable(shape):
    # Real-valued weights initialized uniformly in [-1, 1], then binarized.
    initial = tf.random_uniform(shape, -1.0, 1.0)
    return fw(tf.Variable(initial))
where fw() is my binarization function:
G = tf.get_default_graph()

def fw(x):
    with G.gradient_override_map({"Floor": "Identity"}):
        # from range [-1, 1] -> to range [0, 1]
        x = tf.clip_by_value((x + 1.) / 2., 0, 1)
        # from range [0, 1] -> to hard [0, 1]
        x = tf.round(x)
        # from hard [0, 1] -> to hard [-1, 1]
        x = 2 * x - 1
    return x
and instead of using ReLU, I used my activation function fa():
def fa(x):
    with G.gradient_override_map({"Sign": "Identity"}):
        # from x -> to hard [-1, 0, 1]
        x = tf.sign(x)
        x = fw(x)
    return x
What do you think?... Did I miss anything in the weights/activation implementation? Would you do it differently?
Many Thanks! Yonathan
Hi, Jony101K, did you succeed in implementing this model in TF? I think you at least need to modify the gradient computation for the "tf.round" function. I'm very interested in implementing this model in TF, and I've encountered some questions about batch normalization in TF. Could you discuss this question with me, or contact [email protected]?
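For example, something along these lines should target the right op in graph-mode TF 1.x (just a sketch, not tested):

import tensorflow as tf

G = tf.get_default_graph()

def binary_round(x):
    # tf.round creates a "Round" op, so that is the op whose gradient has to be
    # overridden; mapping it to "Identity" gives a straight-through gradient.
    with G.gradient_override_map({"Round": "Identity"}):
        return tf.round(x)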
Hi @Jony101K, did you successfully implement BNNs in TF?
Regards, Alexandre
I have been trying to implement it myself. Is there any progress on this topic? @Jony101K or @itayhubara or @Alexivia
Hi @abhishek42 Not that I know of... I had to implement some "pseudo-binarisation" functions myself, based on https://arxiv.org/abs/1606.06160
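For what it's worth, the 1-bit weight scheme from that paper can be sketched roughly like this (a simplified sketch of the idea, not my exact functions):

import tensorflow as tf

def binarize_weights_dorefa(w):
    # DoReFa-style 1-bit weights: sign(w) scaled by the mean absolute value,
    # with a straight-through gradient so the latent real-valued weights still train.
    scale = tf.reduce_mean(tf.abs(w))
    binary = tf.sign(w) * scale
    return w + tf.stop_gradient(binary - w)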
Hi @Alexivia, the link is broken
Just copy and paste it into your address bar.
Got it. Also, on the same topic, I saw an issue you started that was closed because someone suggested asking it on Stack Overflow, but I think it is really helpful for BNNs. Can you share the question from Stack Overflow (if you went there for help), or did you use the method pointed out in that thread?
I ended up using something similar to what gaohuazuo answered, but without the Defun class, just with a normal Python function.
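Roughly like this, a plain function where the clipping carries the gradient and the sign is wrapped in tf.stop_gradient (a sketch of the idea, not my exact code):

import tensorflow as tf

def binarize(x):
    # Forward value: sign(x) in {-1, +1}.
    # Backward value: the gradient of hard-tanh, i.e. 1 inside [-1, 1] and 0
    # outside, which is the clipped straight-through estimator from the paper.
    clipped = tf.clip_by_value(x, -1.0, 1.0)
    return clipped + tf.stop_gradient(tf.sign(x) - clipped)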
Is it possible for you to publish your implementation? I am working on something similar and have been stuck on it for some time.
I published my implementation in TF, named https://github.com/itayhubara/BinaryNet.tf/. Note that this is an incomplete implementation, as I didn't use shift-based BN/AdaMax or the square hinge loss. I'll add them, as well as support for additional datasets, soon. Please let me know if you find any bugs.
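For reference, the square hinge loss from the paper looks roughly like this in TF (a sketch, assuming one-hot targets remapped to ±1):

import tensorflow as tf

def square_hinge_loss(targets_pm1, outputs):
    # targets_pm1: one-hot labels remapped to {-1, +1}; outputs: raw network outputs.
    margins = tf.maximum(0.0, 1.0 - targets_pm1 * outputs)
    return tf.reduce_mean(tf.square(margins))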
I've reimplemented it in TF2 (using tf.keras) for myself, but maybe somebody would also find a use for it. It also doesn't include shift-based BN/AdaMax, but is otherwise similar to the Theano version and supports all 3 datasets from the paper.
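For anyone curious, the core of a binarized tf.keras layer can be sketched like this (a simplified sketch of the idea, not the actual code mentioned above):

import tensorflow as tf

def binarize(w):
    # Forward: sign(w); backward: identity (straight-through estimator).
    return w + tf.stop_gradient(tf.sign(w) - w)

class BinaryDense(tf.keras.layers.Layer):
    def __init__(self, units, **kwargs):
        super().__init__(**kwargs)
        self.units = units

    def build(self, input_shape):
        # Keep real-valued latent weights; only the forward pass binarizes them.
        self.w = self.add_weight(shape=(int(input_shape[-1]), self.units),
                                 initializer="glorot_uniform", trainable=True)

    def call(self, inputs):
        return tf.matmul(inputs, binarize(self.w))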