
Question about first_layer_sine_init

Open lingtengqiu opened this issue 4 years ago • 1 comments

I am confused about first_layer_sine_init, where you set W~uniform(-1/n,1/n).

As we know, the input is x ~ uniform(-1,1), so var[x] = 2^2/12 = 1/3. After the FC layer, var[sin(30Wx+b)] = 30^2 * n * (1/3) * (c^2/3) = 1? So why do you initialize the first-layer weights with uniform(-1/n,1/n)?

lingtengqiu, Apr 27 '21 07:04

The complete logic is:

  1. If the input is only x (i.e., 1D), then the input layer should use W ~ uniform(-1,1).
  2. However, if you do that, you will find that after the first layer (after the sine activation) you DON'T get the U-shaped beta distribution.
  3. What happened? The product of two i.i.d. uniform(-1,1) variables stays within [-1,1], which is less than a full period of the sine, so the output cannot reach the complete U-shaped beta distribution.
  4. The easy remedy is to introduce w_0 = 30, so that before the activation the linear outputs vary enough for the sine to be fully exercised.
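
The four steps above can be checked numerically. The following is a minimal NumPy sketch, not the repository's code: it assumes a single 1D input, draws one weight per sample, and uses a hypothetical helper `edge_mass` that measures how much probability sits near ±1 (where a U-shaped distribution piles up). The 0.9 threshold and the sample count are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 200_000

x = rng.uniform(-1.0, 1.0, N)  # 1D input, x ~ uniform(-1, 1)
w = rng.uniform(-1.0, 1.0, N)  # first-layer weight, W ~ uniform(-1, 1)

def edge_mass(y, thresh=0.9):
    """Fraction of samples with |y| > thresh (hypothetical helper)."""
    return float(np.mean(np.abs(y) > thresh))

plain = np.sin(w * x)          # no frequency scaling
scaled = np.sin(30.0 * w * x)  # with w_0 = 30

# Reference value: P(|Y| > 0.9) when Y follows the arcsine
# distribution on [-1, 1], i.e. Y = sin(theta) with theta uniform
# over a full period.
arcsine_edge = 1.0 - (2.0 / np.pi) * np.arcsin(0.9)

print("edge mass, sin(W x):    ", edge_mass(plain))
print("edge mass, sin(30 W x): ", edge_mass(scaled))
print("edge mass, arcsine ref.:", arcsine_edge)
```

Without the factor 30 the pre-activation W x never leaves [-1,1], so sin(W x) cannot exceed sin(1) ≈ 0.84 and the edge mass is exactly zero. With w_0 = 30 the edge mass should land close to the arcsine reference, though not exactly on it, since the product W x is not itself uniform.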

As for your derivation, I don't see how you get "var[sin(30Wx+b)]". Note that there is a sine: in general, you cannot compute the variance directly unless you know the output follows an arcsine distribution. Note also that the factor 30 plays a role in producing that arcsine distribution.
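
To make the variance point concrete, here is a small sketch under the same assumptions as the 1D case discussed above (single input, no bias). It relies on the standard fact that the arcsine distribution on [-1,1] has variance 1/2, and compares that against the empirical variance of the sine output with and without the factor 30. The variable names are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 200_000

# sin(theta) with theta uniform over one full period follows the
# arcsine distribution on [-1, 1]; its variance is 1/2.
theta = rng.uniform(-np.pi, np.pi, N)
var_arcsine = float(np.var(np.sin(theta)))

x = rng.uniform(-1.0, 1.0, N)  # x ~ uniform(-1, 1)
w = rng.uniform(-1.0, 1.0, N)  # W ~ uniform(-1, 1)

var_plain = float(np.var(np.sin(w * x)))          # no scaling
var_scaled = float(np.var(np.sin(30.0 * w * x)))  # w_0 = 30

print("var[sin(theta)]  :", var_arcsine)
print("var[sin(W x)]    :", var_plain)
print("var[sin(30 W x)] :", var_scaled)
```

In this setup the variance after the sine is bounded by 1 regardless of how large the pre-activation variance is, so scaling rules for the linear part do not carry through the activation. Only the scaled case should come out near the arcsine value of 1/2.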

pswpswpsw, Jun 09 '21 22:06