siren
A question about first_layer_sine_init
I am confused about first_layer_sine_init, where you set W ~ uniform(-1/n, 1/n).
As we know, the input is x ~ uniform(-1,1), so var[x] = 2^2/12 = 1/3. After the FC layer, with W ~ uniform(-c,c), I get var[sin(30Wx+b)] = 30^2 * n * (1/3) * (c^2/3) = 1? So why do you initialize the first-layer weights with uniform(-1/n, 1/n)?
The complete logic is:
- If the input is only x, i.e. 1D (n = 1), then the input layer should be W ~ uniform(-1,1).
- However, if you do only that, you will find that after the first layer (after the sine activation) you DON'T get the U-shaped beta distribution.
- What happened? The product of two independent uniform(-1,1) variables stays within [-1,1] and is concentrated near 0. That range covers less than one period of the sine, so the output cannot reach the complete U-shaped beta distribution.
- The easy remedy is to introduce w_0 = 30, so that the linear outputs before the activation vary enough to sweep through many periods of the sine.
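The points above can be checked numerically. The following is only a minimal NumPy sketch for the 1D case, not the repository's code: the helper names `first_layer_output` and `edge_mass` are made up for illustration, the bias is ignored, and the weight is resampled for every input sample so that we look at the distribution over both x and W.

```python
import numpy as np


def first_layer_output(w0, n_samples=200_000, seed=0):
    """Sample sin(w0 * W * x) with x, W ~ uniform(-1, 1), 1D input, no bias.

    Hypothetical helper for illustration; the bias term is an omitted assumption.
    """
    rng = np.random.default_rng(seed)
    x = rng.uniform(-1.0, 1.0, n_samples)  # input coordinate
    w = rng.uniform(-1.0, 1.0, n_samples)  # first-layer weight for n = 1
    return np.sin(w0 * w * x)


def edge_mass(y, threshold=0.9):
    """Fraction of samples with |y| > threshold (large for a U-shaped density)."""
    return float(np.mean(np.abs(y) > threshold))


if __name__ == "__main__":
    for w0 in (1.0, 30.0):
        y = first_layer_output(w0)
        print(f"w0={w0:5.1f}  var={y.var():.3f}  mass(|y|>0.9)={edge_mass(y):.3f}")
```

Without the factor 30 the pre-activation never leaves [-1, 1], so |sin| cannot exceed sin(1) ≈ 0.84 and no mass reaches the edges. With w_0 = 30 the variance should come out close to 1/2, which is the variance of an arcsine distribution on [-1, 1].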
As for your derivation, I don't see how you get "var[sin(30Wx+b)]". Note that there is a sine: in general you cannot compute that variance directly unless you know the output follows an arcsine distribution. Note also that the factor 30 plays a role in producing that arcsine distribution.
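To make the distinction concrete, here is a short worked calculation. It is a sketch under assumptions that are not stated in the thread: the bias is ignored, and the weights W_i ~ uniform(-c, c) and inputs x_i ~ uniform(-1, 1) are all independent. Under those assumptions the expression in the question appears to be the variance of the pre-activation z (the argument of the sine), not of the sine output y.

```latex
\begin{align*}
z &= 30 \sum_{i=1}^{n} W_i x_i, \qquad
\operatorname{Var}[W_i] = \frac{c^2}{3}, \qquad
\operatorname{Var}[x_i] = \frac{1}{3}, \\
\operatorname{Var}[z] &= 30^2 \, n \cdot \frac{c^2}{3} \cdot \frac{1}{3} = 100\, n\, c^2,
\qquad \text{so } c = \tfrac{1}{n} \text{ gives } \operatorname{Var}[z] = \frac{100}{n}. \\
\intertext{If the output $y = \sin(z)$ follows an arcsine distribution on $[-1, 1]$, then}
\operatorname{Var}[y] &= \frac{1}{2}, \quad \text{independent of } \operatorname{Var}[z].
\end{align*}
```

So for n = 1 the pre-activation has a standard deviation of about 10, which is wide enough to cover several periods of the sine.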