owl
softmax is not twice-differentiable
The softmax operation in Algodiff is not twice-differentiable.
and softmax ?(axis = -1) x =
  let c = Arr A.(max ~axis (unpack_arr x)) in
  let y = exp (x - c) in
  let a = sum ~axis y in
  y / a
The current implementation calls unpack_arr on x when computing c, and unpack_arr cannot be differentiated.
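For context, here is a minimal plain-float sketch of what the snippet computes. It uses no Owl or Algodiff, and softmax_shifted is a hypothetical helper added only for illustration. It assumes a non-empty input array. The shift c cancels in y / a, so the result does not depend on c mathematically; the max is there only for numerical stability. The sketch does not address the differentiability problem itself.

```ocaml
(* Plain-float sketch of the max-shifted softmax; not Owl's implementation. *)
let softmax (x : float array) : float array =
  (* c plays the role of the max taken in the snippet above *)
  let c = Array.fold_left max neg_infinity x in
  let y = Array.map (fun v -> exp (v -. c)) x in
  let a = Array.fold_left ( +. ) 0.0 y in
  Array.map (fun v -> v /. a) y

(* Hypothetical helper: same computation with an arbitrary constant shift. *)
let softmax_shifted (c : float) (x : float array) : float array =
  let y = Array.map (fun v -> exp (v -. c)) x in
  let a = Array.fold_left ( +. ) 0.0 y in
  Array.map (fun v -> v /. a) y

let () =
  let x = [| 1.0; 2.0; 3.0 |] in
  let p = softmax x in
  let q = softmax_shifted 0.0 x in
  (* Print both results side by side; they should agree up to rounding. *)
  Array.iteri (fun i pi -> Printf.printf "%d: %.6f %.6f\n" i pi q.(i)) p
```

Because the two columns agree, c can in principle be treated as a constant with respect to x without changing the value of softmax.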