
Derivatives of max and min

amrods opened this issue · 2 comments

Should we enforce that the derivatives of max(x, y) and min(x, y) are undefined when x == y?

amrods — Oct 21 '21

I think the more useful result is to pick something like a subgradient. The use case to imagine is an optimisation problem where equality might be the best or worst point. We'd like gradient descent to be able to end up there, or not to get stuck there, both of which want a finite gradient.

You could argue that these should pick a symmetric convention, though:

julia> ForwardDiff.gradient(x -> max(x...), [1,1])
2-element Vector{Int64}:
 0
 1

julia> ForwardDiff.derivative(x -> clamp(x, 1, 2), 1)
1

julia> ForwardDiff.derivative(x -> clamp(x, 1, 2), 2)
1
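One symmetric convention would split the subgradient evenly at a tie. A hypothetical sketch (this is not what DiffRules currently implements, and `dmax` is an illustrative name, not an existing function):

```julia
# Hypothetical symmetric subgradient rule for max(x, y):
# the larger argument gets the full derivative, and at a tie
# each argument receives 1/2 so the two partials sum to 1.
function dmax(x, y)
    x > y && return (1.0, 0.0)
    x < y && return (0.0, 1.0)
    return (0.5, 0.5)
end

dmax(2, 1)  # (1.0, 0.0)
dmax(1, 1)  # (0.5, 0.5)
```

Any convex combination at the tie is a valid subgradient of max; the even split is just the symmetric choice, matching what FiniteDiff reports below.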

mcabbott — Oct 23 '21

Yes, the problem I had came up when optimizing a function with max(x, y) inside it. I was checking the gradients with ForwardDiff and noticed the asymmetry. Worse, though, is FiniteDiff:

julia> FiniteDiff.finite_difference_gradient(x -> max(x...), [1.0, 1.0])
2-element Vector{Float64}:
 0.4999999999934427
 0.4999999999934427
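The 0.5 values are what a central difference produces at a tie: perturbing either coordinate of [1.0, 1.0] by +h raises the max by h, while perturbing it by -h leaves the max unchanged, so each partial comes out to (h - 0)/(2h) = 0.5. A minimal sketch, assuming a plain central-difference formula (not FiniteDiff's exact internals):

```julia
# Central difference of f(x) = max(x[1], x[2]) at the tie [1.0, 1.0],
# perturbing only the first coordinate.
f(x) = max(x[1], x[2])
h = 1e-6
# Forward bump gives 1 + h, backward bump still gives 1,
# so the quotient is h / (2h) = 0.5 for each coordinate.
g1 = (f([1.0 + h, 1.0]) - f([1.0 - h, 1.0])) / (2h)
```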

amrods — Nov 01 '21