ReverseDiff.jl

Need better documentation/examples/error messages for using HessianConfig with DiffResult API

Open papamarkou opened this issue 8 years ago • 4 comments

The following works for gradient!():

using DiffBase, ReverseDiff
f(x) = sum(sin, x)+prod(tan, x)*sum(sqrt, x);
x = rand(4);
result = DiffBase.GradientResult(x);
rcfg = ReverseDiff.GradientConfig(x);
ReverseDiff.gradient!(result, f, x, rcfg);
DiffBase.value(result)
DiffBase.gradient(result)

However, the Hessian analogue of the above fails:

using DiffBase, ReverseDiff
f(x) = sum(sin, x)+prod(tan, x)*sum(sqrt, x);
x = rand(4);
result = DiffBase.HessianResult(x);
rcfg = ReverseDiff.HessianConfig(x);
ReverseDiff.hessian!(result, f, x, rcfg);
DiffBase.value(result)
DiffBase.gradient(result)
DiffBase.hessian(result)

papamarkou (Mar 25 '17 16:03)

Hessians are slightly different - if you are using a DiffResult, you need to pass it to the HessianConfig. For example, this should work:

using DiffBase, ReverseDiff
f(x) = sum(sin, x)+prod(tan, x)*sum(sqrt, x);
x = rand(4);
result = DiffBase.HessianResult(x);
rcfg = ReverseDiff.HessianConfig(result, x); # let the HessianConfig know you'll be using `result`
ReverseDiff.hessian!(result, f, x, rcfg);
DiffBase.value(result)
DiffBase.gradient(result)
DiffBase.hessian(result)

This can be found in the tests, but isn't documented very well at all, and we don't have example code for it. The error message is also horrible.

We should document this better and provide examples. This might also be a case where we could easily add an informative error message. Alternatively, we might actually be able to allow HessianConfig(x) to work for the second example, though it would be a less efficient way to use the configuration API.

I'll keep this issue open, but rename it to reflect the actual problem (poor docs + examples + error message).

jrevels (Mar 28 '17 14:03)

Thanks a lot @jrevels, your clarifications helped me understand the Hessian API better. I have one more question, and it is quite likely that I am doing something wrong in what follows. When I try to use a tape in the above example, it breaks:

using DiffBase, ReverseDiff
f(x) = sum(sin, x)+prod(tan, x)*sum(sqrt, x);
x = rand(4);
result = DiffBase.HessianResult(x);
rcfg = ReverseDiff.HessianConfig(result, x); # let the HessianConfig know you'll be using `result`
ftape = ReverseDiff.HessianTape(f, x);
cftape = ReverseDiff.compile(ftape);
ReverseDiff.hessian!(result, cftape, x, rcfg);
DiffBase.value(result)
DiffBase.gradient(result)
DiffBase.hessian(result)

papamarkou (Mar 28 '17 20:03)

ReverseDiff.hessian!(result, cftape, x, rcfg) should just be ReverseDiff.hessian!(result, cftape, x), without the config argument (this is probably another thing that could be better documented and could throw a better error).
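
For reference, here is a minimal sketch of the full tape-based example with that change applied. It assumes the same DiffBase-era API used throughout this thread and has not been run here. The only difference from the failing version is that no HessianConfig is built or passed when hessian! is called on a tape.

```julia
using DiffBase, ReverseDiff

f(x) = sum(sin, x) + prod(tan, x) * sum(sqrt, x)
x = rand(4)

# Preallocated container for the value, gradient, and Hessian.
result = DiffBase.HessianResult(x)

# Record f once at x, then compile the recorded tape for re-use.
ftape = ReverseDiff.HessianTape(f, x)
cftape = ReverseDiff.compile(ftape)

# With a (compiled) tape, hessian! is called without a config argument.
ReverseDiff.hessian!(result, cftape, x)

DiffBase.value(result)
DiffBase.gradient(result)
DiffBase.hessian(result)
```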

jrevels (Mar 29 '17 14:03)

Thanks @jrevels, I now follow the API.

papamarkou (Mar 29 '17 14:03)