Need better documentation, examples, and error messages for using HessianConfig with the DiffResult API
The following works for gradient!():
using DiffBase, ReverseDiff
f(x) = sum(sin, x)+prod(tan, x)*sum(sqrt, x);
x = rand(4);
result = DiffBase.GradientResult(x);
rcfg = ReverseDiff.GradientConfig(x);
ReverseDiff.gradient!(result, f, x, rcfg);
DiffBase.value(result)
DiffBase.gradient(result)
However, the Hessian analogue of the above fails:
using DiffBase, ReverseDiff
f(x) = sum(sin, x)+prod(tan, x)*sum(sqrt, x);
x = rand(4);
result = DiffBase.HessianResult(x);
rcfg = ReverseDiff.HessianConfig(x);
ReverseDiff.hessian!(result, f, x, rcfg);
DiffBase.value(result)
DiffBase.gradient(result)
DiffBase.hessian(result)
Hessians are slightly different: if you are using a DiffResult, you need to pass it to the HessianConfig constructor. For example, this should work:
using DiffBase, ReverseDiff
f(x) = sum(sin, x)+prod(tan, x)*sum(sqrt, x);
x = rand(4);
result = DiffBase.HessianResult(x);
rcfg = ReverseDiff.HessianConfig(result, x); # let the HessianConfig know you'll be using `result`
ReverseDiff.hessian!(result, f, x, rcfg);
DiffBase.value(result)
DiffBase.gradient(result)
DiffBase.hessian(result)
This usage appears in the tests, but it is poorly documented and we don't have example code for it. The error message is also unhelpful.
We should document this better and provide examples. This might also be a case where we could easily add an informative error message. Alternatively, we might actually be able to allow HessianConfig(x) to work for the second example, though it would be a less efficient way to use the configuration API.
I'll keep this issue open, but rename it to reflect the actual problem (poor docs + examples + error message).
Thanks a lot @jrevels, your clarifications helped me understand the Hessian API better. I have one more question, and it is quite likely that I am doing something wrong in what follows. When I try to use a tape in the above example, it breaks:
using DiffBase, ReverseDiff
f(x) = sum(sin, x)+prod(tan, x)*sum(sqrt, x);
x = rand(4);
result = DiffBase.HessianResult(x);
rcfg = ReverseDiff.HessianConfig(result, x); # let the HessianConfig know you'll be using `result`
ftape = ReverseDiff.HessianTape(f, x);
cftape = ReverseDiff.compile(ftape);
ReverseDiff.hessian!(result, cftape, x, rcfg);
DiffBase.value(result)
DiffBase.gradient(result)
DiffBase.hessian(result)
ReverseDiff.hessian!(result, cftape, x, rcfg) should just be ReverseDiff.hessian!(result, cftape, x), without the config argument (this is probably another thing that could be better documented and could throw a better error).
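For reference, here is the tape example with that one change applied. This is only a sketch: it assumes the DiffBase-era API used throughout this thread and has not been checked against a specific release. The variable H is introduced here purely for illustration.

```julia
using DiffBase, ReverseDiff

f(x) = sum(sin, x) + prod(tan, x) * sum(sqrt, x)
x = rand(4)

# Preallocate storage for the value, gradient, and Hessian.
result = DiffBase.HessianResult(x)

# Record the tape once, then compile it for repeated use.
ftape = ReverseDiff.HessianTape(f, x)
cftape = ReverseDiff.compile(ftape)

# With a tape, call hessian! without a config argument.
ReverseDiff.hessian!(result, cftape, x)

# H is an illustrative name for the Hessian stored in `result`.
H = DiffBase.hessian(result)
```

Since no HessianConfig is passed in this form, the rcfg line from the earlier snippet is not needed here.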
Thanks @jrevels, I now follow the API.