Get rid of ugly macro?
I was chatting with @adrhill and he suggested that the macro @primitive could be discarded if each backend simply implemented some methods from AbstractDifferentiation, mostly jacobian and a pushforward or pullback. Thoughts?
To elaborate on this:
What the macro does
As far as I understand, the @primitive macro is used on pullbacks/pushforwards from individual backends to generate the following AD.jacobian functions:
Forward-mode AD: https://github.com/JuliaDiff/AbstractDifferentiation.jl/blob/211b67528c5ed91971bb524c57adb63837163367/src/AbstractDifferentiation.jl#L600-L634
Reverse-mode AD: https://github.com/JuliaDiff/AbstractDifferentiation.jl/blob/211b67528c5ed91971bb524c57adb63837163367/src/AbstractDifferentiation.jl#L636-L663
These functions compute full Jacobians by evaluating the pullbacks/pushforwards on the standard basis (identity_like).
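To make the standard-basis construction concrete, here is a minimal sketch in plain Julia. It does not use AbstractDifferentiation or any AD backend: the function `f`, its hand-written `pushforward` and `pullback`, and the helpers `basis`, `jacobian_from_pushforward` and `jacobian_from_pullback` are all hypothetical names made up for illustration.

```julia
# Toy function f: R^2 -> R^3 with hand-written derivative rules (no AD backend involved).
f(x) = [x[1] * x[2], x[1] + x[2], x[2]^2]

# Hand-written pushforward (JVP): v -> J(x) * v
pushforward(x) = v -> [x[2] * v[1] + x[1] * v[2], v[1] + v[2], 2 * x[2] * v[2]]

# Hand-written pullback (VJP): w -> J(x)' * w
pullback(x) = w -> [x[2] * w[1] + w[2], x[1] * w[1] + w[2] + 2 * x[2] * w[3]]

# Standard basis vector e_i of length n.
basis(n, i) = [j == i ? 1.0 : 0.0 for j in 1:n]

# Forward mode: each JVP with e_i yields column i of the Jacobian (n = input length).
jacobian_from_pushforward(pf, n) = reduce(hcat, [pf(basis(n, i)) for i in 1:n])

# Reverse mode: each VJP with e_i yields row i of the Jacobian (m = output length).
jacobian_from_pullback(pb, m) = reduce(vcat, [permutedims(pb(basis(m, i))) for i in 1:m])

x = [2.0, 3.0]
J_fwd = jacobian_from_pushforward(pushforward(x), 2)  # 2 JVP evaluations
J_rev = jacobian_from_pullback(pullback(x), 3)        # 3 VJP evaluations
```

Both routes produce the same 3×2 matrix. The forward route needs one evaluation per input dimension and the reverse route one per output dimension.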
Fallback behavior
By default, the fallback jacobian function is empty (maybe this should be replaced by a NotImplementedError):
https://github.com/JuliaDiff/AbstractDifferentiation.jl/blob/211b67528c5ed91971bb524c57adb63837163367/src/AbstractDifferentiation.jl#L82
As shown in the implementer guide, this jacobian function is the fallback at the core of most functions exported by AbstractDifferentiation.
Taking reverse-mode AD as an example, the function dependency graph of value_and_pullback_function would look as follows:
- value_and_pullback_function calls jacobian
- jacobian is an empty function
Now suppose a reverse-mode AD backend is loaded: value_and_pullback_function is defined for the backend and @primitive is called on it. The function dependency graph is then inverted:
- value_and_pullback_function calls the backend
- a new generated jacobian calls value_and_pullback_function
The second behaviour is desired, as we wouldn't want to compute a full Jacobian just to compute a VJP when we can instead evaluate the pullback directly.
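The inversion can be sketched with plain multiple dispatch. This is only an illustration of the control flow, not the package's actual code: `ToyBackend`, `ToyReverse`, `toy_jacobian` and `toy_pullback` are hypothetical names, and the "backend" simply carries a hand-written VJP rule instead of a real AD engine.

```julia
abstract type ToyBackend end

# Default state: there is no usable Jacobian, and the generic pullback is
# defined in terms of it (pullback -> jacobian).
toy_jacobian(b::ToyBackend, f, x) = error("no jacobian for $(typeof(b))")
toy_pullback(b::ToyBackend, f, x) = w -> toy_jacobian(b, f, x)' * w

# Hypothetical reverse-mode backend holding a hand-written rule
# (x, w) -> J(x)' * w in place of a real AD engine.
struct ToyReverse{R} <: ToyBackend
    rule::R
end

# The backend overrides the pullback natively ...
toy_pullback(b::ToyReverse, f, x) = w -> b.rule(x, w)

# ... and the Jacobian is now built from the pullback (jacobian -> pullback).
# In this sketch the method is written by hand rather than generated.
function toy_jacobian(b::ToyReverse, f, x)
    pb = toy_pullback(b, f, x)
    m = length(f(x))
    rows = [permutedims(pb([j == i ? 1.0 : 0.0 for j in 1:m])) for i in 1:m]
    return reduce(vcat, rows)
end

g(x) = [x[1]^2, x[1] * x[2]]
g_rule(x, w) = [2 * x[1] * w[1] + x[2] * w[2], x[1] * w[2]]

backend = ToyReverse(g_rule)
vjp = toy_pullback(backend, g, [1.0, 2.0])([1.0, 1.0])  # direct VJP, no Jacobian built
J = toy_jacobian(backend, g, [1.0, 2.0])                # Jacobian assembled from VJPs
```

Which method calls which depends entirely on which methods exist for the backend type, which is why the flow is hard to follow from the source alone.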
The fact that the function dependency graph is flipped was very confusing to me at first. A lot of hidden control flow is added via package extensions and the @primitive macro, which currently isn't documented in the implementer guide.
Back to the question
Why is AD.jacobian so central to AbstractDifferentiation.jl and why does it have to be generated via a macro? Can't it be implemented in a more generic way by making sure pullbacks and pushforward wrappers have consistent output types?
The only advantage I currently see is to allow users to
- compute VJPs by constructing a full Jacobian using JVPs
- compute JVPs by constructing a full Jacobian using VJPs
but those sound like things that should usually be avoided.
Why isn't AbstractDifferentiation.jl built around two primitives, value_and_pullback_function and value_and_pushforward [^1], and why doesn't it make more liberal use of dispatch on the AbstractReverseMode and AbstractForwardMode types?
[^1]: Ideally with in-place mutating variants.
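A rough sketch of what such a design could look like, under several assumptions: `SketchForwardMode` and `SketchReverseMode` stand in for the abstract mode types, the signatures of `sketch_value_and_pushforward` and `sketch_value_and_pullback_function` are guesses rather than the package's API, and both toy backends (`FiniteDiffForward`, `RuleReverse`) are hypothetical.

```julia
# Stand-ins for the forward-mode / reverse-mode abstract types named above.
abstract type SketchForwardMode end
abstract type SketchReverseMode end

unit(n, i) = [j == i ? 1.0 : 0.0 for j in 1:n]

# One generic Jacobian per mode, written once in terms of that mode's primitive.
function sketch_jacobian(b::SketchForwardMode, f, x)
    n = length(x)
    cols = [sketch_value_and_pushforward(b, f, x, unit(n, i))[2] for i in 1:n]
    return reduce(hcat, cols)
end

function sketch_jacobian(b::SketchReverseMode, f, x)
    y, pb = sketch_value_and_pullback_function(b, f, x)
    m = length(y)
    return reduce(vcat, [permutedims(pb(unit(m, i))) for i in 1:m])
end

# Hypothetical forward-style backend: directional finite differences as the JVP.
struct FiniteDiffForward <: SketchForwardMode
    step::Float64
end
function sketch_value_and_pushforward(b::FiniteDiffForward, f, x, v)
    y = f(x)
    return y, (f(x .+ b.step .* v) .- y) ./ b.step
end

# Hypothetical reverse-style backend carrying a hand-written VJP rule.
struct RuleReverse{R} <: SketchReverseMode
    rule::R
end
sketch_value_and_pullback_function(b::RuleReverse, f, x) = (f(x), w -> b.rule(x, w))

k(x) = [x[1] * x[2], x[2]^2]
k_rule(x, w) = [x[2] * w[1], x[1] * w[1] + 2 * x[2] * w[2]]

J_forward = sketch_jacobian(FiniteDiffForward(1e-6), k, [2.0, 3.0])
J_reverse = sketch_jacobian(RuleReverse(k_rule), k, [2.0, 3.0])
```

In this layout a backend implements exactly one primitive and subtypes the matching mode type. The Jacobian then comes from ordinary dispatch, with no per-backend code generation.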
Duplicate of https://github.com/JuliaDiff/AbstractDifferentiation.jl/issues/13, or at least https://github.com/JuliaDiff/AbstractDifferentiation.jl/issues/13#issuecomment-1664642912 and the following discussion?
Why is AD.jacobian so central to AbstractDifferentiation.jl
Why isn't AbstractDifferentiation.jl built around two primitives value_and_pullback_function and value_and_pushforward
Historical reasons, IIRC: mainly that the original author had a strong enough understanding of the calculus involved, but not such a strong understanding of autodiff or Julia abstractions, and that the priority was getting something out that worked and was usable. It should be.
This issue is my fault. Feel free to remove the macro if it makes things simpler.
BTW, regarding
Why is AD.jacobian so central to AbstractDifferentiation.jl and why does it have to be generated via a macro? Can't it be implemented in a more generic way by making sure pullbacks and pushforward wrappers have consistent output types?
https://github.com/JuliaDiff/AbstractDifferentiation.jl/pull/95 trimmed down the macro; it can now only be used to implement the jacobian based on a pushforward_function or a value_and_pullback_function. Support for ReverseDiff and FiniteDifferences is already implemented without the macro, and e.g. ForwardDiff uses the automatically constructed jacobian function only for functions with multiple arguments (the single-argument version just calls ForwardDiff.jacobian).
As I mentioned in https://github.com/JuliaDiff/AbstractDifferentiation.jl/issues/13#issuecomment-1994191014 and https://github.com/JuliaDiff/AbstractDifferentiation.jl/issues/123#issuecomment-1880412967, I am ok with removing the macro. It is currently a thin wrapper over a pushforward or pullback definition. Feel free to open a PR.