Breaking changes for v2
I'm raising this issue to list the possible breaking changes for a future v2 release of the library:
- Removing the `oracle` keyword argument and the code in the two `legacy.jl` files
- Adding the slack $\epsilon$ in the signature of the $J_E$ function (to support non-quadratic weighting)
- Consider changing the default arguments of `UnscentedKalmanFilter`
- Consider changing the default value of `transcription` in `NonLinMPC` to `MultipleShooting`
About the new exact Hessian feature, this could be a breaking change:
- Consider changing the default value of `hessian` in `NonLinMPC` and `MovingHorizonEstimator` to `true` (or perhaps `hessian=oracle`, since it is only supported if `oracle == true`).
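For concreteness, a rough sketch of a controller built with these proposed defaults (a hypothetical v2 call, not the actual v2 API; `estim` stands in for any state estimator):

using ModelPredictiveControl
# Hypothetical v2 call with the proposed defaults spelled out explicitly
# (in v1 the defaults are transcription=SingleShooting() and hessian=false):
mpc = NonLinMPC(estim; Hp=20, Hc=2, transcription=MultipleShooting(), hessian=true)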
If I compare the benchmarks of our arXiv paper with Ipopt.jl v1.13.0 (which includes a performance improvement for modifying the constant vector in linear constraints when the exact Hessian is activated) and ModelPredictiveControl.jl v1.13.2, on the inverted pendulum:
# ====== transcription=SingleShooting(), hessian=false =============
btime_NMPC_track_solver_IP = median(bm) = TrialEstimate(374.487 ms)
btime_NMPC_regul_solver_IP = median(bm) = TrialEstimate(327.766 ms)
btime_EMPC_track_solver_IP = median(bm) = TrialEstimate(364.620 ms)
btime_EMPC_regul_solver_IP = median(bm) = TrialEstimate(363.225 ms)
# ======= transcription=SingleShooting(), hessian=true =============
btime_NMPC_track_solver_IP = median(bm) = TrialEstimate(133.406 ms)
btime_NMPC_regul_solver_IP = median(bm) = TrialEstimate(119.391 ms)
btime_EMPC_track_solver_IP = median(bm) = TrialEstimate(136.490 ms)
btime_EMPC_regul_solver_IP = median(bm) = TrialEstimate(119.599 ms)
# ====== transcription=MultipleShooting(), hessian=false ============
btime_NMPC_track_solver_IP = median(bm) = TrialEstimate(251.237 ms)
btime_NMPC_regul_solver_IP = median(bm) = TrialEstimate(214.630 ms)
btime_EMPC_track_solver_IP = median(bm) = TrialEstimate(274.271 ms)
btime_EMPC_regul_solver_IP = median(bm) = TrialEstimate(318.969 ms)
# ====== transcription=MultipleShooting(), hessian=true ============
btime_NMPC_track_solver_IP = median(bm) = TrialEstimate(334.286 ms)
btime_NMPC_regul_solver_IP = median(bm) = TrialEstimate(269.657 ms)
btime_EMPC_track_solver_IP = median(bm) = TrialEstimate(951.969 ms)
btime_EMPC_regul_solver_IP = median(bm) = TrialEstimate(935.775 ms)
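(Each figure above is the median of a BenchmarkTools.jl trial, produced along these lines; a sketch, where `mpc` and the 60-step closed-loop scenario are assumptions standing in for the paper's benchmark scripts:)

using BenchmarkTools
# Hedged sketch of the harness behind the printed estimates; `mpc` and the
# 60-step scenario are assumptions, not the paper's exact script:
bm = @benchmark sim!($mpc, 60) samples=10 seconds=600
@show btime_NMPC_track_solver_IP = median(bm)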
The fastest approach seems to be SingleShooting() and hessian=true. But sparse Hessian computation performance is highly problem-dependent. In fact, the Hessian of the economic MPC with multiple shooting is oddly far more expensive.
In conclusion, I'm not totally convinced that changing the default of these two keyword arguments is worth it.
What's your opinion @baggepinnen ?
I don't have a strong opinion either way. The pendulum example may be too small to serve as the only deciding example; it's quite fast in either case. What happens with a much larger model? If MS+hessian makes a larger model take extremely long to compile or solve, it may not be the best default.
Yep, I'll verify if it's really the Hessian that takes this much time. ⌛️
- Should `optim` default to `Model(() -> UnoSolver.Optimizer(preset="filtersqp"))`?
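For reference, a sketch of how that default would be used (hypothetical; `estim` assumed):

using ModelPredictiveControl, JuMP
import UnoSolver
# Sketch of the proposed default optimizer (preset name from the question above):
optim = JuMP.Model(() -> UnoSolver.Optimizer(preset="filtersqp"))
mpc = NonLinMPC(estim; optim)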
I expect filtersqp to be faster in most cases, but I also expect it to be less robust when the initial guess for the first optimization is poor.
That's also my feeling, but I have not hit cases in which a poor initial guess led to divergence. I think MATLAB defaulted to an interior point method for nlmpc in the past, but it's now an SQP algorithm by default.
Yep, I'll verify if it's really the Hessian that takes this much time. ⌛️
Yep, all the time is spent in the sparse Hessian computation of the objective function.
The number of colors is way higher than in all the other cases:
ncolors(∇²J_prep) = 39
62×62 SparseArrays.SparseMatrixCSC{Bool, Int64} with 294 stored entries:
⎡⡿⣯⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠁⎤
⎢⡇⠀⠱⣦⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⎥
⎢⡇⠀⠀⠀⠱⣦⠀⠀⠀⠀⠀⠀⠀⠀⎥
⎢⡇⠀⠀⠀⠀⠀⠱⣦⡀⠀⠀⠀⠀⠀⎥
⎢⡇⠀⠀⠀⠀⠀⠀⠈⠱⣦⡀⠀⠀⠀⎥
⎢⡇⠀⠀⠀⠀⠀⠀⠀⠀⠈⠱⣦⡀⠀⎥
⎣⠇⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠈⠱⣦⎦
I will experiment with other coloring orders...
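Concretely, something like this to compare the color counts (a sketch; `J` and `Z0` stand in for the economic objective and the initial decision vector):

using ADTypes, DifferentiationInterface
using SparseConnectivityTracer, SparseMatrixColorings
import ForwardDiff

# Compare the number of Hessian colors per coloring order (sketch; J and Z0
# are stand-ins for the objective function and the initial decision vector):
for order in (NaturalOrder(), LargestFirst(), SmallestLast(),
              IncidenceDegree(), DynamicLargestFirst())
    backend = AutoSparse(
        AutoForwardDiff();
        sparsity_detector = TracerSparsityDetector(),
        coloring_algorithm = GreedyColoringAlgorithm(order),
    )
    ∇²J_prep = prepare_hessian(J, backend, Z0)
    println(nameof(typeof(order)), " => ", ncolors(∇²J_prep), " colors")
end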
The Hessian has an arrow shape? IIRC @gdalle and friends have some special trick with mixed-mode AD that can handle such cases?
Wait, how can the Hessian have an arrow shape with multiple shooting? I tend to get block-diagonal Hessians in this case. The arrow shape indicates that there is some variable that interacts with all the other variables; is this a minimum-time problem or something?
The Hessian has an arrow shape? IIRC @gdalle and friends have some special trick with mixed-mode AD that can handle such cases?
Mixed mode is only for Jacobians, for which you can compute rows with reverse mode or columns with forward mode. By symmetry, computing a row of the Hessian is the same as computing a column.
Here are a few things that can help:
- trying out different coloring orders (or trying them all)
- switching to a substitution-based decompression
- postprocessing won't help here since the diagonal seems full
In other words, I'd recommend going from
GreedyColoringAlgorithm()
to
GreedyColoringAlgorithm(
    (
        NaturalOrder(),
        LargestFirst(),
        SmallestLast(),
        IncidenceDegree(),
        DynamicLargestFirst(),
        RandomOrder(StableRNG(0), 0)
    ),
    decompression=:substitution
)
to see if it does anything.
Oh wow, the RandomOrder(StableRNG(0), 0) does the trick: 4 colors! Right now my default sparse Jacobian (edit: and Hessian) backend is:
AutoSparse(
    AutoForwardDiff();
    sparsity_detector = TracerSparsityDetector(),
    coloring_algorithm = GreedyColoringAlgorithm(
        (
            NaturalOrder(),
            LargestFirst(),
            SmallestLast(),
            IncidenceDegree(),
            DynamicLargestFirst()
        ),
        postprocessing = true
    )
)
that is, I did not include RandomOrder in the tuple. I was scared of including it by default: in my mind, it could lead to weird non-deterministic performance issues. I'm also trying to avoid new dependencies in the package if they are not strictly necessary. Two questions on RandomOrder:
- If I include it in the tuple above, is there a risk that someone gets e.g. a low number of colors one day, and a high number of colors the day after? I'm not sure I understand how the pseudo-random generator is used in this algorithm.
- What are the pros and cons of using generators from `Random` instead of `StableRNGs` (to avoid new dependencies)?
Wait, how can the Hessian have an arrow shape with multiple shooting? I tend to get block-diagonal Hessians in this case. The arrow shape indicates that there is some variable that interacts with all the other variables; is this a minimum-time problem or something?
The objective function is the sum of the tracking term, the move suppression term, and the energy consumption term, as described here. There is no slack variable $\epsilon$ since Cwt=Inf. The energy consumption term is the product of a state with the manipulated input. I'm still not sure why it gives an arrow shape right now...
edit: So the arrow shape is related to the (non-quadratic) energy consumption term, since if I do Ewt=0, I get:
ncolors(∇²J_prep) = 2
62×62 SparseArrays.SparseMatrixCSC{Bool, Int64} with 142 stored entries:
⎡⠱⣦⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⎤
⎢⠀⠀⠱⣦⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⎥
⎢⠀⠀⠀⠀⠱⣦⠀⠀⠀⠀⠀⠀⠀⠀⎥
⎢⠀⠀⠀⠀⠀⠀⠱⣦⡀⠀⠀⠀⠀⠀⎥
⎢⠀⠀⠀⠀⠀⠀⠀⠈⠱⣦⡀⠀⠀⠀⎥
⎢⠀⠀⠀⠀⠀⠀⠀⠀⠀⠈⠱⣦⡀⠀⎥
⎣⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠈⠱⣦⎦
edit2: it may be related to the fact that the decision variables are the manipulated input increments over the control horizon and the states over the prediction horizon.
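My rough reading (an assumption on my part, based on the usual incremental-input parameterization):

$$u(k) = u(k-1) + \Delta u(k) \quad\Rightarrow\quad u(k) = u(-1) + \sum_{j=0}^{k} \Delta u(j),$$

so an energy term proportional to $x(k)^\top u(k)$ has a nonzero cross-derivative with respect to $(\Delta u(j), x(k))$ for every $j \le k$: each input increment couples with every later state, which fills the $\Delta U$ block row and column of the Hessian and produces the arrow.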
Note that postprocessing doesn't do anything for unidirectional Jacobians. Like the decompression by substitution, it only applies to Hessians or bidirectional Jacobians.
The API of RandomOrder is a bit clumsy: first you give the generator, then you give the seed to reset it with before each coloring. If you don't give that seed, you will see different numbers of colors at every run of the preparation. So as long as you provide that seed, results are reproducible... up to Julia version. Indeed, Random (the stdlib) doesn't guarantee stability of the exact stream of random numbers produced from one Julia version to the next. Hence my recommendation to use StableRNGs instead for full reproducibility.
As a side note, the zero inside StableRNG(0) doesn't matter because the generator will be re-seeded anyway.
If you're curious you can look at the code, it is dead simple: https://github.com/gdalle/SparseMatrixColorings.jl/blob/main/src/order.jl#L57-L64
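In code, the two variants would look like this (a sketch, assuming the reset seed can simply be omitted, as described above):

using StableRNGs, SparseMatrixColorings
# Reproducible: the generator is re-seeded with 0 before every coloring
order_stable = RandomOrder(StableRNG(0), 0)
# No reset seed (assumed allowed, per the explanation above): a potentially
# different number of colors at every preparation
order_volatile = RandomOrder(StableRNG(0))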
Note that postprocessing doesn't do anything for unidirectional Jacobians. Like the decompression by substitution, it only applies to Hessians or bidirectional Jacobians.
Okay, I just used the same default sparse backend for the Jacobians and for the Hessians to avoid code duplication (i.e. don't repeat yourself). I understand that these options will just be ignored for the Jacobians, which is okay with me.
The API of RandomOrder is a bit clumsy: first you give the generator, then you give the seed to reset it with before each coloring. If you don't give that seed, you will see different numbers of colors at every run of the preparation. So as long as you provide that seed, results are reproducible... up to Julia version. Indeed, Random (the stdlib) doesn't guarantee stability of the exact stream of random numbers produced from one Julia version to the next. Hence my recommendation to use StableRNGs instead for full reproducibility. As a side note, the zero inside StableRNG(0) doesn't matter because the generator will be re-seeded anyway. If you're curious you can look at the code, it is dead simple: https://github.com/gdalle/SparseMatrixColorings.jl/blob/main/src/order.jl#L57-L64
Alright. 190 lines of code for StableRNGs: it seems very lightweight. I will add RandomOrder in the tuple and the StableRNGs dependency.
Okay, I just used the same default sparse backend for the Jacobians and for the Hessians to avoid code duplication (i.e. don't repeat yourself). I understand that these options will just be ignored for the Jacobians, which is okay with me.
Perfect.
Alright. 190 lines of code for StableRNGs: it seems very lightweight. I will add RandomOrder in the tuple and the StableRNGs dependency.
I would assume it's a very stable library anyway.
Does it change the running time significantly for some instances (not just the number of colors)?
Does it change the running time significantly for some instances (not just the number of colors)?
Yeah, it's about 2.4 times faster. Once more:

This new coloring order seems really beneficial for more complex nonlinear objectives like this one (minimizing the energy consumption).
If I compare the benchmarks of our arXiv paper with Ipopt.jl v1.13.0 (which includes a performance improvement for modifying the constant vector in linear constraints when the exact Hessian is activated) and ModelPredictiveControl.jl v1.13.2, on the inverted pendulum:
So, here are the benchmarks with the latest update, which includes the RandomOrder in the tuple by default. I only re-executed the transcription=SingleShooting(), hessian=true and transcription=MultipleShooting(), hessian=true case studies:
# ====== transcription=SingleShooting(), hessian=false =============
btime_NMPC_track_solver_IP = median(bm) = TrialEstimate(374.487 ms)
btime_NMPC_regul_solver_IP = median(bm) = TrialEstimate(327.766 ms)
btime_EMPC_track_solver_IP = median(bm) = TrialEstimate(364.620 ms)
btime_EMPC_regul_solver_IP = median(bm) = TrialEstimate(363.225 ms)
# ====== transcription=SingleShooting(), hessian=true ============
btime_NMPC_track_solver_IP = median(bm) = TrialEstimate(147.681 ms)
btime_NMPC_regul_solver_IP = median(bm) = TrialEstimate(122.707 ms)
btime_EMPC_track_solver_IP = median(bm) = TrialEstimate(139.279 ms)
btime_EMPC_regul_solver_IP = median(bm) = TrialEstimate(154.855 ms)
# ====== transcription=MultipleShooting(), hessian=false ============
btime_NMPC_track_solver_IP = median(bm) = TrialEstimate(251.237 ms)
btime_NMPC_regul_solver_IP = median(bm) = TrialEstimate(214.630 ms)
btime_EMPC_track_solver_IP = median(bm) = TrialEstimate(274.271 ms)
btime_EMPC_regul_solver_IP = median(bm) = TrialEstimate(318.969 ms)
# ====== transcription=MultipleShooting(), hessian=true ============
btime_NMPC_track_solver_IP = median(bm) = TrialEstimate(347.680 ms)
btime_NMPC_regul_solver_IP = median(bm) = TrialEstimate(296.589 ms)
btime_EMPC_track_solver_IP = median(bm) = TrialEstimate(400.592 ms)
btime_EMPC_regul_solver_IP = median(bm) = TrialEstimate(342.868 ms)