Investigate splitting factor operational part from 'constant' part of factor
Why?
- Simplify serialization: pack/unpack only depends on the factor itself and does not need the full factor graph
- Simplify `PackedFunctionNodeData{T}`, `GenericFunctionNodeData{T} where T <: Union{<:AbstractPackedFactor, <:AbstractFactor, <:FactorOperationalMemory}`
Questions to be answered
If the factor can be split so that solver data only contains the working memory, serialization becomes easy: solver data is simply ignored when packing and rehydrated when unpacked for use by the solver, and there can possibly be no PackedFactor at all.
- What fields are always constant?
  - DF: Almost everything in `FactorNodeData` is constant and can move to `FactorDFG` -- but we need to check how `CCW` is built upon `FactorCompute`.
- Will it impact more complicated factor types?
  - No, this should not touch the current `usrfnc` field functionality.
- Should we build factor data (w.r.t. the compute part, `FactorOperationalMemory`/CCW) lazily from DFG and only "inflate" it when used in IIF? That is, move `reconstFactorData` out of DFG?
  - JT: I believe so; that way we don't need a fg to unpack a factor, and we don't need to calculate the ccw if it is never used (e.g. plotting).
'constant' in this case means not changed by the solver.
In other words: should we move the non-`FactorOperationalMemory` fields to `DFGFactor`? Maybe `fnc`, `multihypo`, `nullhypo`, `inflation`? That way we split `solverData <: FactorOperationalMemory` from the constants.
Note on CCW: currently, only `hyporecipe.certainhypo` and `usrfnc!` are pulled out of CCW to be persisted; the rest is thrown away.
Also a code TODO: improve `_createCCW` for `hypotheses` and `certainhypo` field recovery when deserializing/reconstituting from stored data.
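As a hedged sketch (hypothetical helper name, not the actual IIF API), the persisted slice of a CCW is roughly:

```julia
# Only the user factor and the certain-hypothesis indices survive persistence;
# every other CCW field is rebuilt when the factor is next inflated.
persistableCCW(ccw) = (; usrfnc = ccw.usrfnc!, certainhypo = ccw.hyporecipe.certainhypo)
```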
Factor Field Comparison
| Field | FactorDFG | FactorCompute |
|---|---|---|
| id | `Union{UUID, Nothing}` | `Union{UUID, Nothing}` |
| label | `Symbol` | `Symbol` |
| tags | `Vector{Symbol}` | `Set{Symbol}` |
| _variableOrderSymbols | `Vector{Symbol}` | `NTuple{N, Symbol}` |
| timestamp | `ZonedDateTime` | `ZonedDateTime` |
| nstime | `String` | `Nanosecond` |
| solvable | `Int` | `Base.RefValue{Int}` |
| metadata | `String` | `smallData::Dict{Symbol, SmallDataTypes}` |
| solverData | `data::String` | `Base.RefValue{GenericFunctionNodeData{T}}` |
| fnctype | `String` | Not present |
| _version | `String` | Not present |
```julia
@kwdef struct CommonConvWrapper{T <: AbstractFactor, VT <: Tuple, TP <: Base.RefValue{<:Tuple}, CT, AM <: AbstractManifold, HR <: HypoRecipeCompute, MT, G} <: FactorOperationalMemory
  usrfnc!::T                    # the user-defined factor (observation function)
  fullvariables::VT             # variables connected to this factor
  varValsAll::TP                # current variable values used during solve
  dummyCache::CT = nothing
  manifold::AM = getManifold(usrfnc!)
  partialDims::Vector{<:Integer} = collect(1:manifold_dimension(manifold))
  partial::Bool = false
  nullhypo::Float64 = 0.0
  inflation::Float64 = SolverParams().inflation
  hyporecipe::HR = HypoRecipeCompute(; activehypo = collect(1:length(varValsAll)))
  measurement::Vector{MT} = Vector(Vector{Float64}())
  varidx::Base.RefValue{Int} = Ref(1)
  particleidx::Base.RefValue{Int} = Ref(1)
  res::Vector{Float64} = zeros(manifold_dimension(manifold))
  _gradients::G = nothing
end
```
```julia
@kwdef mutable struct GenericFunctionNodeData{T <: Union{<:AbstractPackedFactor, <:AbstractFactor, <:FactorOperationalMemory}}
  eliminated::Bool = false
  potentialused::Bool = false
  edgeIDs::Vector{Int} = Int[]
  fnc::T
  multihypo::Vector{Float64} = Float64[]
  certainhypo::Vector{Int} = Int[]
  nullhypo::Float64 = 0.0
  solveInProgress::Int = 0
  inflation::Float64 = 0.0
end
```
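To make the proposed split concrete, here is a minimal sketch (hypothetical struct names, not actual DFG/IIF types) of how the `GenericFunctionNodeData` fields could separate into a solver-constant half and a solver-mutable half:

```julia
# Sketch only: constants are set at factor creation, never changed by the solver,
# and would serialize with the factor in DFG.
Base.@kwdef struct FactorConstantsSketch{F}
    fnc::F
    multihypo::Vector{Float64} = Float64[]
    certainhypo::Vector{Int} = Int[]
    nullhypo::Float64 = 0.0
    inflation::Float64 = 0.0
end

# Working memory: owned by the solver, rebuilt on demand, never serialized.
Base.@kwdef mutable struct FactorStateSketch
    eliminated::Bool = false
    potentialused::Bool = false
    edgeIDs::Vector{Int} = Int[]
    solveInProgress::Int = 0
end
```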
Overlap: `fnc` (almost), `nullhypo`, and `inflation`.
It looks like the main differences can be summarized as:
| Field | FactorDFG | FactorCompute |
|---|---|---|
| observationFnc | Serialized JSON object | `T <: ObservationFunction` |
| computeMemory | Not serialized | `C <: FactorOperationalMemory`\* |

\* I think `FactorOperationalMemory` should be lazily initialized.
Also rename to `_State` as part of this refactor. #1121
Does Observation live next to or under State?
FactorCompute
- state
- observation
- computeMemory

or

FactorCompute
- state
  - observation
- computeMemory
Note: some state and observation fields may be copied to computeMemory as one CCW object.
I went through the code and the only thing that might not be ideal is that we may need something like `computeMem::Base.RefValue{FactorOperationalMemory}`.
Advantages:
- unblocks #1077
- unblocks IIFTypes and RoMETypes subprojects
- We can do lazy inflation of OperationalMemory (CCW)
- Simplify serialization: pack/unpack only depends on the factor itself and does not need the full factor graph
- Simplify `PackedFunctionNodeData{T}`, `GenericFunctionNodeData{T} where T <: Union{<:AbstractPackedFactor, <:AbstractFactor, <:FactorOperationalMemory}`
The disadvantage is that we either keep dynamic dispatch or rebuild the entire FactorCompute when the ccw is needed.
- There will still be dynamic dispatch on ccw, since `FactorOperationalMemory` is an abstract type.
- We can maybe just rebuild the entire factor once the type is known. In that case we can have `Nothing` if it is not yet inflated. It would then be similar to the current state, with the type still not known to the compiler.
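A hedged sketch of the lazy-inflation option (hypothetical names; `buildCCW` stands in for whatever `reconstFactorData`/`_createCCW` becomes):

```julia
abstract type FactorOperationalMemorySketch end

mutable struct LazyComputeSketch{F}
    observation::F
    # Nothing until first use; abstract afterwards, hence the dynamic-dispatch
    # caveat noted above.
    computeMemory::Union{Nothing, FactorOperationalMemorySketch}
end

function getComputeMemory!(buildCCW::Function, f::LazyComputeSketch)
    if f.computeMemory === nothing
        f.computeMemory = buildCCW(f.observation)  # inflate on first use only
    end
    return f.computeMemory
end
```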
> Does Observation live next to or under State?

Next to. The user only creates the observation; the computer creates the state.
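In struct form, the chosen "next to" layout is roughly (hypothetical names):

```julia
struct FactorComputeSketch{O, S, C}
    observation::O    # created by the user
    state::S          # created and maintained by the computer/solver
    computeMemory::C  # lazily inflated operational memory (CCW)
end
```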
DUPLICATION AT: #1119
- [ ] add new field to `FactorDFG.observFnc` (side by side with old `.data` / `.fnctype`)
From: #1118
I think we can probably close this and only discuss in issues #1105 and #992 here to avoid duplication.
As a general philosophy, the following is bad and we should not have this anywhere:

`JSON.write({"label": JSON.write("namething")})` ❌
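To make the objection concrete, a quick demonstration (using JSON3 here) of what double serialization produces:

```julia
using JSON3

inner = JSON3.write(Dict("label" => "namething"))   # a JSON document as a String
outer = JSON3.write(Dict("data" => inner))          # that String serialized again
# outer now holds the inner document as an escaped string, not a nested object:
#   {"data":"{\"label\":\"namething\"}"}
# Consumers must parse twice, and the escaping compounds with every extra level.
```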
Please elaborate why it is so bad ~~in metadata and~~ `FactorDFG.observation`~~data~~? I don't see a way around it. The options there are currently a "base64 encoded JSON string" or a "JSON string".
Agree -- unavoidable for any user-defined factor. See below for a list of strange serializations that have already happened in the past. We must not restrict user factor definitions to adhere to JSON types as nested structures. The lazy user would just store a factor as e.g. `factor.observation = {"payload": __b64str__, "content": "application/octet-stream"}`. Basically that is what I do all the time, and have seen it around in many other places. So we are just conforming to this as a norm.
Existing and Strange user factor serializations (the fun stuff)
List of more complicated factor types that need to remain unaffected (tested downstream), i.e. types not covered by Distributions.jl:
- [ ] IMU: https://github.com/JuliaRobotics/RoME.jl/blob/ca98e9c18f8837d92863cb798eb7f1ec5adcbbfc/src/factors/Inertial/IMUDeltaFactor.jl#L540-L568
- [ ] LIDAR: https://github.com/NavAbility/NavAbilitySDK.jl/blob/4bddd83fd1d66cf88077f9a69871df3b4530621d/test/unit/testFactors.jl#L61-L68
- Note: serialization for that ScatterAlign factor uses `preambleCache` and works for either parched or unparched serialization modes. https://github.com/JuliaRobotics/Caesar.jl/blob/develop/test/testStashing_SAP.jl
- [ ] APRILTAGS: https://github.com/JuliaRobotics/Caesar.jl/blob/develop/test/testPose2AprilTag4Corner.jl
- [ ] SASBearing2D: https://github.com/JuliaRobotics/Caesar.jl/blob/9df2df54d4a1bc271afafd5730d35c95af4c67f5/src/beamforming/SASBearing2D.jl#L219-L313
- [x] FLUX.jl NNFactors: https://github.com/JuliaRobotics/IncrementalInference.jl/blob/dab3ca94bfb937629aa3e2751d3c5bf2f84d75fe/test/testFluxModelsDistribution.jl#L66-L68
- [ ] DEFactors (no serialization yet): https://github.com/JuliaRobotics/IncrementalInference.jl/blob/develop/test/testDERelative.jl
> I went through the code and the only thing that might not be ideal is that we may need something like `computeMem::Base.RefValue{FactorOperationalMemory}`

Sure, just do it and we can revise again later. The advantages are basically all consolidation points. Very happy to see this more consolidated; if we have to do it through Ref then so be it. Adding decision label to help unblock.
> \*I think `FactorOperationalMemory` should be lazily initialized

Sure, I think that should be fine, at least in the medium term. I think one day this will likely show up as a performance issue, but we are still a ways off at this time. Will improve and refactor for performance later.
Once upon a time we needed to expand the old unpack-during-deserialization functions. It became clear that a `FactorOperationalMemory` would be needed. So I worked under the assumption it only happens on creation of a factor or when a factor is deserialized. I remember being a bit worried about this or that corner case.
> In other words: should we move the non-`FactorOperationalMemory` fields to `DFGFactor`? Maybe `fnc`, `multihypo`, `nullhypo`, `inflation`? That way we split `solverData <: FactorOperationalMemory` from the constants.

All of these are addressed. Hi @Affie, you should be unblocked on this issue now?
On observation serialization
- Must have a user-provided `pack` and `unpack`
- The packed observation must be `JSON3.write`-able
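A hedged sketch of both requirements together (hypothetical factor and packed types; StructTypes is needed for `JSON3.read` into a custom struct):

```julia
using JSON3, StructTypes, Distributions

# Hypothetical observation whose field is not directly JSON-serializable.
struct MyPriorSketch
    Z::Normal{Float64}
end

# Packed twin: a unique concrete type (not a bare String/Dict) that JSON3 can write.
struct PackedMyPriorSketch
    mu::Float64
    sigma::Float64
end
StructTypes.StructType(::Type{PackedMyPriorSketch}) = StructTypes.Struct()

pack(f::MyPriorSketch) = PackedMyPriorSketch(mean(f.Z), std(f.Z))
unpack(p::PackedMyPriorSketch) = MyPriorSketch(Normal(p.mu, p.sigma))

s = JSON3.write(pack(MyPriorSketch(Normal(0.0, 1.0))))  # {"mu":0.0,"sigma":1.0}
f = unpack(JSON3.read(s, PackedMyPriorSketch))           # round trip
```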
Update on https://github.com/JuliaRobotics/DistributedFactorGraphs.jl/issues/992#issuecomment-2787285338
Here’s a summary of your observation serialization requirements:

- Default serialization: observations use `JSON.json` and `JSON.parse` for round-tripping, with default `pack(x) = x` and `unpack(x) = x`.
- Custom serialization: if round-tripping fails, users must provide custom `pack` and `unpack` methods, or extend `StructUtils.lower` and `StructUtils.lift`.
  - The packed observation must have a unique, concrete type (not a generic `String` or `Dict`) to ensure correct dispatch for `unpack`, avoiding type piracy.
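And a hedged sketch of the `StructUtils.lower`/`StructUtils.lift` route (assuming the `lower(x)` and `lift(::Type{T}, x)` hook signatures from StructUtils.jl; check its docs for the exact overloads):

```julia
using StructUtils, Dates

struct StampSketch           # hypothetical type with a non-JSON-native field
    t::DateTime
end

# lower: map the value to a JSON-representable form on the way out
StructUtils.lower(s::StampSketch) = string(s.t)
# lift: rebuild the concrete type from the raw parsed value on the way in
StructUtils.lift(::Type{StampSketch}, x::String) = StampSketch(DateTime(x))
```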