Suggestions for graph.info
My graph.info file appears as follows:
(AnalysisFactor0*AnalysisFactor1*AnalysisFactor2*PriorFactor0*PriorFactor1*PriorFactor2*PriorFactor3*PriorFactor4*PriorFactor5*PriorFactor6)
AnalysisFactor0
gaussian
centre GaussianPrior, mean = 50.0, sigma = 30.0
normalization GaussianPrior, mean = 3.0, sigma = 5.0
sigma GaussianPrior, mean = 10.0, sigma = 10.0
AnalysisFactor1
gaussian
centre GaussianPrior, mean = 50.0, sigma = 30.0
normalization GaussianPrior, mean = 3.0, sigma = 5.0
sigma GaussianPrior, mean = 10.0, sigma = 10.0
AnalysisFactor2
gaussian
centre GaussianPrior, mean = 50.0, sigma = 30.0
normalization GaussianPrior, mean = 3.0, sigma = 5.0
sigma GaussianPrior, mean = 10.0, sigma = 10.0
Factor(PriorFactor0, x=GaussianPrior, mean = 50.0, sigma = 30.0)
Factor(PriorFactor1, x=GaussianPrior, mean = 3.0, sigma = 5.0)
Factor(PriorFactor2, x=GaussianPrior, mean = 10.0, sigma = 10.0)
Factor(PriorFactor3, x=GaussianPrior, mean = 3.0, sigma = 5.0)
Factor(PriorFactor4, x=GaussianPrior, mean = 10.0, sigma = 10.0)
Factor(PriorFactor5, x=GaussianPrior, mean = 3.0, sigma = 5.0)
Factor(PriorFactor6, x=GaussianPrior, mean = 10.0, sigma = 10.0)
Does this line (AnalysisFactor0*AnalysisFactor1*AnalysisFactor2*PriorFactor0*PriorFactor1*PriorFactor2*PriorFactor3*PriorFactor4*PriorFactor5*PriorFactor6) offer any useful information?
I think it does, as one might have a lot of analysis factors and prior factors which it summarizes. But perhaps we could write it as follows:
AnalysisFactors:
AnalysisFactor0
AnalysisFactor1
AnalysisFactor2
Factors:
centre Factor(PriorFactor0, x=GaussianPrior, mean = 50.0, sigma = 30.0)
normalization Factor(PriorFactor1, x=GaussianPrior, mean = 3.0, sigma = 5.0)
sigma Factor(PriorFactor2, x=GaussianPrior, mean = 10.0, sigma = 10.0)
normalization Factor(PriorFactor3, x=GaussianPrior, mean = 3.0, sigma = 5.0)
sigma Factor(PriorFactor4, x=GaussianPrior, mean = 10.0, sigma = 10.0)
normalization Factor(PriorFactor5, x=GaussianPrior, mean = 3.0, sigma = 5.0)
sigma Factor(PriorFactor6, x=GaussianPrior, mean = 10.0, sigma = 10.0)
Does the word Factor in front of these tell us anything? Could we reduce this too:
Factors:
centre PriorFactor0, x=GaussianPrior, mean = 50.0, sigma = 30.0
normalization PriorFactor1, x=GaussianPrior, mean = 3.0, sigma = 5.0
sigma PriorFactor2, x=GaussianPrior, mean = 10.0, sigma = 10.0
normalization PriorFactor3, x=GaussianPrior, mean = 3.0, sigma = 5.0
sigma PriorFactor4, x=GaussianPrior, mean = 10.0, sigma = 10.0
normalization PriorFactor5, x=GaussianPrior, mean = 3.0, sigma = 5.0
sigma PriorFactor6, x=GaussianPrior, mean = 10.0, sigma = 10.0
In fact, I really like the list of each individual AnalysisFactor, which covers the priors, so how about:
Factors:
centre **PriorFactor0 --> [AnalysisFactor0, AnalysisFactor1, AnalysisFactor2]**
normalization PriorFactor1 --> AnalysisFactor0
sigma PriorFactor2 --> AnalysisFactor0
normalization PriorFactor3 --> AnalysisFactor1
sigma PriorFactor4 --> AnalysisFactor1
normalization PriorFactor5 --> AnalysisFactor2
sigma PriorFactor6 --> AnalysisFactor2
AnalysisFactors:
AnalysisFactor0
gaussian
centre (PriorFactor0) GaussianPrior, mean = 50.0, sigma = 30.0
normalization (PriorFactor1) GaussianPrior, mean = 3.0, sigma = 5.0
sigma (PriorFactor2) GaussianPrior, mean = 10.0, sigma = 10.0
AnalysisFactor1
gaussian
centre (PriorFactor0) GaussianPrior, mean = 50.0, sigma = 30.0
normalization (PriorFactor3) GaussianPrior, mean = 3.0, sigma = 5.0
sigma (PriorFactor4) GaussianPrior, mean = 10.0, sigma = 10.0
AnalysisFactor2
gaussian
centre (PriorFactor0) GaussianPrior, mean = 50.0, sigma = 30.0
normalization (PriorFactor5) GaussianPrior, mean = 3.0, sigma = 5.0
sigma (PriorFactor6) GaussianPrior, mean = 10.0, sigma = 10.0
In bold I marked a potential pitfall, whereby for many AnalysisFactors the file will blow up. So maybe after a certain number we put something more generic like PriorFactor0 --> x13 AnalysisFactor.
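To make the idea concrete, here is a minimal sketch in plain Python of building the "Factors:" section from a reverse mapping, and collapsing the list once a prior is shared by more than a few AnalysisFactors. Everything here is hypothetical and for illustration only: the `graph` dictionary, `invert`, `format_factors` and the `max_listed` threshold are not taken from the library.

```python
from collections import defaultdict

# Hypothetical stand-in for the factor graph: for each AnalysisFactor,
# which PriorFactor sits on which parameter.
graph = {
    "AnalysisFactor0": {"centre": "PriorFactor0", "normalization": "PriorFactor1", "sigma": "PriorFactor2"},
    "AnalysisFactor1": {"centre": "PriorFactor0", "normalization": "PriorFactor3", "sigma": "PriorFactor4"},
    "AnalysisFactor2": {"centre": "PriorFactor0", "normalization": "PriorFactor5", "sigma": "PriorFactor6"},
}


def invert(graph):
    """Map (parameter, prior factor) to the analysis factors that use it."""
    users = defaultdict(list)
    for analysis_name, parameters in graph.items():
        for parameter, prior_name in parameters.items():
            users[(parameter, prior_name)].append(analysis_name)
    return users


def format_factors(graph, max_listed=3):
    """Build the 'Factors:' section, collapsing long lists to a count."""
    lines = ["Factors:"]
    for (parameter, prior_name), analysis_names in invert(graph).items():
        if len(analysis_names) == 1:
            target = analysis_names[0]
        elif len(analysis_names) <= max_listed:
            target = "[" + ", ".join(analysis_names) + "]"
        else:
            # Too many to list: fall back to the generic "xN" form.
            target = f"x{len(analysis_names)} AnalysisFactor"
        lines.append(f"{parameter} {prior_name} --> {target}")
    return "\n".join(lines)


factors_text = format_factors(graph)
print(factors_text)
```

With the default threshold the shared centre prior is listed in full; calling `format_factors(graph, max_listed=2)` switches that line to the generic `x3 AnalysisFactor` form.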
This file was made for a dataset with 3 pieces of data. If we scaled this up to 500+ individual datasets (each with ~3 model parameters), the file would blow up again. I guess we need to think about whether the declarative framework can exploit symmetries in the graph (if they are present) to reduce the information to something really concise, like:
Factors:
centre **PriorFactor0 -->x500 AnalysisFactor**
normalization x500 PriorFactor
sigma x500 PriorFactor
AnalysisFactors:
AnalysisFactor (x500)
gaussian
centre (PriorFactor0) GaussianPrior, mean = 50.0, sigma = 30.0
normalization (x500 PriorFactor) GaussianPrior, mean = 3.0, sigma = 5.0
sigma (x500 PriorFactor) GaussianPrior, mean = 10.0, sigma = 10.0
Obviously, the template above is crap and we need to think more carefully about this. It feels like the topology problem we had when we were discussing visualizing these things.
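One crude way to exploit the symmetry, sketched below in plain Python, is to group AnalysisFactors whose model, parameter names, prior descriptions and sharing pattern are identical, then print one entry per group. This is an assumption-laden illustration, not the library's approach: `make_factors`, `summarise` and the dictionary layout are hypothetical, and comparing text descriptions only catches exact repeats rather than solving the general topology problem.

```python
from collections import Counter, defaultdict


def make_factors(n):
    """Hypothetical input: n gaussian factors sharing one centre prior."""
    factors = {}
    for i in range(n):
        factors[f"AnalysisFactor{i}"] = {
            "model": "gaussian",
            "parameters": {
                "centre": ("PriorFactor0", "GaussianPrior, mean = 50.0, sigma = 30.0"),
                "normalization": (f"PriorFactor{2 * i + 1}", "GaussianPrior, mean = 3.0, sigma = 5.0"),
                "sigma": (f"PriorFactor{2 * i + 2}", "GaussianPrior, mean = 10.0, sigma = 10.0"),
            },
        }
    return factors


def summarise(factors):
    # Count how many analysis factors use each prior factor.
    usage = Counter(
        prior_name
        for factor in factors.values()
        for prior_name, _ in factor["parameters"].values()
    )
    # Factors with the same signature are treated as interchangeable.
    # Shared priors keep their name; unshared ones are anonymised to None.
    groups = defaultdict(list)
    for name, factor in factors.items():
        signature = (factor["model"],) + tuple(
            (parameter, prior_name if usage[prior_name] > 1 else None, description)
            for parameter, (prior_name, description) in factor["parameters"].items()
        )
        groups[signature].append(name)

    lines = ["AnalysisFactors:"]
    for signature, names in groups.items():
        model, *parameters = signature
        lines.append(f"AnalysisFactor (x{len(names)})")
        lines.append(model)
        for parameter, shared_name, description in parameters:
            label = shared_name if shared_name else f"x{len(names)} PriorFactor"
            lines.append(f"{parameter} ({label}) {description}")
    return "\n".join(lines)


summary = summarise(make_factors(500))
print(summary)
```

For 500 structurally identical factors this prints a six-line summary in the shape of the template above. Any factor that differs (a different prior, or a prior shared with only some of the others) simply lands in its own group.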
So (AnalysisFactor0*AnalysisFactor1*AnalysisFactor2*PriorFactor0*PriorFactor1*PriorFactor2*PriorFactor3*PriorFactor4*PriorFactor5*PriorFactor6)
and Factor(PriorFactor0, x=GaussianPrior, mean = 50.0, sigma = 30.0)
are both strings taken from the underlying factor graph classes. The first describes a graph, whilst the second describes an individual PriorFactor.
I'll take a look at implementing your suggestions
Would it be clearer to represent the priors like so: GaussianPrior(mean = 50.0, sigma = 30.0)?
It is potentially an abuse of notation, but you could do something like this for the AnalysisFactors:
AnalysisFactor0:
gaussian(
centre ~ GaussianPrior(mean = 50.0, sigma = 30.0)
normalization ~ GaussianPrior(mean = 3.0, sigma = 5.0)
sigma ~ GaussianPrior(mean = 10.0, sigma = 10.0)
)
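As a rough sketch of how that notation could be generated, assuming each prior can render itself as a call-like string: the `GaussianPrior` dataclass below is a hypothetical stand-in defined only for this example (it is not the library's class), and `format_analysis_factor` is likewise an invented helper.

```python
from dataclasses import dataclass


@dataclass
class GaussianPrior:
    # Hypothetical minimal prior, used only for this illustration.
    mean: float
    sigma: float

    def __str__(self):
        return f"GaussianPrior(mean = {self.mean}, sigma = {self.sigma})"


def format_analysis_factor(name, model, priors):
    """Render one analysis factor in the 'parameter ~ Prior(...)' style."""
    lines = [f"{name}:", f"{model}("]
    lines += [f"    {parameter} ~ {prior}" for parameter, prior in priors.items()]
    lines.append(")")
    return "\n".join(lines)


text = format_analysis_factor(
    "AnalysisFactor0",
    "gaussian",
    {
        "centre": GaussianPrior(50.0, 30.0),
        "normalization": GaussianPrior(3.0, 5.0),
        "sigma": GaussianPrior(10.0, 10.0),
    },
)
print(text)
```

Keeping the prior's string form on the prior itself means the same text can be reused wherever a prior is printed, whichever layout is chosen for the rest of the file.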
I presume that there's a tech debt reason behind not using something like TOML or YAML to represent this information?
We literally discussed using YAML the other day. One reason against it is formatting: for non-trivial models we've designed the layout quite carefully. Models can be quite deeply nested, similar to what you put above, but we keep all of the prior parameterisations aligned on the right.
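For anyone following along, here is a small sketch of what "nested on the left, priors aligned on the right" might look like. It does not reproduce the actual formatter: `flatten`, `format_aligned`, the nested-dict input and the `gap` parameter are all assumptions made for illustration.

```python
def flatten(model, indent=0, rows=None):
    """Walk a nested dict and collect (indented label, prior text) rows."""
    rows = [] if rows is None else rows
    for key, value in model.items():
        label = " " * indent + key
        if isinstance(value, dict):
            # A nested component: emit its name, then recurse one level deeper.
            rows.append((label, ""))
            flatten(value, indent + 4, rows)
        else:
            rows.append((label, value))
    return rows


def format_aligned(model, gap=4):
    """Pad every label to a common width so the prior text lines up."""
    rows = flatten(model)
    width = max(len(label) for label, _ in rows) + gap
    return "\n".join(f"{label:<{width}}{text}".rstrip() for label, text in rows)


# Hypothetical model description, using the priors from this thread.
model = {
    "gaussian": {
        "centre": "GaussianPrior, mean = 50.0, sigma = 30.0",
        "normalization": "GaussianPrior, mean = 3.0, sigma = 5.0",
        "sigma": "GaussianPrior, mean = 10.0, sigma = 10.0",
    }
}

aligned = format_aligned(model)
print(aligned)
```

This kind of column alignment is awkward to express in YAML or TOML, since those formats treat leading whitespace or key/value structure as significant rather than as layout.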