PyAutoFit

Suggestions for graph.info

Open Jammy2211 opened this issue 4 years ago • 5 comments

My graph.info file appears as follows:

(AnalysisFactor0*AnalysisFactor1*AnalysisFactor2*PriorFactor0*PriorFactor1*PriorFactor2*PriorFactor3*PriorFactor4*PriorFactor5*PriorFactor6)

AnalysisFactor0

gaussian
    centre                                                                                GaussianPrior, mean = 50.0, sigma = 30.0
    normalization                                                                         GaussianPrior, mean = 3.0, sigma = 5.0
    sigma                                                                                 GaussianPrior, mean = 10.0, sigma = 10.0

AnalysisFactor1

gaussian
    centre                                                                                GaussianPrior, mean = 50.0, sigma = 30.0
    normalization                                                                         GaussianPrior, mean = 3.0, sigma = 5.0
    sigma                                                                                 GaussianPrior, mean = 10.0, sigma = 10.0

AnalysisFactor2

gaussian
    centre                                                                                GaussianPrior, mean = 50.0, sigma = 30.0
    normalization                                                                         GaussianPrior, mean = 3.0, sigma = 5.0
    sigma                                                                                 GaussianPrior, mean = 10.0, sigma = 10.0

Factor(PriorFactor0, x=GaussianPrior, mean = 50.0, sigma = 30.0)

Factor(PriorFactor1, x=GaussianPrior, mean = 3.0, sigma = 5.0)

Factor(PriorFactor2, x=GaussianPrior, mean = 10.0, sigma = 10.0)

Factor(PriorFactor3, x=GaussianPrior, mean = 3.0, sigma = 5.0)

Factor(PriorFactor4, x=GaussianPrior, mean = 10.0, sigma = 10.0)

Factor(PriorFactor5, x=GaussianPrior, mean = 3.0, sigma = 5.0)

Factor(PriorFactor6, x=GaussianPrior, mean = 10.0, sigma = 10.0)

Does this line (AnalysisFactor0*AnalysisFactor1*AnalysisFactor2*PriorFactor0*PriorFactor1*PriorFactor2*PriorFactor3*PriorFactor4*PriorFactor5*PriorFactor6) offer any useful information?

I think it does, as one might have a lot of analysis factors and prior factors which it summarizes. But perhaps we could write it as follows:

AnalysisFactors:

   AnalysisFactor0
   AnalysisFactor1
   AnalysisFactor2

Factors:

  centre           Factor(PriorFactor0, x=GaussianPrior, mean = 50.0, sigma = 30.0)
  normalization    Factor(PriorFactor1, x=GaussianPrior, mean = 3.0, sigma = 5.0)
  sigma            Factor(PriorFactor2, x=GaussianPrior, mean = 10.0, sigma = 10.0)
  normalization    Factor(PriorFactor3, x=GaussianPrior, mean = 3.0, sigma = 5.0)
  sigma            Factor(PriorFactor4, x=GaussianPrior, mean = 10.0, sigma = 10.0)
  normalization    Factor(PriorFactor5, x=GaussianPrior, mean = 3.0, sigma = 5.0)
  sigma            Factor(PriorFactor6, x=GaussianPrior, mean = 10.0, sigma = 10.0)

Does the word Factor in front of these tell us anything? Could we reduce this too:

Factors:

  centre           PriorFactor0, x=GaussianPrior, mean = 50.0, sigma = 30.0
  normalization    PriorFactor1, x=GaussianPrior, mean = 3.0, sigma = 5.0
  sigma            PriorFactor2, x=GaussianPrior, mean = 10.0, sigma = 10.0
  normalization    PriorFactor3, x=GaussianPrior, mean = 3.0, sigma = 5.0
  sigma            PriorFactor4, x=GaussianPrior, mean = 10.0, sigma = 10.0
  normalization    PriorFactor5, x=GaussianPrior, mean = 3.0, sigma = 5.0
  sigma            PriorFactor6, x=GaussianPrior, mean = 10.0, sigma = 10.0

In fact, I really like the list of each individual AnalysisFactor, which covers the priors, so how about:

Factors:

  centre           **PriorFactor0 --> [AnalysisFactor0, AnalysisFactor1, AnalysisFactor2]**
  normalization    PriorFactor1 --> AnalysisFactor0
  sigma            PriorFactor2 --> AnalysisFactor0
  normalization    PriorFactor3 --> AnalysisFactor1
  sigma            PriorFactor4 --> AnalysisFactor1
  normalization    PriorFactor5 --> AnalysisFactor2
  sigma            PriorFactor6 --> AnalysisFactor2

AnalysisFactors:

AnalysisFactor0

gaussian
    centre (PriorFactor0)           GaussianPrior, mean = 50.0, sigma = 30.0
    normalization (PriorFactor1)    GaussianPrior, mean = 3.0, sigma = 5.0
    sigma (PriorFactor2)            GaussianPrior, mean = 10.0, sigma = 10.0

AnalysisFactor1

gaussian
    centre (PriorFactor0)           GaussianPrior, mean = 50.0, sigma = 30.0
    normalization (PriorFactor3)    GaussianPrior, mean = 3.0, sigma = 5.0
    sigma (PriorFactor4)            GaussianPrior, mean = 10.0, sigma = 10.0

AnalysisFactor2

gaussian
    centre (PriorFactor0)           GaussianPrior, mean = 50.0, sigma = 30.0
    normalization (PriorFactor5)    GaussianPrior, mean = 3.0, sigma = 5.0
    sigma (PriorFactor6)            GaussianPrior, mean = 10.0, sigma = 10.0

In bold I marked a potential pitfall, whereby for many AnalysisFactors the file will blow up. So maybe after a certain number we put something more generic like PriorFactor0 --> x13 AnalysisFactor.
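
A minimal sketch of that threshold idea in plain Python, not tied to the real PyAutoFit classes. The `graph` dictionary, the `invert` and `format_targets` helpers and the `max_listed` parameter are all hypothetical names used only for illustration:

```python
from collections import defaultdict


def invert(analysis_factors):
    """Map each prior factor name to the analysis factors that use it."""
    users = defaultdict(list)
    for analysis_name, prior_names in analysis_factors.items():
        for prior_name in prior_names:
            users[prior_name].append(analysis_name)
    return dict(users)


def format_targets(names, max_listed=3):
    """List the targets explicitly while they are few, otherwise print a count."""
    if len(names) == 1:
        return names[0]
    if len(names) <= max_listed:
        return "[" + ", ".join(names) + "]"
    return f"x{len(names)} AnalysisFactor"


# Hypothetical description of the three-dataset graph from this issue.
graph = {
    "AnalysisFactor0": ["PriorFactor0", "PriorFactor1", "PriorFactor2"],
    "AnalysisFactor1": ["PriorFactor0", "PriorFactor3", "PriorFactor4"],
    "AnalysisFactor2": ["PriorFactor0", "PriorFactor5", "PriorFactor6"],
}

users = invert(graph)
for prior_name in sorted(users):
    print(f"{prior_name} --> {format_targets(users[prior_name])}")
```

With `max_listed=3` the shared centre prior is still listed in full; once more than three analysis factors share it, the line collapses to the count form.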

This file was made for a dataset with 3 pieces of data. If we scaled this up to 500+ individual datasets (each with ~3 model parameters), the file would blow up again. I guess we need to think about whether the declarative framework can exploit symmetries in the graph (if they are present) to reduce the information to something really concise, like:

Factors:

  centre           **PriorFactor0 --> x500 AnalysisFactor**
  normalization    x500 PriorFactor
  sigma            x500 PriorFactor

AnalysisFactors:

AnalysisFactor (x500)

gaussian
    centre (PriorFactor0)               GaussianPrior, mean = 50.0, sigma = 30.0
    normalization (x500 PriorFactor)    GaussianPrior, mean = 3.0, sigma = 5.0
    sigma (x500 PriorFactor)            GaussianPrior, mean = 10.0, sigma = 10.0

Obviously, the template above is crap and we need to think more carefully about this. Feels like the topology problem we had when we were discussing visualizing these things.
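
One possible way to detect this kind of symmetry, sketched in plain Python under the assumption that each analysis factor can be described as a mapping from parameter name to a (prior factor name, prior description) pair. The `factors` structure and `group_symmetric` are hypothetical and do not reflect how the declarative framework stores its graph:

```python
from collections import Counter, defaultdict


def group_symmetric(factors):
    """Group analysis factors whose parameters and priors have the same shape.

    A prior factor used by more than one analysis factor is treated as shared
    and kept by name; a prior used only once is compared by description alone.
    """
    usage = Counter(
        prior_name
        for params in factors.values()
        for prior_name, _ in params.values()
    )
    groups = defaultdict(list)
    for name, params in factors.items():
        signature = tuple(
            (param, prior_name if usage[prior_name] > 1 else None, description)
            for param, (prior_name, description) in sorted(params.items())
        )
        groups[signature].append(name)
    return dict(groups)


# Hypothetical 500-dataset graph: one shared centre, per-dataset normalization and sigma.
factors = {
    f"AnalysisFactor{i}": {
        "centre": ("PriorFactor0", "GaussianPrior, mean = 50.0, sigma = 30.0"),
        "normalization": (f"PriorFactor{2 * i + 1}", "GaussianPrior, mean = 3.0, sigma = 5.0"),
        "sigma": (f"PriorFactor{2 * i + 2}", "GaussianPrior, mean = 10.0, sigma = 10.0"),
    }
    for i in range(500)
}

groups = group_symmetric(factors)
for signature, members in groups.items():
    print(f"AnalysisFactor (x{len(members)})")
    for param, shared, description in signature:
        label = shared if shared else f"x{len(members)} PriorFactor"
        print(f"    {param} ({label})  {description}")
```

This only catches factors that are identical up to renaming of their private priors; graphs with richer topology would need a more careful notion of equivalence.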

Jammy2211 avatar Nov 18 '21 18:11 Jammy2211

So (AnalysisFactor0*AnalysisFactor1*AnalysisFactor2*PriorFactor0*PriorFactor1*PriorFactor2*PriorFactor3*PriorFactor4*PriorFactor5*PriorFactor6)

and Factor(PriorFactor0, x=GaussianPrior, mean = 50.0, sigma = 30.0)

Are both strings taken from the underlying factor graph classes. The first one describes a graph whilst the second describes an individual PriorFactor.

I'll take a look at implementing your suggestions

rhayes777 avatar Nov 24 '21 09:11 rhayes777

Would it be clearer to represent the priors like so: GaussianPrior(mean = 50.0, sigma = 30.0)?

matthewghgriffiths avatar Nov 25 '21 15:11 matthewghgriffiths

A potential abuse of notation but you could do something like this for the Analysis Factors

AnalysisFactor0:
    gaussian(
        centre ~ GaussianPrior(mean = 50.0, sigma = 30.0)
        normalization ~ GaussianPrior(mean = 3.0, sigma = 5.0)
        sigma ~ GaussianPrior(mean = 10.0, sigma = 10.0)
    )
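
A rough sketch of how text in this notation could be produced. `GaussianPriorSpec` is a hypothetical stand-in written for this example, not PyAutoFit's own `GaussianPrior` class, and `render` is likewise an illustrative helper:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GaussianPriorSpec:
    """Hypothetical stand-in holding only what the text output needs."""
    mean: float
    sigma: float

    def __str__(self):
        return f"GaussianPrior(mean = {self.mean}, sigma = {self.sigma})"


def render(factor_name, component, priors):
    """Render one analysis factor using the 'name ~ Prior(...)' notation."""
    lines = [f"{factor_name}:", f"    {component}("]
    lines += [f"        {name} ~ {prior}" for name, prior in priors.items()]
    lines.append("    )")
    return "\n".join(lines)


text = render(
    "AnalysisFactor0",
    "gaussian",
    {
        "centre": GaussianPriorSpec(50.0, 30.0),
        "normalization": GaussianPriorSpec(3.0, 5.0),
        "sigma": GaussianPriorSpec(10.0, 10.0),
    },
)
print(text)
```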

matthewghgriffiths avatar Nov 25 '21 15:11 matthewghgriffiths

I presume that there's a tech debt reason behind not using something like TOML or YAML to represent this information?

matthewghgriffiths avatar Nov 25 '21 15:11 matthewghgriffiths

We literally discussed using YAML the other day. One reason we haven't switched is formatting: for non-trivial models we've designed the layout quite carefully. Models can be quite deeply nested, similar to what you put above, yet we keep all of the prior parameterisations aligned on the right.
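
To illustrate the alignment constraint, here is a small sketch in plain Python that keeps every prior description in one column regardless of nesting depth. It is not the code that writes graph.info; the nested `model` dictionary and the `flatten` and `render_aligned` helpers are hypothetical:

```python
def flatten(model, depth=0):
    """Yield (depth, name, value) rows; value is None for a nested component."""
    for name, value in model.items():
        if isinstance(value, dict):
            yield depth, name, None
            yield from flatten(value, depth + 1)
        else:
            yield depth, name, value


def render_aligned(model, indent=4, gap=4):
    """Indent names by depth and start every prior description in the same column."""
    rows = list(flatten(model))
    column = max(depth * indent + len(name) for depth, name, _ in rows) + gap
    lines = []
    for depth, name, value in rows:
        left = " " * (depth * indent) + name
        lines.append(left if value is None else left.ljust(column) + value)
    return "\n".join(lines)


# Hypothetical nested model, two levels deep.
model = {
    "collection": {
        "gaussian": {
            "centre": "GaussianPrior, mean = 50.0, sigma = 30.0",
            "normalization": "GaussianPrior, mean = 3.0, sigma = 5.0",
            "sigma": "GaussianPrior, mean = 10.0, sigma = 10.0",
        },
    },
}

text = render_aligned(model)
print(text)
```

A generic YAML dump indents values along with their keys, so this kind of fixed right-hand column would need custom handling on top of it.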

rhayes777 avatar Nov 25 '21 15:11 rhayes777