BayesianTools

Automatic thinning in plots

Open florianhartig opened this issue 7 years ago • 4 comments

Would it be better to have all plots automatically thin the chain to n = 5000 or so? A higher n usually doesn't improve the plots, and it makes everything much slower.

florianhartig, Jul 16 '18 08:07

This could be implemented by checking for numSamples in ... (which should always be passed to getSample) and replacing it with 5000, if we want to keep this consistent among all plotting functions.
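A minimal sketch of that idea in plain R. The helper name setDefaultNumSamples is hypothetical and not part of BayesianTools, and this version only fills in numSamples when the caller has not set it, which is one possible reading of the suggestion above.

```r
# Hypothetical helper (not part of BayesianTools): take the arguments
# collected from `...` and add a default numSamples if none was given.
setDefaultNumSamples <- function(args, default = 5000) {
  # [[ ]] matches the name exactly, so similarly named arguments are ignored
  if (is.null(args[["numSamples"]])) {
    args[["numSamples"]] <- default
  }
  args
}

# Assumed usage inside a plotting function (commented out, illustration only):
# somePlot <- function(sampler, ...) {
#   args <- setDefaultNumSamples(list(...))
#   samples <- do.call(getSample, c(list(sampler), args))
# }

str(setDefaultNumSamples(list(thin = 10)))         # numSamples is added
str(setDefaultNumSamples(list(numSamples = 200)))  # caller's value is kept
```

Leaving an explicit numSamples untouched would let users still request more samples when they really want them.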

TankredO, Jul 17 '18 11:07

Dear Florian, I enjoyed using this package, but this issue does highlight some problems I encountered when fitting heavier models.

I fit models with more than 5e6 samples per chain. In these cases the plots are pretty useless and ridiculously slow. Automatic thinning would have been great (as far as I am concerned, 5e6 is not an unusually large number).

Also, do you know if there is a way to throw out the first XX% of the samples within a fit sequence? As far as I can tell, the default is to always save all samples, so the out object gets very bloated. I have been forced to revert to my own MCMC code, as some models just fill the memory to capacity. I didn't find anything on this in the documentation, but maybe I missed something.

Thanks for your efforts in building this package,

M.

MarcoDVisser, Mar 19 '20 12:03

Hi Marco,

Yeah, we should fix the automatic thinning, but you can do everything you want by hand. Note that in nearly all BT functions, the ... argument is forwarded to getSample, so you can thin and select samples via this. Consider this example.

library(BayesianTools)

# Generate a test likelihood function.
ll <- generateTestDensityMultiNormal(sigma = "no correlation")

# Creating a BayesianSetup object from the likelihood
# is the recommended way of using the runMCMC() function.
bayesianSetup <- createBayesianSetup(likelihood = ll, lower = rep(-10, 3), upper = rep(10, 3))

# Finally we can run the sampler and have a look.
settings <- list(iterations = 100000, adapt = FALSE)
out <- runMCMC(bayesianSetup = bayesianSetup, sampler = "Metropolis", settings = settings)

# out is of class bayesianOutput. There are various standard functions
# implemented for this output.

# See ?getSample for options to select samples.
getSample(out, start = 1000, thin = 1000)

# All options of getSample can be used in all plot and summary functions.
plot(out, thin = 1000)
correlationPlot(out, thin = 1000)
marginalPlot(out, thin = 1000)
summary(out)
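To make the effect of start and thin concrete, here is a conceptual sketch on a plain matrix. It is not the BayesianTools implementation; thinChain is a hypothetical helper, and getSample may select rows slightly differently.

```r
# Conceptual sketch only; not how getSample is implemented.
set.seed(1)
chain <- matrix(rnorm(100000 * 3), ncol = 3)  # stand-in for 100000 stored samples

# Hypothetical helper: skip everything before `start`,
# then keep every `thin`-th row from there on.
thinChain <- function(x, start = 1, thin = 1) {
  x[seq(from = start, to = nrow(x), by = thin), , drop = FALSE]
}

thinned <- thinChain(chain, start = 1000, thin = 1000)
dim(thinned)  # 100 rows, 3 columns in this sketch
```

With only about a hundred rows left, plotting and summary functions have far less data to process, which is where the speed-up comes from.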

About not saving the samples in the first place: this is also implemented. See the help of the sampler you want to use, e.g. ?DEzs

There are a few minor bugs when you use this (e.g. the thinning is not correctly forwarded to coda functions), but the samples are OK, I think.
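A hedged sketch of thinning during the fit rather than afterwards. The setting names burnin and thin are assumed here from the DEzs help page, so please check ?DEzs for your installed version before relying on them; the values are arbitrary.

```r
# Assumption: the DEzs sampler accepts `burnin` and `thin` in its settings
# list (see ?DEzs). The numbers are illustrative only.
settings <- list(iterations = 100000, burnin = 10000, thin = 10)

# Only run the sampler if the package is available.
if (requireNamespace("BayesianTools", quietly = TRUE)) {
  library(BayesianTools)
  ll <- generateTestDensityMultiNormal(sigma = "no correlation")
  bayesianSetup <- createBayesianSetup(likelihood = ll,
                                       lower = rep(-10, 3), upper = rep(10, 3))
  out <- runMCMC(bayesianSetup = bayesianSetup, sampler = "DEzs",
                 settings = settings)
  # Inspect how much memory the stored output takes.
  print(object.size(out), units = "MB")
}
```

Comparing the reported size against a run with the default settings should show whether the sampler-side options reduce what is kept in memory.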

florianhartig, Mar 19 '20 14:03

Great, I missed the option to apply the thinning during the fit. Will try.

MarcoDVisser, Mar 19 '20 14:03