Implementation of hyperopt model selection
This PR integrates the changes from #1962 into validphys (vp-hyperoptplot.py).
Updates
Model selection following the min_chi2_max_phi algorithm (refactored from best_chi2_worst_phi2 in #1962) can be done in two ways:
1. via a basic script that imports validphys hyperopt_dataframe
An example is shown below:
from pathlib import Path
from validphys.hyperoptplot import hyperopt_dataframe
hyperopt_name = 'hyperopt_10_rep_chi2_average'
# Get the current directory as a Path object
current_directory = Path.cwd()
path = current_directory / hyperopt_name
args = {
    'debug': True,
    'filter': [],
    'hyperopt_folder': path,
    'loss_target': 'min_chi2_max_phi',  # select Juan & Roy's algorithm
    'max_phi_n_models': 10,  # select the n lowest values of 1/phi
    'val_multiplier': 0.0,
    'threshold': 3.0,
    'combine': False,
    'autofilter': [],
    'include_failures': False,
}
all_data, best_setup, best_models = hyperopt_dataframe(args)
Here best_setup is the model with the lowest 1/phi among those with the lowest chi2, while best_models is a pandas DataFrame containing the max_phi_n_models selected models.
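To make the selection logic concrete, here is a minimal pandas sketch of the two-step idea (restrict to the lowest-chi2 trials, then rank by 1/phi). It is not the validphys implementation: the column names chi2 and inv_phi, the helper select_min_chi2_max_phi, and especially the chi2 window chi2_tol are assumptions made up for this illustration, and the toy numbers are not real hyperopt results.

```python
import pandas as pd


def select_min_chi2_max_phi(trials, n_models=10, chi2_tol=0.1):
    """Illustrative selection: keep trials close to the best chi2, then rank by 1/phi.

    The column names `chi2` and `inv_phi` and the window `chi2_tol` are
    hypothetical; the real cut used in validphys may differ.
    """
    # Step 1: keep the trials whose chi2 lies within chi2_tol of the minimum.
    best_chi2 = trials["chi2"].min()
    candidates = trials[trials["chi2"] <= best_chi2 + chi2_tol]
    # Step 2: among those, take the n_models rows with the lowest 1/phi.
    best_models = candidates.nsmallest(n_models, "inv_phi")
    # The single best setup is the first row of that ranking.
    best_setup = best_models.iloc[0]
    return best_setup, best_models


# Toy data, invented for this sketch.
trials = pd.DataFrame(
    {
        "trial": [0, 1, 2, 3, 4],
        "chi2": [1.20, 1.25, 1.90, 1.22, 1.28],
        "inv_phi": [0.80, 0.50, 0.10, 0.65, 0.70],
    }
)
best_setup, best_models = select_min_chi2_max_phi(trials, n_models=2)
print(best_setup["trial"], list(best_models["trial"]))
```

Note that trial 2 has the lowest 1/phi overall but is discarded in the first step because its chi2 is too far from the minimum, which is the point of applying the chi2 cut before ranking.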
2. via vp-hyperoptplot
by running:
vp-hyperoptplot -l min_chi2_max_phi --max_phi_n_models 10 hyperopt_10_rep_chi2_average -t 3
which opens an HTML report in your browser with some statistics and plots (I think this could be shared on the NNPDF server later on).
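The two routes appear to take the same inputs: the flags in the command above line up with keys of the args dictionary from the script. The sketch below illustrates that correspondence with a stand-in argparse parser. It is not the actual vp-hyperoptplot parser; the dest names are simply assumed to match the dictionary keys, and only flags that appear in the command above are declared.

```python
import argparse

# Hypothetical stand-in parser, for illustration only.
parser = argparse.ArgumentParser(prog="vp-hyperoptplot-sketch")
parser.add_argument("hyperopt_folder")
parser.add_argument("-l", dest="loss_target")
parser.add_argument("--max_phi_n_models", type=int)
parser.add_argument("-t", dest="threshold", type=float)

# Same arguments as in the command line shown above.
cli = "-l min_chi2_max_phi --max_phi_n_models 10 hyperopt_10_rep_chi2_average -t 3"
args = vars(parser.parse_args(cli.split()))
print(args)
```

Under these assumptions the parsed values match the corresponding entries of the dictionary passed to hyperopt_dataframe in the script, apart from hyperopt_folder being a bare name rather than a full path.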
TODO: add a table in the html file containing the detailed specs of best_models.
Can we review and merge this now, @Radonirinaunimi, or do we want to wait for the referee replies?
Now that the paper is accepted, should we merge this as is?
@Radonirinaunimi what do we do with this PR?
This is now thoroughly outdated, so let's close it (in any case it will be removed in favor of the new sampling).