[Question] Changing parameter constraints before generating a new trial
I'm wondering if it's possible to adjust parameter constraints after they've already been set. As an example, take a parameter whose range could theoretically be anywhere in [a, b]. In practice, however, consistently hitting a specific value may be difficult. For practicality, one might want the suggested trial to use a specific, predetermined value of that parameter instead of just any value from the range. The challenge is that experiments are time-intensive and I won't know the value of this parameter ahead of time, which makes it tough to set the search space from the outset. Is there a way to navigate this?
Edit: As extra context, the documentation of the human-in-the-loop tutorial says:
Without a SearchSpace, our models are unable to generate new candidates. By default, the models will read the search space off of the experiment, when they are told to generate candidates. SearchSpaces can also be specified by the user at this time. Sometimes, the first round of an experiment is too restrictive--perhaps the experimenter was too cautious when defining their initial ranges for exploration! In this case, it can be useful to generate candidates from new, expanded search spaces, beyond that specified in the experiment.
So the search space can be expanded, but can it also be made smaller without deleting previous runs that would violate the new constraint?
Hi, thinking about what you are describing here, it seems like ChoiceParameter may actually be a better fit for your use case than RangeParameter: https://ax.dev/api/_modules/ax/core/parameter.html#ChoiceParameter. With a ChoiceParameter you define a set of acceptable values, e.g. [5, 10, 15], and the selection then happens from that set. What do you think?
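For instance, something along these lines (the parameter name and values below are just placeholders; in the Service API the equivalent would be a parameter dict with "type": "choice"):

from ax.core.parameter import ChoiceParameter, ParameterType

# A discrete set of acceptable values; candidates are only generated from this set.
size_param = ChoiceParameter(
    name="particle_size",
    parameter_type=ParameterType.FLOAT,
    values=[5.0, 10.0, 15.0],
    is_ordered=True,  # the values have a natural ordering
)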
Thought about it. The problem is that in this particular case, the parameter can be anywhere between two values [a, b]. It's just that the variable is measured, not controlled. In my case it's the dimensions of a nanoparticle: the synthesis procedure doesn't control them very well, but they are always within a certain range. I opted to work with ranges (similar to choices), where each option is a category [small, medium, big], as that is easier to control than the exact value.
This sounds a bit like the robust optimization problem that @saitcakmak worked on in the past. I wonder if he'd have suggestions here!
@Jgmedina95, one thing I'm wondering is whether what you are currently formulating as a parameter is actually a metric value? I would suggest that you use the Ax Service API, tutorial: https://ax.dev/tutorials/gpei_hartmann_service.html (much easier to use for most Ax use cases), and post a code snippet showing us your code, along with some data you've obtained in the experiment so far. It's a bit hard to understand the issue without this.
Sure! Actually I'm using a slightly different idea. I can simplify it as follows. Right now this is how it's working:
import gpytorch
from botorch.models.gpytorch import GPyTorchModel
from gpytorch.likelihoods import GaussianLikelihood

class ExactGPModel(gpytorch.models.ExactGP, GPyTorchModel):
    _num_outputs = 1

    def __init__(self, train_X, train_Y, **kwargs):
        super().__init__(train_X, train_Y.squeeze(-1), GaussianLikelihood(), **kwargs)
        self.mean_module = gpytorch.means.ConstantMean()
        self.covar_module = gpytorch.kernels.ScaleKernel(gpytorch.kernels.RBFKernel())
        self.to(train_X)

    def forward(self, x):
        mean_x = self.mean_module(x)
        covar_x = self.covar_module(x)
        return gpytorch.distributions.MultivariateNormal(mean_x, covar_x)
from ax.modelbridge.generation_strategy import GenerationStrategy, GenerationStep
from ax.modelbridge.registry import Models
from ax.models.torch.botorch_modular.surrogate import Surrogate
from botorch.acquisition.monte_carlo import qExpectedImprovement

gs = GenerationStrategy(
    steps=[
        GenerationStep(
            model=Models.BOTORCH_MODULAR,
            num_trials=-1,  # No limitation on how many trials should be produced from this step
            # For `BOTORCH_MODULAR`, we pass in kwargs to specify what surrogate or acquisition function to use.
            model_kwargs={
                "surrogate": Surrogate(ExactGPModel),
                "botorch_acqf_class": qExpectedImprovement,
            },
        ),
    ]
)
The parameters would be these:
ax_parameters = [
    {
        "name": "Parameter1",
        "type": "range",
        "bounds": [12.0, 51.0],
        "value_type": "float",
    },
    {
        "name": "Parameter2",
        "type": "range",
        "bounds": [6, 26],
        "value_type": "float",
    },
    {
        "name": "Parameter3",
        "type": "range",
        "bounds": [0.12, 0.42],
    },
    {
        "name": "Parameter4",
        "type": "choice",
        "values": [0.25, 0.5],
    },
]  # I actually have 9 parameters, but for illustrative purposes I'm stopping here.
from ax.service.ax_client import AxClient, ObjectiveProperties

ax_client = AxClient(generation_strategy=gs)
ax_client.create_experiment(parameters=ax_parameters, objectives={"f": ObjectiveProperties(minimize=False)})
Here f is a value that can be obtained experimentally. Because I already have data (gathered over several months), I add it as initial trials.
# add data to ax_client
for i in range(len(modified_features)):
    ax_client.attach_trial(parameters={ax_parameters[j]["name"]: modified_features.values[i][j] for j in range(9)})
    ax_client.complete_trial(trial_index=i, raw_data={"f": train_final_label.values[i]})
# and finally I get a new trial
parameters, trial_index = ax_client.get_next_trial()
My problem is that Parameter1 is measured during the experiment, because it is heavily related to the metric I want to optimize. Therefore, if my trial suggests (for example) Parameter1: 30, I can't really control the exact value. That is why I wanted to see whether, knowing Parameter1 in advance, I could find the rest of the parameters.
If you're thinking "just constrain it from the beginning when defining the parameters": the problem is that the data I'm adding covers a bigger range, and therefore some points would not be considered in the dataset.
Let me know if I need to explain more :)
I guess the naive approach that I originally tried will be helpful too:
Before asking for a new trial, let's say I know Parameter1 is 19. I add the constraint to the experiment like this:
from ax.core.parameter_constraint import (
    ComparisonOp,
    OrderConstraint,
    ParameterConstraint,
    SumConstraint,
)

parameter_constraints = [
    ParameterConstraint(constraint_dict={"Dimension1": 1}, bound=20.0)
]
ax_client.experiment.search_space.set_parameter_constraints(parameter_constraints)
When I query for the next trial:
parameters, trial_index = ax_client.get_next_trial()
This message appears:
[INFO 10-30 15:31:31] ax.modelbridge.base: Leaving out out-of-design observations for arms: 15_0, 49_0, 43_0, 47_0, 19_0, 33_0, 25_0, 30_0, 37_0, 29_0, 46_0, 38_0, 42_0, 26_0, 35_0, 41_0, 23_0, 18_0, 17_0, 22_0, 44_0, 40_0, 28_0, 32_0, 39_0, 14_0, 48_0, 36_0, 34_0, 45_0, 31_0, 16_0, 21_0, 27_0, 24_0, 13_0, 20_0
Which is something I don't want.
I think @Balandat as our Modeling & Optimization oncall is best suited to help here; cc @saitcakmak also who might have thoughts.
So if I read this correctly, "parameter1" in this setting isn't really a tunable parameter but instead an observed feature? That is, its value can help explain the behavior of / variation in f, but we cannot control its value as part of the experiment?
If that's true then I would consider this what we'd call a "contextual feature". There are a couple of possible scenarios here:
- We know the value of this feature prior to needing to suggest a candidate parameterization. In this case we can generate such parameterization conditional on the feature value.
- We don't know the value of the feature prior to needing to suggest a candidate parameterization (i.e. we choose a parameterization and then the feature value is revealed to us). In this case what one may typically do is optimize with respect to the expected outcome over some distribution of that feature.
Am I understanding this setting correctly? If so, then this is a relatively advanced setting and we don't have great out-of-the-box support for this right now (but we're working on it). If you're in setting 2), a workaround for now may be to (i) treat the contextual feature as a parameter while defining a sufficiently large search space that covers its expected range, and (ii) at each step of the optimization loop, use a "fixed feature" to set the value of this "parameter" to the observed value (e.g. via https://github.com/facebook/Ax/blob/main/ax/modelbridge/base.py#L748). The downside of this is that I don't believe this "feature fixing" is currently exposed in the Service API of AxClient (though it shouldn't be too hard to do that).
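As a very rough sketch of what (ii) could look like at the lower (developer API) level, assuming a fitted model bridge named `model_bridge` and an observed value of 19.0 for the contextual "parameter" (names and values are placeholders):

from ax.core.observation import ObservationFeatures

# Pin Parameter1 to its observed value; the other parameters are optimized as usual.
generator_run = model_bridge.gen(
    n=1,
    fixed_features=ObservationFeatures(parameters={"Parameter1": 19.0}),
)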
Hi @Balandat, thank you for your valuable insights!
Upon reflection, my situation aligns more closely with your first point. In my context, the time required for experiments (labeling trials) significantly exceeds that of the actual optimization loop, so the second approach you've mentioned seems quite appealing too.
One idea I'm contemplating, which is only feasible due to these extended labeling periods, involves making predictions with the already trained model. I would fix Parameter1 to its known value and vary the remaining search space parameters. I recognize that this method resembles a greedy search instead of Expected Improvement, but given the constraints, it might still be a practical temporary solution.
I am not sure I fully understand - Would the idea be to predict the outcomes across some kind of grid of parameter values (of the other parameters, while parameter1 is fixed), and then do some greedy selection based on those predictions? I think the "predict on a dense grid" approach would be reasonable if (i) you want to avoid diving into the lower level components of Ax where you can actually fix the parameter for acquisition function optimization, and (ii) your search space is relatively low dimensional (maybe <=4-5 or so, otherwise you'd need too many samples to cover the space densely).
But even if you were to do this, I would recommend not picking candidates in a greedy fashion based on the posterior mean prediction; you can still compute the acquisition function (e.g. expected improvement) on the individual predictions and select the next point based on that.
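As a rough illustration of that suggestion (a minimal sketch, assuming a fitted model bridge `model_bridge`, a list `grid` of candidate parameter dicts with Parameter1 already fixed to its known value, and the best objective value observed so far `best_f`; all of these names are placeholders):

import numpy as np
from scipy.stats import norm
from ax.core.observation import ObservationFeatures

# Predict the metric "f" on the dense grid of candidate parameterizations.
obs_feats = [ObservationFeatures(parameters=p) for p in grid]
means, covs = model_bridge.predict(obs_feats)
mu = np.array(means["f"])                    # posterior means
sigma = np.sqrt(np.array(covs["f"]["f"]))    # posterior standard deviations

# Analytic expected improvement (maximization) rather than greedy posterior-mean selection.
z = (mu - best_f) / sigma
ei = sigma * (z * norm.cdf(z) + norm.pdf(z))
best_candidate = grid[int(np.argmax(ei))]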
I suggest you check out the lower-level library components as described in https://ax.dev/tutorials/gpei_hartmann_developer.html and then use the fixed_features argument in the gen call to condition on the value of your parameter1, as this would be the "proper" thing to do (as far as I understand your setup correctly).
Just wanted to thank you @Balandat for your help and guidance. I'm still learning the framework, so it was a little hard to get done. In any case, I'll share some lines of code from my final implementation :)
from ax import (
    ComparisonOp,
    ParameterType,
    RangeParameter,
    ChoiceParameter,
    FixedParameter,
    SearchSpace,
    Experiment,
    OutcomeConstraint,
    OrderConstraint,
    SumConstraint,
    OptimizationConfig,
    Objective,
    Metric,
)
#optimization_config = {"f":ObjectiveProperties(minimize=False)}
objective_metric = Metric(name="f", lower_is_better=None)
from ax import Runner

class MyRunner(Runner):
    def run(self, trial):
        trial_metadata = {"name": str(trial.index)}
        return trial_metadata
# Define the search space based on the ax_parameters
search_space = SearchSpace(
    parameters=[
        RangeParameter(
            name=param["name"],
            parameter_type=ParameterType.FLOAT,
            lower=float(param["bounds"][0]),
            upper=float(param["bounds"][1]),
        )
        if param["type"] == "range"
        else ChoiceParameter(
            name=param["name"],
            values=param["values"],
            parameter_type=ParameterType.FLOAT,
        )
        for param in ax_parameters
    ]
)
experiment = Experiment(
    name="test_f",
    search_space=search_space,
    optimization_config=OptimizationConfig(objective=Objective(objective_metric, minimize=False)),
    runner=MyRunner(),
)
experiment.warm_start_from_old_experiment(ax_client.generation_strategy.experiment)  # just reusing the data I already had initialized
from ax.core.observation import ObservationFeatures

model_bridge_with_GPEI = Models.BOTORCH_MODULAR(
    experiment=experiment,
    data=data,  # Data object holding the observations gathered so far
    surrogate=Surrogate(BaseGPMatern),  # Optional, will use default if unspecified
    botorch_acqf_class=qExpectedImprovement,  # Optional, will use default if unspecified
)
generator_run = model_bridge_with_GPEI.gen(
    n=1,
    fixed_features=ObservationFeatures({"Dimension1": 31.0, "Dimension2": 7.0}),
)
trial = experiment.new_trial(generator_run=generator_run)
The trial will only include parameterizations with those dimensions fixed! I still don't understand what the Runner is doing, though, haha.
The purpose of the Runner in general is to abstract away how exactly you'd evaluate a Trial and provide a common API for that, so that the same code can use different Runners to deploy to different evaluation setups. The counterpart of the Runner is the Metric, which is used to retrieve the results of the trial run.
It's not strictly necessary to use either, though; once you've generated a trial with a parameterization in your setup above, you can evaluate it however you'd like and then attach the data to the experiment via the attach_data method: https://github.com/facebook/Ax/blob/main/ax/core/experiment.py#L682-L687. Here, Data is essentially a wrapper around a pandas DataFrame with the following columns: arm_name, metric_name (in your case "f"), mean (the observed outcome), and sem (the standard error of the noise in your observed outcome, if any). See e.g. the BoothMetric returning such an object in this tutorial: https://ax.dev/tutorials/gpei_hartmann_developer.html
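For illustration, attaching a manually evaluated trial could look roughly like this (the observed value and the NaN sem are placeholders; this assumes the trial was deployed, e.g. via trial.run(), and the outcome measured in the lab):

import pandas as pd
from ax.core.data import Data

observed_f = 0.73  # placeholder for the measured value of "f"

# One row per (arm, metric) observation for the trial we just evaluated.
df = pd.DataFrame.from_records([
    {
        "arm_name": trial.arm.name,   # e.g. "0_0"
        "metric_name": "f",
        "mean": observed_f,
        "sem": float("nan"),          # unknown observation noise
        "trial_index": trial.index,
    }
])
experiment.attach_data(Data(df=df))
trial.mark_completed()  # so the observation is used in future model fits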
@Jgmedina95 closing out this issue since it's been a while, but please feel free to re-open or open a new issue if additional follow up is needed.