
Setting up robust optimization experiment with MVaR gives error

Open apaleyes opened this issue 2 years ago • 6 comments

Hi! I am trying to set up a robust optimization experiment with Ax. There is no tutorial on how to do it, so I pieced something together from unit tests. However, if I use MVaR as the risk measure, it all errors out.

The complete code is below (without imports), it just follows these two files: https://github.com/facebook/Ax/blob/main/ax/modelbridge/tests/test_robust_modelbridge.py https://github.com/facebook/Ax/blob/main/ax/utils/testing/core_stubs.py#L263

The key point in the code is the risk measure definition:

risk_measure = RiskMeasure(
    risk_measure="MultiOutputExpectation",
    options={"n_w": 16},
)

This works. But if we replace it with MVaR:

risk_measure = RiskMeasure(
    risk_measure="MVaR",
    options={"n_w": 16, "alpha": 0.8},
)

We get the following error after roughly a 14-second wait:

RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn

Versions, if necessary: botorch 0.9.4, gpytorch 1.11, ax-platform 0.3.5

Any idea why this might be happening? Thanks!


Complete code:

x1_dist = ParameterDistribution(
    parameters=["x1"], distribution_class="norm", distribution_parameters={}
)

search_space = RobustSearchSpace(
    parameters=[
        RangeParameter(
            name="x1", parameter_type=ParameterType.FLOAT, lower=-5, upper=10
        ),
        RangeParameter(
            name="x2", parameter_type=ParameterType.FLOAT, lower=0, upper=15
        ),
    ],
    parameter_distributions=[x1_dist],
    num_samples=16,
)

risk_measure = RiskMeasure(
    risk_measure="MultiOutputExpectation",
    options={"n_w": 16},
)
metrics = [
    BraninMetric(
        name=f"branin_{i}", param_names=["x1", "x2"], lower_is_better=True
    )
    for i in range(2)
]
optimization_config = MultiObjectiveOptimizationConfig(
    objective=MultiObjective(
        [
            Objective(
                metric=m,
                minimize=True,
            )
            for m in metrics
        ]
    ),
    objective_thresholds=[
        ObjectiveThreshold(metric=m, bound=10.0, relative=False)
        for m in metrics
    ],
    risk_measure=risk_measure,
)

exp = Experiment(
    name="branin_experiment",
    search_space=search_space,
    optimization_config=optimization_config,
    runner=SyntheticRunner(),
)

sobol = get_sobol(search_space=exp.search_space)
for _ in range(2):
    exp.new_trial(generator_run=sobol.gen(1)).run().mark_completed()


for _ in range(5):
    modelbridge = Models.BOTORCH_MODULAR(
        experiment=exp,
        data=exp.fetch_data(),
        surrogate=Surrogate(botorch_model_class=SingleTaskGP),
        botorch_acqf_class=qNoisyExpectedHypervolumeImprovement,
    )
    trial = (
        exp.new_trial(generator_run=modelbridge.gen(1)).run().mark_completed()
    )

apaleyes avatar Dec 13 '23 13:12 apaleyes

MVaR is not differentiable, so gradient issues not terribly surprising.
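As a minimal illustration of the failure mode (a toy sketch, not the actual BoTorch MVaR code): risk statistics built from hard comparisons or set-selection operations produce tensors with no `grad_fn`, and calling `backward()` on such a tensor raises exactly the `RuntimeError` reported above.

```python
import torch

# Toy stand-in for a non-differentiable risk statistic: counting how many
# samples fall below a threshold uses a boolean comparison, which detaches
# the result from the autograd graph.
y = torch.randn(16, requires_grad=True)
risk = (y < 0.0).float().mean()  # comparison ops have no grad_fn

try:
    risk.backward()
except RuntimeError as e:
    print(e)  # "element 0 of tensors does not require grad and does not have a grad_fn"
```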

To get unblocked on this, a recommended alternative is MARS (https://proceedings.mlr.press/v162/daulton22a.html), which is much faster than directly optimizing MVaR with qNEHVI and is differentiable. You can use MARS by instead setting

risk_measure = RiskMeasure(
    risk_measure="MARS",
    options={"n_w": 16, "alpha": 0.8},
)

and

modelbridge = Models.BOTORCH_MODULAR(
    experiment=exp,
    data=exp.fetch_data(),
    surrogate=Surrogate(botorch_model_class=SingleTaskGP),
    botorch_acqf_class=qLogNoisyExpectedImprovement,
)

sdaulton avatar Dec 15 '23 18:12 sdaulton

Hi @apaleyes. The code you shared runs fine for me, on Ax 0.3.6. I don't think there were any changes to this part of the code recently, so I don't know why you'd be getting an error due to gradients.

Can you try again with the latest versions of Ax & BoTorch? If you get the error again, sharing the full stack trace could be helpful to identify where the error is coming from.

saitcakmak avatar Dec 15 '23 19:12 saitcakmak

Oh, I copy-pasted the code and didn't realize it was using the expectation rather than MVaR. I can reproduce the issue after updating that.

saitcakmak avatar Dec 15 '23 20:12 saitcakmak

Ok, the issue is that the MVaR implementation in BoTorch is not differentiable. The code carries a warning about this, but it is easy to miss when all you see is the error: https://github.com/pytorch/botorch/blob/main/botorch/acquisition/multi_objective/multi_output_risk_measures.py#L498-L505

We do have a version of it with approximate gradients, but it looks like that change was never upstreamed to BoTorch.

saitcakmak avatar Dec 15 '23 20:12 saitcakmak

Thanks, @sdaulton , that unblocked me indeed! Can I ask why your code uses qLogNoisyExpectedImprovement and not its hypervolume counterpart?

@saitcakmak glad it reproduced, thanks for responding with the fix so quickly

apaleyes avatar Dec 15 '23 23:12 apaleyes

Glad that unblocked you! MARS optimizes MVaR by optimizing the VaR of random Chebyshev scalarizations. Since it scalarizes the problem, it uses a single-objective acquisition function.
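For intuition, the scalarization step can be sketched as follows (a toy illustration with made-up weights and reference point, not the actual MARS implementation):

```python
import numpy as np

def chebyshev_scalarization(Y, weights, ref_point):
    """Weighted Chebyshev scalarization for minimization:
    s(y) = max_i w_i * (y_i - z_i), collapsing m objectives into one scalar."""
    return np.max(weights * (Y - ref_point), axis=-1)

rng = np.random.default_rng(0)
Y = rng.normal(size=(5, 2))       # 5 outcome samples, 2 objectives
w = rng.dirichlet(np.ones(2))     # one random weight vector
scalars = chebyshev_scalarization(Y, w, ref_point=np.zeros(2))
print(scalars.shape)  # (5,)
```

Each random weight vector yields a different scalarized problem; optimizing the VaR of these scalars with a single-objective acquisition function (such as qLogNEI) is what lets MARS approximate the MVaR frontier without the non-differentiable set computation.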

sdaulton avatar Dec 16 '23 00:12 sdaulton

@saitcakmak, did the differentiable MVaR version resolve the NaN issue?

sdaulton avatar Jan 23 '24 22:01 sdaulton

Yep, the error is resolved with the differentiability support.

saitcakmak avatar Jan 23 '24 22:01 saitcakmak