K.R. Zentner
Hi Steve, Sorry for the slow response. What you want to do should be possible using a Tuple space of Discrete spaces, as you've mentioned. However, the existing policies were...
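To make the "Tuple space of Discrete spaces" idea concrete, here is a minimal stdlib-only sketch of what such a composite space represents. The class names mirror the usual convention but are illustrative, not garage's (or gym's) actual implementations:

```python
import random

class Discrete:
    """A space of integer actions {0, 1, ..., n - 1} (illustrative stand-in)."""
    def __init__(self, n):
        self.n = n

    def sample(self):
        return random.randrange(self.n)

    def contains(self, x):
        return isinstance(x, int) and 0 <= x < self.n

class Tuple:
    """A composite space whose actions are tuples, one element per sub-space."""
    def __init__(self, spaces):
        self.spaces = tuple(spaces)

    def sample(self):
        return tuple(space.sample() for space in self.spaces)

    def contains(self, x):
        return (len(x) == len(self.spaces)
                and all(s.contains(a) for s, a in zip(self.spaces, x)))

# An action here is e.g. (2, 0): one choice from Discrete(3), one from Discrete(5).
action_space = Tuple([Discrete(3), Discrete(5)])
action = action_space.sample()
assert action_space.contains(action)
```

The point is that the policy must emit one sub-action per component, which is why the existing (single-Discrete) policies need modification.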
Wow, this PR looks good. The main feedback I have is that we don't actually yet have any algorithms implemented in PyTorch that can train an RNN. Unfortunately, our VPG...
@irisliucy @avnishn Is this still happening?
Thanks for reporting this. I suppose the overall effect is that the alpha learning rate is essentially multiplied by the number of tasks being trained. This should be about as...
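To illustrate why the effect amounts to a scaled learning rate: if each task contributes the same alpha-loss gradient and the gradients are summed before a single optimizer step, the result is identical to one step with the learning rate multiplied by the number of tasks. A toy sketch with plain SGD (not garage's actual optimizer code):

```python
def sgd_step(param, grad, lr):
    """One vanilla SGD update."""
    return param - lr * grad

num_tasks = 4
lr = 0.1
grad = 2.0  # suppose each task contributes this same alpha-loss gradient

# Summing the per-task gradients and stepping once...
alpha_summed = sgd_step(1.0, num_tasks * grad, lr)

# ...matches a single-task step with the learning rate scaled by num_tasks.
alpha_scaled_lr = sgd_step(1.0, grad, lr * num_tasks)

assert alpha_summed == alpha_scaled_lr  # both equal 1.0 - (0.1 * 4) * 2.0
```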
I don't believe this is possible with the current API, but it is not difficult to implement. My recommendation is to copy [gaussian_lstm_model.py](https://github.com/rlworkgroup/garage/blob/6461a071f0155712add1b41316003e90c9c77899/src/garage/tf/models/gaussian_lstm_model.py#L16) into your project's source, and modify [line...
To add to what Avnish said, if you don't use `RaySampler` (and instead use `LocalSampler`), then garage doesn't interact with `ray` at all, and everything _should_ just work out of...
Yeah, this breakage occurred because dm-tree's build script is broken, and the platform you're on doesn't have a pre-built dm-tree package uploaded to PyPI. Here are a few workarounds you...
Hi Benedikt. Unfortunately there isn't currently any way of doing that with the `MetaEvaluator`. A pull request to implement it would be appreciated. There is currently a `num_grad_updates` parameter in MAML,...
This change does not yet pass tests, but is 90% complete.
The core motivation here is to provide a way for recurrent and non-recurrent policies to share the same API at optimization time. However, I definitely agree that making this change...
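One common way to get recurrent and non-recurrent policies behind one API is to have every policy accept and return a state, with non-recurrent policies passing it through untouched. This is a sketch of that pattern, not garage's actual interface:

```python
class Policy:
    """Shared interface: every policy maps (observation, state) -> (action, state)."""
    def initial_state(self):
        return None

    def step(self, obs, state):
        raise NotImplementedError

class FeedForwardPolicy(Policy):
    """Non-recurrent: ignores the state and returns it unchanged."""
    def step(self, obs, state):
        action = obs * 2  # stand-in for a real forward pass
        return action, state

class RunningMeanPolicy(Policy):
    """'Recurrent' toy: its state is a running (sum, count) of observations."""
    def initial_state(self):
        return (0.0, 0)

    def step(self, obs, state):
        total, count = state
        total, count = total + obs, count + 1
        action = total / count  # act on the running mean so far
        return action, (total, count)

def rollout(policy, observations):
    """The optimizer-side loop is identical for both policy types."""
    state = policy.initial_state()
    actions = []
    for obs in observations:
        action, state = policy.step(obs, state)
        actions.append(action)
    return actions
```

Because `rollout` never inspects the state, optimization-time code can stay agnostic to whether the policy is recurrent.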