guyazran
### 🚀 Feature
Enable the user to save discounted return instead of accumulated reward in policy evaluation.
### Motivation
Currently, EvalCallback and policy evaluation provide the user with accumulated rewards...
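The difference between the two quantities named in this request can be sketched as follows. This is a minimal illustration only: the function names `accumulated_reward` and `discounted_return` and the default `gamma` are assumptions made for the example and are not part of any library's API.

```python
from typing import Sequence


def accumulated_reward(rewards: Sequence[float]) -> float:
    """Plain sum of the rewards collected over one episode."""
    return float(sum(rewards))


def discounted_return(rewards: Sequence[float], gamma: float = 0.99) -> float:
    """Sum of gamma**t * r_t over one episode (hypothetical helper).

    Computed backwards so that each step is ret = r_t + gamma * ret.
    """
    ret = 0.0
    for reward in reversed(rewards):
        ret = reward + gamma * ret
    return ret


if __name__ == "__main__":
    episode = [1.0, 1.0, 1.0]
    print(accumulated_reward(episode))
    print(discounted_return(episode, gamma=0.5))
```

With `gamma = 1.0` the two values coincide; for `gamma < 1.0` later rewards contribute less to the discounted return.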
## The problem
I require a way to build a highly cluttered MuJoCo model whose assets are loaded dynamically into memory in Python. This follows the discussion in issue #1054....
Addresses #418
PyMJCF does not support nested include tags the same way as the MuJoCo compiler. No matter how deep the include tag is, MuJoCo expects all included models and their assets...
In the file `llava/model/llava_arch.py`, the class `LlavaMetaForCausalLM` defines a function `prepare_inputs_labels_for_multimodal` that is called by the `generate` and `forward` functions. In lines 411 and 412, the input embeds...