Joan
Did you end up solving this? I have been banging my head against this the entire day....
Hi all, I am having trouble installing MuJoCo 2.1.1, specifically over this:
```
Successfully installed mujoco-py-2.1.2.14
Compiling /Users/joanvelja/miniconda3/envs/mujoco_env/lib/python3.9/site-packages/mujoco_py/cymj.pyx because it changed.
[1/1] Cythonizing /Users/joanvelja/miniconda3/envs/mujoco_env/lib/python3.9/site-packages/mujoco_py/cymj.pyx
ld: warning: duplicate -rpath '/Users/joanvelja/miniconda3/envs/mujoco_env/lib' ignored...
```
Reward models are often based on AutoModelForSequenceClassification (always, iirc). Though setting `task='classify'` is not how it would be intended to work, given that the output that matters is the...
It is still unclear to me how to get reward modelling scores this way. Classify is not set up when setting type == reward, and the pooling docs are quite...
The docs clearly claim that the encode method simply returns the hidden states, and fwiw redirecting to different parts of the docs without clear logic isn't helpful.
No worries, I was preparing an issue + suggested PR on how to handle the situation. Was able to do exactly what you suggested right now. Appreciate the patience!
@vwxyzjn Did you find a solution to this? I sent you a Twitter DM too, since I would love to do the same for a multi-agent RL pipeline I have...