What's the meaning of modalities in MUJOCO PUSH dataset?

Open mrbeann opened this issue 3 years ago • 2 comments

Hi, I recently tried the MUJOCO PUSH dataset, but I cannot figure out what each modality concretely means. The paper says:

The multimodal inputs are gray-scaled images (1 × 32 × 32) from an RGB camera, forces (and binary contact information) from a force/torque sensor, and the 3D position of the robot end-effector.

I found that the modalities in the dataset are "control", "image", "sensor", and "pos". How do these correspond to the modalities described in the paper (i.e. what does each one mean)?

mrbeann avatar May 26 '22 13:05 mrbeann

Someone else can confirm, but here's how I think of things:
-> The "image" modality refers to the gray-scale images.
-> The "pos" modality refers to the 3D position of the end-effector.
-> The "sensor" modality refers to the forces/binary contact information.
-> The "control" modality refers to what the controller is sending to the arm itself. (This is the one I'm least sure about.)
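
To make the tentative mapping above easier to check against the data, here is a minimal sketch that writes it down as a lookup table. The names `MODALITY_GUESS` and `describe_modalities` are hypothetical and not part of MultiBench, and the descriptions are only the guesses from this comment, not something confirmed by the dataset authors.

```python
# Tentative mapping from dataset keys to the paper's description.
# These are the guesses from this thread, NOT a confirmed specification.
MODALITY_GUESS = {
    "image": "gray-scaled camera image (1 x 32 x 32 according to the paper)",
    "pos": "3D position of the robot end-effector",
    "sensor": "forces and binary contact information from the force/torque sensor",
    "control": "command the controller sends to the arm (least certain guess)",
}


def describe_modalities(keys):
    """Return one 'key: description' string per key, flagging unknown keys."""
    return [f"{key}: {MODALITY_GUESS.get(key, 'unknown modality')}" for key in keys]


if __name__ == "__main__":
    # The four keys reported in the question above.
    for line in describe_modalities(["control", "image", "sensor", "pos"]):
        print(line)
```

Any key that is not in the table is reported as "unknown modality", so the same helper can be pointed at whatever keys the dataset actually exposes.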

arav-agarwal2 avatar May 27 '22 18:05 arav-agarwal2

I agree with your interpretation, but it does not seem to match the paper. See Figure 8, for example.

mrbeann avatar May 28 '22 02:05 mrbeann