Use latent embedding or not?
Thanks for the paper and code. We see that the skill latent is projected through an MLP in the actor but not in the critic. Is this by design?
Yes, that's intentional. Adding the MLP for the latent seemed to work a bit better for the actor, but didn't make much of a difference for the critic, so the critic was kept simple.
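For anyone reading along, here is a minimal, framework-free sketch of the asymmetry described above: the actor embeds the latent before concatenating it with the observation, while the critic concatenates the raw latent. Everything in it is an assumption made for illustration (the function names, the dimensions, the single ReLU layer, the random initialization); it is not the repository's actual network code.

```python
import random

def linear(x, weights, bias):
    # Dense layer: one output per weight row.
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def relu(x):
    return [max(0.0, v) for v in x]

def init_layer(rng, in_dim, out_dim):
    # Hypothetical small uniform init, chosen only for this sketch.
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)]
               for _ in range(out_dim)]
    bias = [0.0] * out_dim
    return weights, bias

def actor_input(obs, latent, latent_layer):
    # Actor: project the latent through a small MLP, then concatenate.
    weights, bias = latent_layer
    return obs + relu(linear(latent, weights, bias))

def critic_input(obs, latent):
    # Critic: concatenate the raw latent, with no projection.
    return obs + latent

# Illustrative sizes only (assumed, not taken from the paper or code).
OBS_DIM, LATENT_DIM, LATENT_EMBED_DIM = 6, 4, 8

rng = random.Random(0)
obs = [rng.gauss(0.0, 1.0) for _ in range(OBS_DIM)]
latent = [rng.gauss(0.0, 1.0) for _ in range(LATENT_DIM)]
latent_layer = init_layer(rng, LATENT_DIM, LATENT_EMBED_DIM)

a_in = actor_input(obs, latent, latent_layer)
c_in = critic_input(obs, latent)
print(len(a_in), len(c_in))  # 14 10
```

In this sketch the only difference between the two paths is the extra projection on the actor side, so the downstream actor and critic trunks would see inputs of different widths.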
Thanks so much for the reply! I'm also doing research on making the skill embedding more diverse, so that mode collapse is mitigated and more complex downstream tasks become feasible. The strategy is "Explore then Exploit": following the APT/CIC papers, first maximize the mutual information, then apply the observation embedding and latent embedding to ASE. I noticed that these papers come from the same research group. I'd appreciate it if you could advise on any related effort. Have you attempted a similar approach? If so, what kind of results did you get?