Mark Bajo

4 comments by Mark Bajo

Currently training model for demo. Model is not producing meaningful actions. Still debugging.

Added intermediate reward shaping to the Overcooked environment to encourage cooperative cooking behavior and provide more frequent learning signals.

### Changes

- **Onion to pot**: +0.1 reward when an agent...
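
For illustration, the shaping idea described above might look roughly like the sketch below. This is not the code from the PR: the function `shaped_reward`, the event name `"onion_to_pot"`, and the `events` list are hypothetical, and only the +0.1 onion-to-pot bonus is taken from the comment (the rest of the change list is truncated in the source, so no other bonuses are shown).

```python
# Hypothetical sketch of intermediate reward shaping; not PufferLib's actual API.
# Only the +0.1 "onion to pot" bonus comes from the comment above.
SHAPING_BONUSES = {
    "onion_to_pot": 0.1,  # assumed event name for "agent puts an onion in a pot"
}


def shaped_reward(sparse_reward, events):
    """Return the sparse task reward plus a bonus for each shaping event this step.

    `events` is an assumed list of event-name strings emitted by the environment
    during one step. Unknown events contribute nothing.
    """
    bonus = sum(SHAPING_BONUSES.get(event, 0.0) for event in events)
    return sparse_reward + bonus


# Example: no dish delivered this step, but one onion was placed in a pot.
step_reward = shaped_reward(0.0, ["onion_to_pot"])
```

Keeping the bonuses in one table makes it easy to scale them down or switch them off later, which is a common way to check whether the agent is optimizing the shaped signal instead of the actual cooking task.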

Hmm... tried training with 1 agent, net can't fully learn how to cook.

@Hadrien-Cr Hello! I have written a concise README describing the rewards and observations. https://github.com/mmbajo/PufferLib/tree/roze-overcooked-dev/pufferlib/ocean/overcooked