Overcooked Environment

Open mmbajo opened this issue 4 months ago • 5 comments

This PR introduces an Overcooked cooking game environment for PufferLib. The sprites are from OvercookedAI.

mmbajo • Sep 13 '25 15:09

Currently training a model for the demo. The model is not producing meaningful actions yet. Still debugging.

[Screenshot 2025-09-14 at 0 30 36]

mmbajo • Sep 13 '25 15:09

Added intermediate reward shaping to the Overcooked environment to encourage cooperative cooking behavior and provide more frequent learning signals.

Changes

  • Onion to pot: +0.1 reward when an agent adds an onion to a pot
  • Correct recipe start: +0.1 reward when starting to cook a pot with exactly 3 onions (the target recipe)
  • Soup plating: +0.1 reward when transferring a cooked soup from pot to plate
  • Dish serving:
    • Correct recipe (3 onions): +5.0 to serving agent, +20.0 to all agents
    • Incorrect recipe: +0.1 to all agents (small consolation reward)
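The reward rules listed above could be sketched roughly as follows. This is only an illustration, not the code from the PR: the function name `shaped_rewards`, the event strings, and the constant names are all hypothetical. It also assumes that on a correct serve the serving agent receives the +5.0 bonus on top of the shared +20.0, which the list does not state explicitly.

```python
# Reward values taken from the list above; all names here are hypothetical.
R_ONION_TO_POT = 0.1
R_START_CORRECT_RECIPE = 0.1
R_PLATE_SOUP = 0.1
R_SERVE_CORRECT_SERVER = 5.0   # bonus for the agent that serves
R_SERVE_CORRECT_TEAM = 20.0    # shared by every agent
R_SERVE_INCORRECT_TEAM = 0.1   # small consolation reward
TARGET_ONIONS = 3              # the target recipe is exactly 3 onions


def shaped_rewards(event, agent, num_agents, onions_in_pot=0):
    """Return a per-agent reward list for one event triggered by `agent`."""
    rewards = [0.0] * num_agents
    if event == "onion_to_pot":
        rewards[agent] += R_ONION_TO_POT
    elif event == "start_cooking":
        # Only starting the target recipe is rewarded.
        if onions_in_pot == TARGET_ONIONS:
            rewards[agent] += R_START_CORRECT_RECIPE
    elif event == "plate_soup":
        rewards[agent] += R_PLATE_SOUP
    elif event == "serve":
        if onions_in_pot == TARGET_ONIONS:
            rewards = [r + R_SERVE_CORRECT_TEAM for r in rewards]
            # Assumption: the server's bonus stacks with the team reward.
            rewards[agent] += R_SERVE_CORRECT_SERVER
        else:
            rewards = [r + R_SERVE_INCORRECT_TEAM for r in rewards]
    else:
        raise ValueError(f"unknown event: {event}")
    return rewards


if __name__ == "__main__":
    print(shaped_rewards("serve", agent=0, num_agents=2, onions_in_pot=3))
```

Under these assumptions the serving reward is two orders of magnitude larger than the intermediate rewards, so the shaping terms mainly act as hints toward the full cook-and-serve sequence.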

I am now getting decent performance trajectories when training, but I do need some help. I am still not sure if this will work out. 🙇

[Screenshot 2025-09-22 at 21 36 27]

Explained variance is in the positive region! I assume this is a good sign?

[Screenshot 2025-09-22 at 21 37 30]
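For reference, explained variance is commonly defined as 1 - Var(returns - values) / Var(returns). A value of 1 means the value function predicts the returns perfectly, 0 means it does no better than predicting the mean return, and a negative value means it does worse. The sketch below shows that common definition in plain Python; the function name is hypothetical, and it is an assumption that the metric in the plot is computed this way.

```python
from statistics import pvariance


def explained_variance(values, returns):
    """1 - Var(returns - values) / Var(returns), using population variance.

    `values` are the critic's predictions and `returns` the empirical
    returns for the same steps. Returns NaN when the returns are constant.
    """
    var_returns = pvariance(returns)
    if var_returns == 0:
        return float("nan")
    residuals = [r - v for r, v in zip(returns, values)]
    return 1.0 - pvariance(residuals) / var_returns


if __name__ == "__main__":
    returns = [1.0, 2.0, 3.0]
    print(explained_variance([1.0, 2.0, 3.0], returns))  # perfect critic
    print(explained_variance([2.0, 2.0, 2.0], returns))  # constant critic
```

A positive value is generally a good sign for the critic, but it only measures how well the value function fits the returns; it does not by itself show that the policy has learned to cook and serve.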

Any advice on what to try or change?

mmbajo • Sep 22 '25 12:09

Hmm... I tried training with 1 agent, and the net can't fully learn how to cook.

[Screenshot 2025-09-26 at 14 27 48]

mmbajo • Sep 26 '25 05:09

Do you mind describing your reward structure and rules? Maybe I can help.

Hadrien-Cr • Sep 27 '25 09:09

@Hadrien-Cr Hello! I have written a concise README describing the rewards and observations. https://github.com/mmbajo/PufferLib/tree/roze-overcooked-dev/pufferlib/ocean/overcooked

mmbajo • Oct 09 '25 04:10