PokemonRedExperiments

Change Reward function to give points to number of unique pokemon

Open martin-cala1 opened this issue 1 year ago • 2 comments

Hi,

I came across the tutorial on YouTube and got the program running. How can I change the reward function to give points for the number of unique Pokémon caught? The idea is to try to train the agent to catch them all.

I also have access to cloud resources of several GPUs. I want to try to run run_baseline_parallel_fast.py to simulate hundreds of games at once to see how fast I can get it to capture 10 unique pokemon.

martin-cala1 avatar Feb 11 '24 03:02 martin-cala1

Hi, you can find the reward function in the get_game_state_reward function in red_gym_env.py

The memory addresses you need to read are 0xD2F7 - 0xD309 (19 bytes), with each bit indicating whether you own that specific Pokémon. You can find more memory addresses here: https://datacrystal.romhacking.net/wiki/Pok%C3%A9mon_Red_and_Blue/RAM_map#Pokedex
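As a rough sketch of how those bytes could be decoded into individual Pokédex numbers: the helper name `owned_dex_numbers` and the `flags` snapshot below are made up for illustration, and the bit ordering (bit 0 of the first byte is Pokédex entry 1) is an assumption that should be checked against the RAM map.

```python
OWNED_START = 0xD2F7
OWNED_END = 0xD309   # inclusive, so 19 bytes in total
NUM_POKEMON = 151    # 19 bytes hold 152 bits; the last bit is unused


def owned_dex_numbers(flag_bytes):
    """Return the Pokédex numbers whose flag bit is set.

    Assumes least-significant-bit-first ordering within each byte.
    """
    owned = []
    for byte_index, value in enumerate(flag_bytes):
        for bit in range(8):
            dex_number = byte_index * 8 + bit + 1
            if dex_number <= NUM_POKEMON and value & (1 << bit):
                owned.append(dex_number)
    return owned


# Hypothetical memory snapshot, not read from a real emulator.
flags = [0] * (OWNED_END - OWNED_START + 1)
flags[0] = 0b00001001  # entries 1 and 4
flags[3] = 0b00000001  # entry 25

print(len(flags))                # 19
print(owned_dex_numbers(flags))  # [1, 4, 25]
```

For the reward itself only the count matters, so a plain bit count per byte is enough; decoding the numbers is mostly useful for logging which Pokémon the agent actually caught.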

You could write something along the lines of

    def get_unique_pokemon(self):
        pokemon_caught = 0
        # 0xD2F7 through 0xD309 inclusive is 19 bytes
        for i in range(19):
            pokemon_caught += self.bit_count(self.read_m(0xD2F7 + i))
        return pokemon_caught

Then add something like this to the state_score dictionary:

    'unique_pokemon': self.reward_scale * self.get_unique_pokemon(),
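To try the counting logic without an emulator, here is a minimal self-contained sketch. `FakeEnv` and its dictionary-backed memory are hypothetical stand-ins; only the names `read_m`, `reward_scale` and `get_game_state_reward` mirror what is mentioned above, and the real environment will look different. It loops over all 19 bytes so that the last address, 0xD309, is included.

```python
def bit_count(value):
    """Number of set bits in a non-negative integer."""
    return bin(value).count("1")


class FakeEnv:
    """Hypothetical stand-in for the gym env, backed by a plain dict."""

    def __init__(self, memory, reward_scale=1):
        self.memory = memory
        self.reward_scale = reward_scale

    def read_m(self, addr):
        # Unset addresses read as 0 in this fake memory.
        return self.memory.get(addr, 0)

    def get_unique_pokemon(self):
        # 19 bytes: 0xD2F7 through 0xD309 inclusive.
        return sum(bit_count(self.read_m(0xD2F7 + i)) for i in range(19))

    def get_game_state_reward(self):
        # Only the new entry is shown; the other rewards are omitted here.
        return {"unique_pokemon": self.reward_scale * self.get_unique_pokemon()}


env = FakeEnv({0xD2F7: 0b00000101, 0xD309: 0b01000000}, reward_scale=4)
print(env.get_unique_pokemon())     # 3
print(env.get_game_state_reward())  # {'unique_pokemon': 12}
```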

I would not recommend removing the other rewards, as the agent probably won't leave Pallet Town unless it has some exploration reward, but you are welcome to experiment.

good luck!

Xe-Xo avatar Feb 11 '24 06:02 Xe-Xo

In case it helps, I went for a similar reward, counting seen Pokémon.

In the game state rewards

'seen_pokemons': self.reward_scale * self.seen_pokemons

RedGym

    def update_seen_pokemons(self):
        initial_seen_pokemon = 3  # it seems like it has already encountered a bird
        self.seen_pokemons = sum(self.reader.read_seen_pokemons()) - initial_seen_pokemon

Reader (I extracted everything related to memory into a separate class)

    def read_seen_pokemons(self):
        return [self.bit_count(self.read_m(a)) for a in SEEN_POKEMONS_ADDRESSES]

Memory addresses

    SEEN_POKEMONS_ADDRESSES = [0xD30A, 0xD30B, 0xD30C, 0xD30D, 0xD30E, 0xD30F, 0xD310, 0xD311, 0xD312, 0xD313, 0xD314, 0xD315, 0xD316, 0xD317, 0xD318, 0xD319, 0xD31A, 0xD31B, 0xD31C]
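The same list can also be built from its two endpoints, which makes the 19-byte span easier to verify at a glance. This is only a sketch: `count_seen` and `fake_memory` are illustrative names, and `read_m` is passed in as a plain callable rather than taken from the real reader class.

```python
SEEN_START = 0xD30A
SEEN_END = 0xD31C  # inclusive

# Equivalent to writing out all 19 addresses by hand.
SEEN_POKEMONS_ADDRESSES = list(range(SEEN_START, SEEN_END + 1))


def count_seen(read_m):
    """Sum the set bits over every 'seen' flag byte."""
    return sum(bin(read_m(a)).count("1") for a in SEEN_POKEMONS_ADDRESSES)


# Hypothetical memory snapshot with three seen flags set.
fake_memory = {0xD30A: 0b00000111}

print(len(SEEN_POKEMONS_ADDRESSES))                 # 19
print(count_seen(lambda a: fake_memory.get(a, 0)))  # 3
```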

mdreano avatar Feb 28 '24 13:02 mdreano