Change Reward function to give points to number of unique pokemon
Hi,
I came across the tutorial on YouTube and got the program running. How can I change the reward function to give points for the number of unique Pokémon caught? The idea is to train the agent to catch them all.
I also have access to cloud resources of several GPUs. I want to try to run run_baseline_parallel_fast.py to simulate hundreds of games at once to see how fast I can get it to capture 10 unique pokemon.
Hi, you can find the reward function in the get_game_state_reward function in red_gym_env.py.
The memory addresses you need to access are 0xD2F7 - 0xD309 (19 bytes), with each bit being a flag for whether you own that specific Pokémon. You can find more memory addresses here: https://datacrystal.romhacking.net/wiki/Pok%C3%A9mon_Red_and_Blue/RAM_map#Pokedex
You could write something along the lines of:
def get_unique_pokemon(self):
    pokemon_caught = 0
    # 0xD2F7 through 0xD309 inclusive is 19 bytes
    for i in range(19):
        pokemon_caught += self.bit_count(self.read_m(0xD2F7 + i))
    return pokemon_caught
Then add to the state_score dictionary something like:
'unique_pokemon': self.reward_scale * self.get_unique_pokemon(),
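To check the bit counting outside the emulator, here is a minimal standalone sketch. It assumes a hypothetical read_byte callable standing in for the environment's read_m, and uses a fake RAM dict in place of real game memory; only the address range comes from the RAM map above.

```python
OWNED_START = 0xD2F7  # first byte of the "owned" Pokedex flags (from the RAM map)
OWNED_END = 0xD309    # last byte, inclusive (19 bytes in total)


def count_owned(read_byte):
    """Count set bits across the owned-flag bytes.

    read_byte is a hypothetical callable mapping an address to an int in 0..255,
    standing in for the environment's read_m.
    """
    return sum(
        bin(read_byte(addr)).count("1")
        for addr in range(OWNED_START, OWNED_END + 1)
    )


# Fake RAM for illustration: three flags set in the first byte, one in the last.
fake_ram = {OWNED_START: 0b00000111, OWNED_END: 0b00000001}
owned = count_owned(lambda addr: fake_ram.get(addr, 0))
print(owned)  # 4
```

Note that the range end is OWNED_END + 1, so the last byte at 0xD309 is included in the count.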
I would not recommend removing the other rewards, as the agent probably won't leave Pallet Town unless it has some exploration reward, but you are welcome to experiment.
good luck!
If it helps, I went for a similar reward, counting seen Pokémon.
In the game state rewards
'seen_pokemons': self.reward_scale * self.seen_pokemons
RedGym
def update_seen_pokemons(self):
    initial_seen_pokemon = 3  # it seems like it has already encountered a bird
    self.seen_pokemons = sum(self.reader.read_seen_pokemons()) - initial_seen_pokemon
Reader (I extracted everything related to memory into a different class)
def read_seen_pokemons(self):
    return [self.bit_count(self.read_m(a)) for a in SEEN_POKEMONS_ADDRESSES]
Memory addresses
SEEN_POKEMONS_ADDRESSES = [0xD30A, 0xD30B, 0xD30C, 0xD30D, 0xD30E, 0xD30F, 0xD310, 0xD311, 0xD312, 0xD313, 0xD314, 0xD315, 0xD316, 0xD317, 0xD318, 0xD319, 0xD31A, 0xD31B, 0xD31C]
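For what it's worth, the same address list can be built from a range instead of typing out every byte. The sketch below is only an illustration: read_byte is a hypothetical stand-in for read_m, the fake RAM values are made up, and the baseline parameter plays the role of initial_seen_pokemon above.

```python
SEEN_START = 0xD30A  # first byte of the "seen" Pokedex flags
SEEN_END = 0xD31C    # last byte, inclusive

# Same 19 addresses as the hand-written list, generated from the range.
SEEN_ADDRESSES = list(range(SEEN_START, SEEN_END + 1))


def count_seen(read_byte, baseline=0):
    """Count seen-flag bits, minus a baseline of flags already set at start.

    read_byte is a hypothetical callable (address -> int in 0..255).
    """
    total = sum(bin(read_byte(addr)).count("1") for addr in SEEN_ADDRESSES)
    return total - baseline


# Fake RAM for illustration: five flags set in total.
fake_ram = {0xD30A: 0b00001111, 0xD30B: 0b00000001}
seen = count_seen(lambda addr: fake_ram.get(addr, 0), baseline=3)
print(seen)  # 2
```

One possible variation, if the hard-coded 3 turns out to depend on the save state, would be to record the count once at reset and pass that in as the baseline.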