
Exponential increase in loss

Open hariharan-jayakumar opened this issue 5 years ago • 0 comments

Hi @gsurma,

Thank you for the wonderful code and the Medium article. I tried implementing your code, but the loss of my model blows up after training for some time.

These are the hyper-parameters I used:

    # initialize environment
    env = MainGymWrapper.wrap(gym.make('SpaceInvaders-v0'))
    #env = gym.make('SpaceInvaders-v0')

    # define hyperparameters
    total_step_limit = 5000000
    wandb.config.episodes = 1000
    GAMMA = 0.99
    MEMORY_SIZE = 350000
    BATCH_SIZE = 32
    TRAINING_FREQUENCY = 4
    TARGET_NETWORK_UPDATE_FREQUENCY = 40000
    MODEL_PERSISTENCE_UPDATE_FREQUENCY = 10000
    REPLAY_START_SIZE = 50000
    action_size = env.action_space.n
    EXPLORATION_MAX = 1.0
    EXPLORATION_MIN = 0.1
    EXPLORATION_TEST = 0.02
    EXPLORATION_STEPS = 425000
    EXPLORATION_DECAY = (EXPLORATION_MAX-EXPLORATION_MIN)/EXPLORATION_STEPS
    wandb.config.batch_size = 32
    wandb.config.learning_rate = 0.00025
    input_shape = (4, 84, 84)
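
For reference, here is a minimal sketch of the exploration schedule these constants imply. It assumes epsilon is reduced linearly by EXPLORATION_DECAY once per step, starting from step 0, and is then held at EXPLORATION_MIN; the helper name `epsilon_at` is hypothetical, and the actual training loop may start the decay only after REPLAY_START_SIZE steps.

```python
# Constants copied from the hyperparameters listed above.
EXPLORATION_MAX = 1.0
EXPLORATION_MIN = 0.1
EXPLORATION_STEPS = 425000
EXPLORATION_DECAY = (EXPLORATION_MAX - EXPLORATION_MIN) / EXPLORATION_STEPS


def epsilon_at(step):
    """Hypothetical helper: linearly decayed epsilon after `step` decay steps.

    Assumption: one decay per step, clamped at EXPLORATION_MIN.
    """
    return max(EXPLORATION_MIN, EXPLORATION_MAX - EXPLORATION_DECAY * step)


if __name__ == "__main__":
    for step in (0, 212500, 425000, 1000000):
        print(step, round(epsilon_at(step), 4))
```

Under these assumptions, epsilon reaches its floor after 425000 decay steps, well before total_step_limit.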

The CNN is the same as yours. I also clipped the rewards with np.sign.
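
To make the reward handling concrete, here is a small sketch of clipping with np.sign. The function name `clip_reward` is hypothetical and only illustrates the idea; it is not taken from the repository.

```python
import numpy as np

GAMMA = 0.99  # discount factor from the hyperparameters above


def clip_reward(reward):
    """Hypothetical helper: map any raw reward to -1.0, 0.0, or 1.0."""
    return float(np.sign(reward))


# With rewards clipped to [-1, 1], the magnitude of any discounted return
# is bounded by the geometric series 1 / (1 - GAMMA).
MAX_ABS_RETURN = 1.0 / (1.0 - GAMMA)

if __name__ == "__main__":
    print([clip_reward(r) for r in (0, 5, 30, -10)])
    print(MAX_ABS_RETURN)
```

Since the clipped rewards lie in [-1, 1] and GAMMA is 0.99, the true discounted return cannot exceed roughly 100 in magnitude, so the reward scale alone should not account for a loss that keeps growing.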

Could you advise on what might be going wrong?

[Image: Capture]

hariharan-jayakumar, Mar 15 '20 13:03