Shehper
Hi! The effective batch size in nanoGPT is batch_size * gradient_accumulation_steps = 12 * 40 = 480, while the batch size mentioned in the GPT-2 paper is 512. May I ask why nanoGPT was trained with...
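The arithmetic in question can be checked directly. The numbers below are the ones quoted in the issue text, not read from nanoGPT's config files (which may differ between the shipped configs):

```python
# Effective batch size = micro-batch size x gradient-accumulation steps.
# Values taken from the issue text above, not from nanoGPT's actual config.
batch_size = 12
gradient_accumulation_steps = 40
effective_batch_size = batch_size * gradient_accumulation_steps
print(effective_batch_size)  # 480, vs. the 512 reported for GPT-2
```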
The code, as written, does not create equally distributed classes.
While running inference on my Mac with macOS version 13.1, I received the following error:
```
RuntimeError: MPS does not support cumsum_out_mps op with int64 input. Support has been added...
```
### Problem Description In examples/alphazero/train.py, we compute `value_mask` as follows: https://github.com/sotetsuk/pgx/blob/87278d2d6e677fd87248c457207b59cfa42e578d/examples/alphazero/train.py#L179 The purpose is to avoid updating the critic network on incomplete trajectories, as is evident from the masking of value...
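As a rough sketch of the masking idea described above (this is an illustration, not pgx's actual implementation; shapes, names, and values here are hypothetical), the critic loss can be zeroed out for any environment whose episode never reached a terminal state within the rollout:

```python
# Hypothetical rollout with num_steps=3, num_envs=2; an illustration of the
# value-masking idea, not pgx's actual train.py code.
terminated   = [[False, False], [False, False], [True, False]]
value_pred   = [[0.5, 0.5], [0.2, 0.2], [0.1, 0.1]]
value_target = [[1.0, 1.0], [1.0, 1.0], [1.0, 1.0]]

num_steps, num_envs = len(terminated), len(terminated[0])
# An env whose episode never terminated inside the rollout has unreliable
# value targets, so its entire column is excluded from the critic loss.
episode_done = [any(terminated[t][e] for t in range(num_steps))
                for e in range(num_envs)]

total, count = 0.0, 0
for t in range(num_steps):
    for e in range(num_envs):
        if episode_done[e]:
            total += (value_pred[t][e] - value_target[t][e]) ** 2
            count += 1
value_loss = total / max(count, 1)
print(round(value_loss, 4))  # -> 0.5667; only env 0 contributes
```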
In `ppo.py` and `ppo_atari.py`, episodic information is logged [as follows](https://github.com/vwxyzjn/cleanrl/blob/1ed80620842b4cdeb1edc07e12825dff18091da9/cleanrl/ppo.py#L210):
```
if "final_info" in infos:
    for info in infos["final_info"]:
        if info and "episode" in info:
            print(f"global_step={global_step}, episodic_return={info['episode']['r']}")
            writer.add_scalar("charts/episodic_return", info["episode"]["r"], global_step)
```
...
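For reference, the same pattern can be exercised against a mocked `infos` dict shaped like the one gymnasium's vector envs emit on episode end (the values and variable names below are made up for illustration; a real run would get `infos` from `envs.step(...)` and log via a TensorBoard writer):

```python
# Mocked `infos` structure: one entry per vectorized env, with None for envs
# that did not finish an episode on this step. Values are illustrative only.
global_step = 1000
infos = {"final_info": [{"episode": {"r": 21.0, "l": 57}}, None]}

episodic_returns = []
if "final_info" in infos:
    for info in infos["final_info"]:
        if info and "episode" in info:
            episodic_returns.append(info["episode"]["r"])
print(global_step, episodic_returns)  # 1000 [21.0]
```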