Shehper
Hi! The effective batch size in nanoGPT is batch_size * gradient_accumulation_steps = 12 * 40 = 480, while the batch size mentioned in the GPT-2 paper is 512. May I ask why nanoGPT was trained with...
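The arithmetic in question can be checked directly. The numbers below are the ones quoted in the issue text, not read from nanoGPT's config files (which may differ between the shipped configs):

```python
# Effective batch size = micro-batch size x gradient-accumulation steps.
# Values taken from the issue text above, not from nanoGPT's actual config.
batch_size = 12
gradient_accumulation_steps = 40
effective_batch_size = batch_size * gradient_accumulation_steps
print(effective_batch_size)  # 480, vs. the 512 reported for GPT-2
```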
The code, as written, does not create equally distributed classes.
While running inference on my Mac with macOS version 13.1, I received the following error:
```
RuntimeError: MPS does not support cumsum_out_mps op with int64 input. Support has been added...
```
### Problem Description In examples/alphazero/train.py, we compute `value_mask` as follows: https://github.com/sotetsuk/pgx/blob/87278d2d6e677fd87248c457207b59cfa42e578d/examples/alphazero/train.py#L179 The purpose is to avoid updating the critic network on incomplete trajectories, as is evident from the masking of value...
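As a rough sketch of the masking idea described above (this is an illustration, not pgx's actual implementation; shapes, names, and values here are hypothetical), the critic loss can be zeroed out for any environment whose episode never reached a terminal state within the rollout:

```python
# Hypothetical rollout with num_steps=3, num_envs=2; an illustration of the
# value-masking idea, not pgx's actual train.py code.
terminated   = [[False, False], [False, False], [True, False]]
value_pred   = [[0.5, 0.5], [0.2, 0.2], [0.1, 0.1]]
value_target = [[1.0, 1.0], [1.0, 1.0], [1.0, 1.0]]

num_steps, num_envs = len(terminated), len(terminated[0])
# An env whose episode never terminated inside the rollout has unreliable
# value targets, so its entire column is excluded from the critic loss.
episode_done = [any(terminated[t][e] for t in range(num_steps))
                for e in range(num_envs)]

total, count = 0.0, 0
for t in range(num_steps):
    for e in range(num_envs):
        if episode_done[e]:
            total += (value_pred[t][e] - value_target[t][e]) ** 2
            count += 1
value_loss = total / max(count, 1)
print(round(value_loss, 4))  # -> 0.5667; only env 0 contributes
```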
In `ppo.py` and `ppo_atari.py`, episodic information is logged [as follows](https://github.com/vwxyzjn/cleanrl/blob/1ed80620842b4cdeb1edc07e12825dff18091da9/cleanrl/ppo.py#L210):
```
if "final_info" in infos:
    for info in infos["final_info"]:
        if info and "episode" in info:
            print(f"global_step={global_step}, episodic_return={info['episode']['r']}")
            writer.add_scalar("charts/episodic_return", info["episode"]["r"], global_step)
```
...
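For reference, the same pattern can be exercised against a mocked `infos` dict shaped like the one gymnasium's vector envs emit on episode end (the values and variable names below are made up for illustration; a real run would get `infos` from `envs.step(...)` and log via a TensorBoard writer):

```python
# Mocked `infos` structure: one entry per vectorized env, with None for envs
# that did not finish an episode on this step. Values are illustrative only.
global_step = 1000
infos = {"final_info": [{"episode": {"r": 21.0, "l": 57}}, None]}

episodic_returns = []
if "final_info" in infos:
    for info in infos["final_info"]:
        if info and "episode" in info:
            episodic_returns.append(info["episode"]["r"])
print(global_step, episodic_returns)  # 1000 [21.0]
```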