Results 4 issues of Linsu Han

The following code gives `nan` and `inf` values; am I using this incorrectly? seconds = np.arange(30) traffic_light = np.array([0]*15 + [1]*5 + [2]*10) brake_0 = np.array([0]*15 + [1, 2,...

What's the reasoning behind the extra dropout layer after the projection? Karpathy's implementation has 2 dropout layers: (1) `attn_dropout` and (2) `resid_dropout`. Karpathy's 2nd dropout layer: https://github.com/karpathy/nanoGPT/blob/eba36e84649f3c6d840a93092cb779a260544d08/model.py#L40 Torch's implementation only has 1...
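
For readers unfamiliar with the two placements the question refers to, here is a minimal single-head sketch in NumPy. It is not nanoGPT's or Torch's code: the function names, the weight arguments `w_qkv` and `w_proj`, and the probabilities `p_attn` and `p_resid` are all hypothetical, chosen only to show where an attention-weight dropout and a post-projection dropout would sit.

```python
import numpy as np


def dropout(x, p, rng):
    """Inverted dropout: zero each element with probability p, rescale the rest."""
    if p == 0.0:
        return x
    keep = rng.random(x.shape) >= p
    return x * keep / (1.0 - p)


def causal_self_attention(x, w_qkv, w_proj, p_attn, p_resid, rng):
    """Single-head causal self-attention over x with shape (T, C).

    Illustrative only: w_qkv has shape (C, 3C), w_proj has shape (C, C).
    """
    T, C = x.shape
    q, k, v = np.split(x @ w_qkv, 3, axis=-1)  # each has shape (T, C)

    # Scaled dot-product scores; future positions are masked out.
    scores = (q @ k.T) / np.sqrt(C)
    causal = np.tril(np.ones((T, T), dtype=bool))
    scores = np.where(causal, scores, -np.inf)

    # Numerically stable softmax over each row.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)

    # First dropout: applied to the attention weights.
    weights = dropout(weights, p_attn, rng)

    # Output projection, then the second dropout on the projected result,
    # just before it would be added back to the residual stream.
    y = (weights @ v) @ w_proj
    return dropout(y, p_resid, rng)
```

With `p_resid=0.0` the sketch reduces to an attention block with only the attention-weight dropout, which makes the difference between the two variants easy to compare side by side.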

Please provide the following information to quickly locate the problem - System Environment: Python 3.12 - Version - Paddle: 2.6.1 - PaddleOCR: 2.7.3 - PaddleNLP: 2.6.1 (also tried 2.8.0, 2.5.x, 2.7.x) - Related...

bug