dayL-W

Results 6 issues of dayL-W

Just as you warned: "Therefore, in a particular batch, some tasks might not be sampled, and their loss could be 0 in this batch." Now I get the error: tensorflow.python.training.basic_session_run_hooks.NanLossDuringTrainingError:...
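The error message above is truncated, so the actual cause is unknown. One possible cause, stated here as an assumption, is that the loss of a task with no examples in the batch is averaged over zero elements, which gives NaN in tensor libraries. A minimal plain-Python sketch of a guarded per-task mean follows; `masked_mean_loss` and its arguments are hypothetical names, not part of the project's code.

```python
def masked_mean_loss(per_example_losses, task_mask):
    """Average the losses of the examples that belong to one task.

    Returns 0.0 instead of dividing by zero when the task has no
    examples in the current batch.
    """
    # Keep only the losses whose mask entry is True.
    selected = [loss for loss, keep in zip(per_example_losses, task_mask) if keep]
    if not selected:
        # Task not sampled in this batch: contribute a zero loss.
        return 0.0
    return sum(selected) / len(selected)


batch_losses = [0.7, 1.3, 0.4]
# The task owns the first and third example.
print(masked_mean_loss(batch_losses, [True, False, True]))
# The task was not sampled at all in this batch.
print(masked_mean_loss(batch_losses, [False, False, False]))
```

In a tensor framework, the same idea would amount to guarding the denominator before dividing, so that an unsampled task contributes exactly 0 rather than NaN.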

I cannot reach 192.168.4.1 when I connect to the "Pixracer" WiFi. I have confirmed that my computer is connected to "Pixracer" and was assigned the IP "192.168.4.2". I use the firmware built from...

During training, I found that the predicted value is always the same and the accuracy does not improve. Why?

I think there may be a problem with the network optimization, because the loss doesn't go down and the evaluation accuracy is always 0.

### Is there an existing issue / discussion for this? - [X] I have searched the existing issues / discussions ### Is this question answered in the FAQ? | Is there an...

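I'm surprised that 15.5T of data is enough for a 1000B model. The parameter-to-token ratio is only 1:15, which is very high utilization! So does the 15.5T include the rewritten data?
* If it does not, that is really impressive.
* If it does, how much do the rewritten data and the original data add up to in total?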
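The ratio quoted in the question can be checked with a quick calculation. This sketch assumes that "1000B" means 1000 billion parameters and "15.5T" means 15.5 trillion tokens; the variable names are illustrative only.

```python
# Assumption: "B" = 10**9 parameters, "T" = 10**12 tokens.
params = 1000 * 10**9          # 1000B parameters
tokens = 15_500 * 10**9        # 15.5T tokens

# Tokens seen per parameter during training.
tokens_per_param = tokens / params
print(f"tokens per parameter: {tokens_per_param}")
```

Under these assumptions the ratio is about 1:15.5, in line with the roughly 1:15 mentioned in the question.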