ZGY

Results: 19 comments by ZGY

I reproduced the error. This is because XLNET uses the second dimension to represent batch, while most other models use the first dimension. XLNET: `(seq_length, batch_size, hidden_size)` OTHER: `(batch_size, seq_length,...
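As an illustration of the layout mismatch described above, here is a minimal sketch that converts a sequence-first array into a batch-first one. It uses plain nested lists so it runs without any dependencies; the helper name `to_batch_first` and the toy data are assumptions for illustration and are not taken from the XLNet code.

```python
def to_batch_first(seq_first):
    """Swap the first two axes of a nested list.

    (seq_length, batch_size, hidden_size) -> (batch_size, seq_length, hidden_size)
    """
    # zip(*rows) gathers, for each batch index, that entry from every time step.
    return [list(per_batch) for per_batch in zip(*seq_first)]


seq_length, batch_size, hidden_size = 3, 2, 4

# Toy activations in sequence-first layout.
# Each hidden vector starts with (time step, batch index) so the swap is visible.
seq_first = [
    [[float(t), float(b)] + [0.0] * (hidden_size - 2) for b in range(batch_size)]
    for t in range(seq_length)
]

batch_first = to_batch_first(seq_first)

print(len(seq_first), len(seq_first[0]), len(seq_first[0][0]))
print(len(batch_first), len(batch_first[0]), len(batch_first[0][0]))
```

With PyTorch tensors the same swap is `tensor.transpose(0, 1)`. Applying the swap twice returns the original layout.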

BMInf requests 512 MB of memory before loading the model. From your screenshot, it seems that the error is happening here. I'm going to spend some time trying to reproduce...

@sdjksdafji I ran the examples with my GTX 1070 on Windows, and everything worked fine. Could the conda environment be causing the problem? Also, have you tried...

> @a710128 I tried to import all 3 models. Surprisingly, CPM1 is fine. It started downloading after `model = bminf.models.CPM1()`. However, CPM2 and EVA reported the same CUDA OOM error....

I'm confused: importing CPM1 and importing CPM2 run almost the same code, yet importing CPM2 raises an error at line 55. https://github.com/OpenBMB/BMInf/blob/45d0af959f8017ca78bc18e03a660daf77c46852/bminf/models/cpm2.py#L55 CPM1: https://github.com/OpenBMB/BMInf/blob/45d0af959f8017ca78bc18e03a660daf77c46852/bminf/models/cpm1.py#L26-L51 CPM2: https://github.com/OpenBMB/BMInf/blob/45d0af959f8017ca78bc18e03a660daf77c46852/bminf/models/cpm2.py#L31-L56

> @a710128 Could you share your installation script and CUDA version?

`pip install bminf`, CUDA 11.1

Seems like the actual fix here is to use a non-CUDA-pinned NumPy array if the CUDA malloc operation fails. Even if it worked, would it affect the inference performance? I...
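The fallback idea in this comment can be sketched generically as below. This is not BMInf's code: `allocate_host_buffer`, `pinned_alloc`, and `failing_pinned_alloc` are hypothetical names, and a `bytearray` stands in for an ordinary (pageable) host array so the sketch runs without a GPU.

```python
def allocate_host_buffer(nbytes, pinned_alloc):
    """Try a pinned allocation first; fall back to ordinary pageable memory.

    `pinned_alloc` is a hypothetical callable that returns a buffer of
    `nbytes` bytes or raises MemoryError when pinned memory is unavailable.
    Returns (buffer, is_pinned).
    """
    try:
        return pinned_alloc(nbytes), True
    except MemoryError:
        # Pageable memory still works for host-to-device copies,
        # but such copies are typically slower than from pinned memory.
        return bytearray(nbytes), False


def failing_pinned_alloc(nbytes):
    # Hypothetical stand-in that simulates the pinned allocator failing.
    raise MemoryError("pinned allocation failed")


buf, is_pinned = allocate_host_buffer(1024, failing_pinned_alloc)
print(len(buf), is_pinned)
```

Returning the `is_pinned` flag lets the caller log or measure when the slower path is taken, which is one way to answer the performance question raised in the comment.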

Question answering requires additional training and is not currently supported.

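1. The main reason is that PyTorch cannot use a custom-written allocator, so to run better in low-resource environments, CuPy is used to manage memory manually.
2. The stages of CuPy's reduce routine are, in order: pre-reduce mapping, reduce, and post-reduce mapping.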
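The three reduction stages named above can be illustrated on the CPU with a small pure-Python sketch. It does not call CuPy; the helper `staged_reduce` is a hypothetical name, and the stages are meant to mirror the `map_expr`, `reduce_expr`, and `post_map_expr` arguments of `cupy.ReductionKernel`. The example computes an L2 norm.

```python
import math
from functools import reduce


def staged_reduce(values, pre_map, combine, identity, post_map):
    """Run a reduction in three stages: pre-reduce map, reduce, post-reduce map."""
    mapped = (pre_map(v) for v in values)      # stage 1: pre-reduce mapping
    total = reduce(combine, mapped, identity)  # stage 2: reduce
    return post_map(total)                     # stage 3: post-reduce mapping


# L2 norm: square each element, sum the squares, then take the square root.
l2 = staged_reduce(
    [3.0, 4.0],
    pre_map=lambda x: x * x,
    combine=lambda a, b: a + b,
    identity=0.0,
    post_map=math.sqrt,
)
print(l2)  # 5.0
```

Keeping the element-wise work in the map stages leaves the reduce step as a simple associative operation, which is what allows it to be parallelized on a GPU.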