wangyan_sdu
Dear @escorciav, while trying to run "moment_freq_prior.py" and "corpus_retrieval_2nd_eval.py", I found that both of them "import dataset_untrimmed". Since "dataset_untrimmed" is not an installable Python package, I...
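For readers hitting the same import error, a minimal sketch of how one might check whether a module such as `dataset_untrimmed` is a local file in the repository rather than an installed package. The helper names `is_importable` and `ensure_importable` and the `search_dir` argument are hypothetical, not part of the repository, and the sketch only helps if the file actually exists in the checkout.

```python
import importlib
import importlib.util
import sys
from pathlib import Path


def is_importable(name: str) -> bool:
    """Return True if a top-level module called `name` is on the import path."""
    return importlib.util.find_spec(name) is not None


def ensure_importable(name: str, search_dir: str) -> bool:
    """Add `search_dir` to sys.path if it holds `name`.py or a `name/` package.

    Returns True when the module can be found afterwards, False otherwise.
    """
    if is_importable(name):
        return True
    root = Path(search_dir)
    has_module_file = (root / f"{name}.py").is_file()
    has_package_dir = (root / name / "__init__.py").is_file()
    if has_module_file or has_package_dir:
        sys.path.insert(0, str(root))
        importlib.invalidate_caches()
        return is_importable(name)
    # The module is not in `search_dir`; it may simply not be released yet.
    return False
```

If `ensure_importable("dataset_untrimmed", ".")` returns False when run from the repository root, the file is not in the checkout and no path change will fix the import.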
Dear @escorciav: Thank you so much for your kind explanation and patience. I now understand the status of the code release. I sincerely hope that your paper will be...
No problem! Thanks!
I have set batch_size to 1 and have 2 cards with 15 GB each. Why does it still report CUDA out of memory?
Same request here. Model parallelism is badly needed.
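A rough back-of-the-envelope sketch of why a second card may not help, assuming the two cards are used for data parallelism (each card holds a full copy of the weights). The 7-billion-parameter fp16 model is a hypothetical figure chosen for illustration, and the estimate ignores activations, gradients, and optimizer state, which only add to the total.

```python
GIB = 2**30  # bytes per GiB


def per_gpu_weight_gib(num_params: float, bytes_per_param: int,
                       num_gpus: int, shard: bool) -> float:
    """Estimate the weight memory each GPU must hold, in GiB.

    shard=False models data parallelism: every GPU keeps all weights.
    shard=True models an even split of the weights across GPUs.
    """
    total = num_params * bytes_per_param / GIB
    return total / num_gpus if shard else total


# Hypothetical model: 7e9 parameters stored in fp16 (2 bytes each).
replicated = per_gpu_weight_gib(7e9, 2, num_gpus=2, shard=False)
sharded = per_gpu_weight_gib(7e9, 2, num_gpus=2, shard=True)
print(f"data parallel: {replicated:.2f} GiB per GPU")
print(f"sharded:       {sharded:.2f} GiB per GPU")
```

Under these assumptions the replicated copy alone takes about 13 GiB of a 15 GB card regardless of batch size, while an even split would take about 6.5 GiB per card.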
I encountered the above error while running example.py. Here are the versions of the relevant packages: pali3 0.0.7, zetascale 0.9.1, torch 2.0.1. How can I resolve this bug?
```python
class AllGather(torch.autograd.Function):
    """An autograd function that performs allgather on a tensor."""

    @staticmethod
    def forward(ctx, tensor, world_size, rank):
        output = [torch.empty_like(tensor) for _ in range(world_size)]
        torch.distributed.all_gather(output, tensor)
        ctx.rank = rank...
```
py-spy cannot be used for now because of container permission issues, but debugging shows that the run already hangs during the forward pass. I am using ZeRO-2, and the debug configuration is:
```json
{
  "name": "debug_cl",
  "type": "debugpy",
  "request": "launch",
  "module": "torch.distributed.run",
  "console": "integratedTerminal",
  "justMyCode": false,
  "env": {
    "CUDA_VISIBLE_DEVICES": "0,1",
    "PYTHONPATH": "./"
  },
  "args": [
    "--master_port", "29510",
    "--nproc_per_node", "2",
    "swift/cli/main.py",...
```