wangyan_sdu comments

Results: 8 comments by wangyan_sdu

Asking Document of "dataset_untrimmed.py" for the Codes

Dear @escorciav, while trying to run "moment_freq_prior.py" and "corpus_retrieval_2nd_eval.py", I found that both of them "import dataset_untrimmed". Since "dataset_untrimmed" is not a Python package, I...
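The import question above can be checked locally before running either script. The following is a minimal sketch, not part of the original repository: `module_available` and `add_repo_to_path` are hypothetical helper names, and the sketch only assumes that `dataset_untrimmed` is meant to be a plain `.py` file sitting somewhere on `sys.path`.

```python
import importlib.util
import sys
from pathlib import Path


def module_available(name: str) -> bool:
    """Return True if `import name` would find a top-level module on sys.path."""
    return importlib.util.find_spec(name) is not None


def add_repo_to_path(repo_dir: str) -> None:
    """Prepend a directory to sys.path so local .py files in it become importable."""
    resolved = str(Path(repo_dir).resolve())
    if resolved not in sys.path:
        sys.path.insert(0, resolved)


if __name__ == "__main__":
    # "dataset_untrimmed" is the module named in the comment above; whether it
    # exists in a given checkout is exactly what this check reports.
    print(module_available("dataset_untrimmed"))
```

If the check prints `False`, the file is missing from the checkout (or from `sys.path`), and `pip install` will not help, because the module is a local source file rather than a published package.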

Asking Document of "dataset_untrimmed.py" for the Codes

Dear @escorciav: Thank you so much for your kind explanation and patience. I now understand the status of the code release. I sincerely hope that your paper will be...

Asking Document of "dataset_untrimmed.py" for the Codes

No problem! Thanks!

How much GPU memory does fine-tuning require?

I have set batch_size to 1 and I am using 2 GPUs with 15 GB each. Why do I still get "CUDA out of memory"?
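A rough calculation shows why lowering the batch size alone may not help: under full fine-tuning, the weights, gradients, and optimizer states take memory regardless of batch size. The sketch below uses the common mixed-precision Adam accounting of 16 bytes per parameter. The function name and the 7B parameter count are assumptions for illustration only, since the thread does not state the model size.

```python
def estimate_training_memory_gb(num_params: int, bytes_per_param: int = 16) -> float:
    """Rough memory for weights, gradients and Adam states, in GB (1e9 bytes).

    16 bytes/param is the usual mixed-precision Adam accounting:
    2 (fp16 weights) + 2 (fp16 gradients) + 12 (fp32 weights, momentum, variance).
    Activations and framework overhead are NOT included.
    """
    return num_params * bytes_per_param / 1e9


if __name__ == "__main__":
    hypothetical_params = 7_000_000_000  # assumption: a 7B-parameter model
    needed = estimate_training_memory_gb(hypothetical_params)
    per_gpu = 15  # GB per card, as stated in the comment
    print(f"~{needed:.0f} GB of model state vs {per_gpu} GB per GPU")
```

Plain data parallelism keeps a full copy of this state on every GPU, so two 15 GB cards do not behave like one 30 GB card. Sharding the state across cards (for example with DeepSpeed ZeRO) or training fewer parameters (for example with LoRA) are the usual ways to reduce the per-GPU footprint.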

Would you consider adding DeepSpeed support?

Same request here; model parallelism is badly needed.

Expected size for first two dimensions of batch2 tensor to be: [1, 4] but got: [1, 512]. [BUG]

I encountered the above error while running example.py. Here are the versions of the relevant packages: pali3 0.0.7, zetascale 0.9.1, torch 2.0.1. How can I resolve this bug?
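The wording of this error matches the shape check of a batched matrix multiply such as `torch.bmm`, which requires `(b, n, m) @ (b, m, p)`. The sketch below mirrors that rule in pure Python so the mismatch can be read off directly. `bmm_output_shape` is a hypothetical helper, and the concrete shapes in the usage example are assumptions chosen to reproduce the message, not values taken from pali3.

```python
from typing import Sequence, Tuple


def bmm_output_shape(batch1: Sequence[int], batch2: Sequence[int]) -> Tuple[int, int, int]:
    """Mirror the shape rule of a batched matmul: (b, n, m) @ (b, m, p) -> (b, n, p)."""
    if len(batch1) != 3 or len(batch2) != 3:
        raise ValueError("both inputs must be 3-D")
    b, n, m = batch1
    # The first two dimensions of batch2 must equal batch1's batch size and last dim.
    if (batch2[0], batch2[1]) != (b, m):
        raise ValueError(
            f"Expected size for first two dimensions of batch2 tensor to be: "
            f"[{b}, {m}] but got: [{batch2[0]}, {batch2[1]}]."
        )
    return (b, n, batch2[2])


if __name__ == "__main__":
    # Assumed shapes: the left operand ends in 4, the right operand has 512 there.
    try:
        bmm_output_shape((1, 8, 4), (1, 512, 8))
    except ValueError as err:
        print(err)
```

Read this way, `[1, 4]` means the left operand has batch size 1 and a last dimension of 4, while the right operand carries 512 in the position that must match. Printing the shapes of both operands just before the failing call is usually enough to find which tensor is transposed or has the wrong feature size.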

SFT training that integrates multiple loss functions (e.g., contrastive loss)

```python
class AllGather(torch.autograd.Function):
    """An autograd function that performs allgather on a tensor."""

    @staticmethod
    def forward(ctx, tensor, world_size, rank):
        output = [torch.empty_like(tensor) for _ in range(world_size)]
        torch.distributed.all_gather(output, tensor)
        ctx.rank = rank
        ...
```

SFT training that integrates multiple loss functions (e.g., contrastive loss)

py-spy cannot be used for now because of container permission restrictions, but debugging shows that the process already hangs during the forward pass. ZeRO-2 is used, and the debug configuration is:

```json
{
  "name": "debug_cl",
  "type": "debugpy",
  "request": "launch",
  "module": "torch.distributed.run",
  "console": "integratedTerminal",
  "justMyCode": false,
  "env": {
    "CUDA_VISIBLE_DEVICES": "0,1",
    "PYTHONPATH": "./"
  },
  "args": [
    "--master_port", "29510",
    "--nproc_per_node", "2",
    "swift/cli/main.py",
    ...
```
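When an external profiler such as py-spy is blocked by container permissions, Python's standard-library `faulthandler` can dump stack traces from inside the process. The sketch below is one possible way to do that. `arm_hang_watchdog` and `disarm_hang_watchdog` are hypothetical helper names, and the timeout and the wrapped call are assumptions, not part of the original training script.

```python
import faulthandler
import sys
from typing import IO, Optional


def arm_hang_watchdog(timeout_s: float, log_file: Optional[IO] = None) -> None:
    """Dump the stacks of all threads to log_file if not disarmed within timeout_s."""
    target = log_file if log_file is not None else sys.stderr
    faulthandler.dump_traceback_later(timeout_s, repeat=False, file=target, exit=False)


def disarm_hang_watchdog() -> None:
    """Cancel a pending dump, e.g. once the suspect call has returned."""
    faulthandler.cancel_dump_traceback_later()


if __name__ == "__main__":
    arm_hang_watchdog(60.0)
    # output = model(**batch)  # hypothetical call suspected of hanging
    disarm_hang_watchdog()
```

Each rank launched by `torch.distributed.run` is a separate process, so every rank would need to arm its own watchdog. Comparing the dumped stacks across ranks can show whether one rank is waiting in a collective call that another rank never reached.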