wangyanhui666
I skip this wrapping step; wrapping the dataloader causes an error. https://github.com/mosaicml/streaming/issues/789#issuecomment-2405432617
In my code, if I use accelerate to wrap the dataloader again, it causes a deadlock. I think this is because the streaming dataset is already split for each...
Reference: https://huggingface.co/docs/accelerate/package_reference/torch_wrappers
> @wangyanhui666 so is training successful if you don't wrap the dataloader, as mentioned in some previous issues?

Yes, training is successful. I use 1 node with 4 GPUs to train; I have not tested multi...
I think this is a bug in PyTorch; they are working on a fix: https://github.com/pytorch/pytorch/pull/138354#issue-2598184802 Before they fix it, we should use PyTorch
I tried PyTorch 2.4.1 and it also has this bug, so disabling the cuDNN attention backend in the training code may be a good solution.
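One way to disable just the cuDNN SDPA backend, sketched below. The helper name is mine; the `torch.backends.cuda` setters are the standard toggles for `scaled_dot_product_attention` backends, and the `hasattr` guard covers PyTorch builds that do not expose the cuDNN toggle yet:

```python
import torch

def disable_cudnn_attention() -> None:
    """Turn off the cuDNN SDPA backend while leaving the flash,
    memory-efficient, and math backends enabled, so attention falls
    back to a backend that does not hit this bug."""
    # Guarded because older PyTorch builds lack the cuDNN toggle.
    if hasattr(torch.backends.cuda, "enable_cudnn_sdp"):
        torch.backends.cuda.enable_cudnn_sdp(False)
    # Keep the other backends available.
    torch.backends.cuda.enable_flash_sdp(True)
    torch.backends.cuda.enable_mem_efficient_sdp(True)
    torch.backends.cuda.enable_math_sdp(True)

disable_cudnn_attention()
```

Calling this once at startup, before any attention ops run, should be enough; newer PyTorch versions also offer the `torch.nn.attention.sdpa_kernel` context manager for scoping the same restriction to a single block.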