Lu, Chengjun
There is another SIGSEGV issue in PyDev with this fix. It should be fixed in a newer PyDev release by this PR: https://github.com/fabioz/PyDev.Debugger/pull/186/files A temporary workaround in PyTorch is...
Hi @Zha0q1, I cannot reproduce the issue "the destructor of ProcessGroupCCL was not correctly called". ~ProcessGroupCCL is always called at the end of the Python lifetime for...
I am using the public PyTorch v1.10.0-rc3 tag for the 1.10 release. Could you help double-check whether this issue can be reproduced without your changes?
Let's try a few more experiments: 1. Add some debug information to the ProcessGroup destructor. 2. Can you show the ABI of PyTorch on your platform via `torch._C._GLIBCXX_USE_CXX11_ABI`?
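For the second experiment, a minimal sketch of how the ABI flag could be printed. The helper name `cxx11_abi_flag` is hypothetical and only wraps the attribute mentioned above; it returns `None` when torch cannot be imported in the current environment.

```python
import importlib.util


def cxx11_abi_flag():
    """Return torch's C++11 ABI flag, or None if torch is not importable."""
    # Avoid an ImportError when torch is absent from this environment.
    if importlib.util.find_spec("torch") is None:
        return None
    import torch

    # The attribute named in the comment above; True means torch was
    # built with the new libstdc++ (C++11) ABI.
    return bool(torch._C._GLIBCXX_USE_CXX11_ABI)


if __name__ == "__main__":
    print("torch._C._GLIBCXX_USE_CXX11_ABI =", cxx11_abi_flag())
```

If the value differs from the ABI that the extension was compiled with, a mismatch between the two libraries is one possible explanation worth ruling out.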
> 1. Do you mean the Pytorch ProcessGroup? Yes. > 2. it shows True > One more question: did you try the same script I used? Yes.
> Sure I will do more experiments on Monday. Do you have any insights as to what might be the issue? It is a bizarre issue. I don't have the strong...
This is because torch_ccl cannot locate the torch installation on your setup. Can you install torch explicitly and try again?
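A quick way to check whether the current interpreter can see a torch installation at all is sketched below. The helper `locate_package` is hypothetical, and this only checks Python's import path; whether the torch_ccl build looks torch up the same way is an assumption.

```python
import importlib.util
import sys


def locate_package(name):
    """Return the file the current interpreter would load `name` from, or None."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None  # package is not visible on this interpreter's import path
    return spec.origin


if __name__ == "__main__":
    # Print the interpreter too, since a different conda env is a common cause.
    print("interpreter:", sys.executable)
    print("torch found at:", locate_package("torch"))
```

If this prints `None` for torch, installing torch into the same environment before building or installing torch_ccl would be the next thing to try.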
@ddkalamk Thanks for the information. I will check the installation issue with the latest conda package.
Hi Peach-He, Thanks for raising the regression issue you found. We are investigating this issue.
@zhongyuansh It seems to be a C++ library compatibility issue. Which version of torch_ccl are you using?