Lu, Chengjun

Results 11 comments of Lu, Chengjun

There is another SIGSEGV issue in PyDev with this fix. It should be fixed in newer PyDev versions by this PR: https://github.com/fabioz/PyDev.Debugger/pull/186/files A temporary workaround in PyTorch is...

Hi @Zha0q1, I cannot reproduce the issue of "the destructor of ProcessGroupCCL was not correctly called". ~ProcessGroupCCL is always called at the end of the Python process lifetime for...

I am using the public PyTorch v1.10.0-rc3 tag for the 1.10 release. Could you double-check whether this issue can be reproduced without your changes?

Let's try a few more experiments: 1. Add some debug output to the ProcessGroup destructor. 2. Can you show which C++ ABI PyTorch was built with on your platform, i.e. the value of `torch._C._GLIBCXX_USE_CXX11_ABI`?
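For the second experiment, a minimal sketch of how the ABI flag could be printed. The helper name `cxx11_abi_report` is hypothetical; only the `torch._C._GLIBCXX_USE_CXX11_ABI` attribute comes from the comment above.

```python
def cxx11_abi_report():
    """Return a one-line description of the C++ ABI flag of the installed torch."""
    try:
        import torch
    except (ImportError, OSError):
        # torch is missing or its shared libraries failed to load
        return "torch is not importable in this environment"
    # True means torch was built with the new libstdc++ (C++11) ABI
    flag = torch._C._GLIBCXX_USE_CXX11_ABI
    return f"torch {torch.__version__}: _GLIBCXX_USE_CXX11_ABI={flag}"


print(cxx11_abi_report())
```

An extension built with a different ABI setting than torch itself may fail at load time or misbehave, so it can help to post this line together with the torch_ccl build log.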

> 1. Do you mean the Pytorch ProcessGroup?

Yes.

> 2. it shows True
> One more question: did you try the same script I used?

Yes.

> Sure I will do more experiments on Monday. Do you have any insights as to what might be the issue?

It is a bizarre issue. I don't have the strong...

This is because torch_ccl cannot locate the torch installation in your setup. Can you install torch explicitly and try again?
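One quick way to see whether torch is visible from the interpreter you build with is sketched below. The helper `locate_module` is hypothetical, and the check only covers what the running Python can import. It assumes the torch_ccl setup uses that same environment.

```python
import importlib.util
import sys


def locate_module(name):
    """Return the file a top-level module would be imported from, or None if not found."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    return spec.origin


# Show which interpreter is active and where (if anywhere) it finds torch
print("interpreter:", sys.executable)
print("torch found at:", locate_module("torch"))
```

If the second line prints `None`, torch is not installed in the environment that this interpreter belongs to.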

@ddkalamk Thanks for the information. I will check the install issue with the latest conda package.

Hi Peach-He, thanks for raising the regression issue you found. We are investigating it.

@zhongyuansh It seems to be a C++ library compatibility issue. Which version of torch_ccl are you using?