Leon Song

Results 14 comments of Leon Song

Same issue, using PyTorch 1.1.0

> Same issue, using PyTorch 1.1.0

Changing the PyTorch version to 1.0.1 may solve the NaN loss issue.

Same issue, need help (MDAnalysis == 2.3.0)

Same issue. I cannot reproduce the infilling results reported in the paper; mine are a bit lower. Any ideas?

> Dear @shivamag125, @timxx and @stgzr, thanks for reporting!
>
> @timxx: The instruction models are not intended to be used for infilling, please use the pretrained models....

Any update to this issue? It still happens in `pytorch==1.14.0a0+410ce96`

I suppose there is a bug in the gradient accumulation implementation. If the model is wrapped by a `DistributedDataParallel` module, when calling `backward`, the gradient should be averaged across GPUs....
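To illustrate the averaging this comment refers to, here is a minimal pure-Python sketch. It does not use PyTorch and does not reproduce the reported bug (the comment is truncated, so the exact failure is unknown). It only checks the arithmetic: if each worker scales its micro-batch gradients by the number of accumulation steps and the results are then averaged across workers, the outcome matches the full-batch gradient, assuming equal-sized micro-batches. The scalar model and the names `grad_mse`, `accumulated_grad`, and `all_reduce_mean` are hypothetical.

```python
# Simulation only: scalar model y = w * x with mean squared error loss.
# All names here are hypothetical; nothing below calls PyTorch.

def grad_mse(w, xs, ts):
    """Gradient of mean((w * x - t)^2) with respect to w."""
    return sum(2.0 * (w * x - t) * x for x, t in zip(xs, ts)) / len(xs)

def accumulated_grad(w, micro_batches):
    """Accumulate micro-batch gradients, each scaled by 1 / number of steps."""
    steps = len(micro_batches)
    total = 0.0
    for xs, ts in micro_batches:
        total += grad_mse(w, xs, ts) / steps
    return total

def all_reduce_mean(values):
    """Stand-in for averaging gradients across workers (GPUs)."""
    return sum(values) / len(values)

w = 0.5
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ts = [2.0 * x for x in xs]

# Two simulated workers, each accumulating over two micro-batches of size 2.
worker_batches = [
    [(xs[0:2], ts[0:2]), (xs[2:4], ts[2:4])],
    [(xs[4:6], ts[4:6]), (xs[6:8], ts[6:8])],
]

per_worker = [accumulated_grad(w, batches) for batches in worker_batches]
averaged = all_reduce_mean(per_worker)   # averaged across workers
summed = sum(per_worker)                 # what you get without averaging
full_batch = grad_mse(w, xs, ts)         # reference: one pass over all data
```

Here `averaged` agrees with `full_batch`, while `summed` is larger by a factor equal to the number of workers, which is the kind of scaling mismatch a missing average would produce.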

Same issue:

> F tensorflow/stream_executor/cuda/cuda_driver.cc:316] current context was not created by the StreamExecutor cuda_driver API: 0x42aa310; a CUDA runtime call was likely performed without using a StreamExecutor context
> Aborted (core...