Lei52
@kwotsin Thank you for your response! This helps a lot! Also, are you going to implement visualization of the spectral norm and weight norm of each layer during training with...
CUDA 9.0 or 9.2 works for this code. The problem encountered with CUDA 10.0 remains unsolved. @Lemonqinnn @tooHotSpot
> > A CUDA version of 9.0 or 9.2 works for this code. The problem encountered by CUDA10.0 remains unsolved. @Lemonqinnn @tooHotSpot
>
> I tried to run the program...
Hi @shizhediao, have you run any of the TPU-version code so far? If so, could you please share some logs from your experiments?
> It's worth noting there's another TPU implementation where they claim to have trained models successfully: https://github.com/giannisdaras/smyrf/tree/master/examples/tpu_biggan (supporting code for ["SMYRF: Efficient attention using asymmetric clustering", Daras et al 2020](https://arxiv.org/abs/2010.05315))....