Renato Violin
@shawei3000 In this file https://github.com/renatoviolin/xlnet/blob/master/model_utils_GPU.py, line 142: variables = variables[-177:] — I get all trainable vars and re-assign only the last 177 of them. The print() on the following lines outputs all the...
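A quick plain-Python illustration (not the repo's code) of what that slice does — a negative start index keeps only the last n items, i.e. the variables closest to the output layers:

```python
# Stand-in for the list returned by tf.trainable_variables();
# the names and count here are purely illustrative.
variables = [f"var_{i}" for i in range(200)]

# Same idiom as line 142 of model_utils_GPU.py: keep the last 177 vars.
variables = variables[-177:]

print(len(variables))   # 177
print(variables[0])     # 'var_23'  (item 200 - 177)
print(variables[-1])    # 'var_199'
```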
Hi @yangapku In this file https://github.com/renatoviolin/xlnet/blob/master/model_utils_GPU.py, lines 147-166 contain the code to use FP16. It is already hard-coded as TRUE in line 147.
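For context on why FP16 training needs care (this is a hedged, pure-Python sketch, not the code from model_utils_GPU.py): small gradients underflow to zero in half precision, which is what loss scaling works around. Python's `struct` module can round-trip a value through IEEE-754 binary16 to show the effect:

```python
import struct

def to_fp16(x: float) -> float:
    """Round-trip a value through IEEE-754 half precision (binary16)."""
    try:
        return struct.unpack('<e', struct.pack('<e', x))[0]
    except OverflowError:            # magnitude above fp16's max (~65504)
        return float('inf') if x > 0 else float('-inf')

grad = 1e-8          # illustrative small gradient value
scale = 2.0 ** 12    # hypothetical loss scale

print(to_fp16(grad))                   # 0.0 -- underflows in fp16
print(to_fp16(grad * scale) / scale)   # ~1e-8 -- survives with loss scaling
```

The scale value 2**12 is just an example; real mixed-precision setups often adjust it dynamically.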
@yangapku I haven't done it yet. In my tests I kept these parameters unchanged so I could track the improvements I made to the network without modifying the parameters...
Hi @yangapku I'm using a single RTX 2080, CUDA 10, cuDNN 7.5, Python 3.7, TF 1.13.1. I don't have experience with multi-GPU. I could take a look at https://github.com/horovod/horovod. It...
Hi @MaXXXXfeng, have you used the original code before? It seems you have problems with the file paths. Check the script gpu_squad_base_GPU.sh for the path variables. Adjust...
Hi @MaXXXXfeng, good point about max_seq_length. Any time we change max_seq_length we have to delete all files in "PROC_DATA_DIR" so that they are recreated with the right length.
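A minimal sketch of that clean-up step (the directory and the `*.tf_record` filename pattern below are stand-ins for this demo; use the PROC_DATA_DIR path configured in gpu_squad_base_GPU.sh):

```shell
# Demo dir so this sketch is self-contained; replace with your real path.
PROC_DATA_DIR=$(mktemp -d)

# Pretend there is a stale cache file built with the old max_seq_length.
touch "$PROC_DATA_DIR/train.slen-512.tf_record"

# Wipe the cached records; they are regenerated on the next run
# with the new max_seq_length.
rm -f "$PROC_DATA_DIR"/*.tf_record
```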
Hi @harirajeev Sometimes I got results like yours (but I don't remember what I did to solve it, because I tried so many things). Here is my [simplified code folder](https://drive.google.com/file/d/1d_XM-mQFIQRlQMO50Io6u6UDRfbABYEi/view?usp=sharing) that...
Hi Hari, No, use_bfloat16 = False (line 59, file run_squad_3.py). I use float16 directly in the code (lines 147-150, file model_utils_3.py). I think bfloat16 is specific to TPU usage. Regards...
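The practical difference between the two formats can be shown in pure Python (a hedged sketch; the bfloat16 helper below simulates the format by truncating a float32 mantissa, whereas real conversions round to nearest): float16 trades exponent range for mantissa precision, so it overflows where bfloat16 stays finite.

```python
import struct

def to_fp16(x: float) -> float:
    """Round-trip a value through IEEE-754 half precision (binary16)."""
    try:
        return struct.unpack('<e', struct.pack('<e', x))[0]
    except OverflowError:            # magnitude above fp16's max (~65504)
        return float('inf') if x > 0 else float('-inf')

def to_bf16(x: float) -> float:
    """Simulate bfloat16: float32 with the mantissa truncated to 7 bits."""
    (bits,) = struct.unpack('<I', struct.pack('<f', x))
    return struct.unpack('<f', struct.pack('<I', bits & 0xFFFF0000))[0]

# fp16 overflows above ~65504; bfloat16 keeps float32's full exponent range.
print(to_fp16(1e5))   # inf
print(to_bf16(1e5))   # 99840.0 -- coarser mantissa, but finite
```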
Hi @harirajeev I tried to fine-tune BERT on Trivia-QA, but didn't get good results in that case. With XLNet I haven't tried it yet.
Same issue here. Here's how I solved it: 1. Log out of the Coursera main page. 2. Log in using the link https://learner.coursera.help/hc 3. Run coursera-dl again — now it is working.