Results: 9 comments by Joan

Interesting question. Does your code work with smaller chunks of text, like the ones used for training? I just want to make sure your code works. Recurrent layers tend to...

Good idea to use the unnormalised logits before the softmax 👍
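For anyone wondering why the logits can be handy: a minimal sketch in plain Python, assuming the goal is to compare scores across different inputs (the original thread may have had a different reason). The `weak` and `strong` score lists are made-up values for illustration. Softmax is shift-invariant, so two inputs whose logits differ only by a constant offset give identical probabilities, while the raw logits still tell them apart.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores: the second list is the first one shifted by +10.
weak = [1.0, 2.0, 3.0]
strong = [11.0, 12.0, 13.0]

probs_weak = softmax(weak)
probs_strong = softmax(strong)

# After normalisation the two inputs are indistinguishable...
same_probs = all(math.isclose(a, b) for a, b in zip(probs_weak, probs_strong))

# ...but the unnormalised logits keep the difference in magnitude.
best_logit_weak = max(weak)
best_logit_strong = max(strong)

print("same probabilities:", same_probs)
print("best logits:", best_logit_weak, best_logit_strong)
```

So if the scores need to be ranked or thresholded across separate forward passes, the pre-softmax values carry information that the probabilities drop.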

Ohhh I see, you're running on the demo branch, right? If I remember rightly, I copied all the flags in basic/cli.py and replaced all the flags in basic/demo_cli.py for the...

Yeah! The demo only works with text from the dataset. If you look into the code, the client is only sending the paragraph_id instead of the text itself, so even...

@avostryakov Are you aware of any implementation of residual connections?

Hi there, I'm getting the same OOM error, and I'm running the code on an AWS p3.8xlarge instance (4x16GB). I guess I get the OOM because the model doesn't fit in...

Did you use "--gpus 0,"?

@Jesparzarom What's the output of this?

```
import torch

if __name__ == "__main__":
    print("Cuda support:", torch.cuda.is_available(), ":", torch.cuda.device_count(), "devices")
```