Pedro Vitor Quinta de Castro
@ngoanpv and @wxp16, I ran training on a v3-8 TPU and on a Tesla V100 from a DGX-1, using batch sizes of 512 and 24 respectively (32 didn't fit). What amazes me...
> @peregilk I really do run with batch size 512.
>
> @stefan-it I run it with the Python module:
>
> ```python
> import sentencepiece as spm
> spm.SentencePieceTrainer.Train('--input= --vocab_size=30000 --model_prefix=prefix_name --pad_id=0...
> ```
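For anyone landing here later: below is a minimal runnable sketch of that (truncated) training call. The input path `corpus.txt` and the `unk`/`bos`/`eos` IDs are my assumptions, since only `--pad_id=0` survives in the quote above.

```python
import sentencepiece as spm

# Sketch of the truncated command above; corpus.txt is a placeholder
# (plain text, one sentence per line). Only pad_id=0 comes from the
# original comment -- the other special-token IDs are assumed.
spm.SentencePieceTrainer.Train(
    '--input=corpus.txt '
    '--model_prefix=prefix_name '
    '--vocab_size=30000 '
    '--pad_id=0 --unk_id=1 --bos_id=2 --eos_id=3'
)

# Training writes prefix_name.model and prefix_name.vocab;
# load the resulting model like this:
sp = spm.SentencePieceProcessor()
sp.Load('prefix_name.model')
```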
Ran into the same problem; I'm using your solution too.
For argument mining there is a strong research group at Darmstadt: https://www.informatik.tu-darmstadt.de/ukp/research_6/research_areas/argumentation_mining/index.en.jsp They host several annotated benchmark datasets under the "Resources" section of the website above.
Sorry, I missed the results folder.
Thanks @plkmo! Are you uploading an updated one?
I'm reopening since we're still discussing this :sweat_smile: These are the losses I got. Do you think they look OK?
@plkmo From what I could see, you weren't able to get good results from MTB with the CNN either, right? I ran the pretraining and applied it to the task afterwards,...
Strange, @mickeystroller: I'm doing exactly the same thing as you are, but take a look at my results
Same problem here. Can you tell me which version of the tokenizers package you're using? `pip show tokenizers`