Pedro Vitor Quinta de Castro

Results 35 comments of Pedro Vitor Quinta de Castro

@ngoanpv and @wxp16, I ran training on a v3-8 TPU and on a Tesla V100 from a DGX-1, using batch sizes of 512 and 24 (couldn't fit 32). What amazes me...

> @peregilk I really do run with batch size 512.
>
> @stefan-it I run it with the Python module:
>
> ```python
> import sentencepiece as spm
> spm.SentencePieceTrainer.Train('--input= --vocab_size=30000 --model_prefix=prefix_name --pad_id=0...
> ```
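For context, the quoted training call can be sketched as below. This is a minimal sketch, not the author's exact command: the quote is truncated after `--pad_id=0`, so the input path `corpus.txt` and the helper `build_spm_args` are placeholders I've introduced; only `--input`, `--vocab_size`, `--model_prefix`, and `--pad_id` come from the original comment.

```python
# Sketch of training a SentencePiece model via its Python API.
# The input path and helper function are hypothetical; only the
# flags shown in the quoted comment are from the original.
def build_spm_args(input_path, model_prefix, vocab_size=30000, pad_id=0):
    """Assemble the single flag string that SentencePieceTrainer.Train expects."""
    return (
        f"--input={input_path} "
        f"--vocab_size={vocab_size} "
        f"--model_prefix={model_prefix} "
        f"--pad_id={pad_id}"
    )

args = build_spm_args("corpus.txt", "prefix_name")
print(args)

# With sentencepiece installed (pip install sentencepiece), the actual call is:
#   import sentencepiece as spm
#   spm.SentencePieceTrainer.Train(args)
# which writes prefix_name.model and prefix_name.vocab to the working directory.
```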

Ran into the same problem; I'm using your solution too.

For argument mining there is a strong research group at Darmstadt: https://www.informatik.tu-darmstadt.de/ukp/research_6/research_areas/argumentation_mining/index.en.jsp They maintain some annotated benchmark datasets under the "Resources" section of the website above.

Sorry, missed the results folder

Thanks @plkmo ! Are you uploading an updated one?

![loss_vs_epoch_0](https://user-images.githubusercontent.com/12713359/78023667-701ea300-732d-11ea-916b-70a233dac815.png)
![accuracy_vs_epoch_0](https://user-images.githubusercontent.com/12713359/78023672-7280fd00-732d-11ea-9672-60c30f7054da.png)

I'm reopening since we're still discussing this :sweat_smile: I got these losses. Do you think they are ok?

@plkmo From what I could see, you weren't able to get good results from MTB with the CNN either, right? I did pretraining and applied it to the task afterwards,...

Strange, @mickeystroller. I'm doing exactly the same thing as you, but take a look at my results: ![image](https://user-images.githubusercontent.com/12713359/81348912-59d6d600-9095-11ea-9686-fe9025493bbf.png)

Same problem. Can you tell me the version of your tokenizers package? `pip show tokenizers`