ELMoForManyLangs: the number of training instances looks wrong

Hello, the command I used for training is: python -m elmoformanylangs.biLM train --train_path /home/data/peter/intent_classification/c2_intent_ltp_sent.txt --config_path /home/data/peter/ELMoForManyLangs/pretrained_model/configs/cnn_50_100_512_4096_sample.json --model /home/data/peter/ELMoForManyLangs/elmo_train_model

My training data has 64,629 lines, but the log shows that ELMo only used a little over 9,000 instances for training.

2019-01-28 12:48:07,723 INFO: training instance: 9376, training tokens: 459418.
2019-01-28 12:48:07,785 INFO: Truncated word count: 0.
2019-01-28 12:48:07,785 INFO: Original vocabulary size: 14878.
2019-01-28 12:48:07,919 INFO: Word embedding size: 14880
2019-01-28 12:48:08,145 INFO: Char embedding size: 3292
2019-01-28 12:48:24,806 INFO: 293 batches, avg len: 50.0
2019-01-28 12:48:24,807 INFO: Evaluate every 293 batches.
2019-01-28 12:48:24,807 INFO: vocab size: 14880

Jan 28 '19 06:01 OswaldoBornemann

I see now. I looked at the code, and it seems the number of training instances is determined by max_sent_len?
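
A minimal sketch of why that would explain the numbers, assuming (this is not confirmed in the thread) that the corpus is read as one continuous token stream with sentence markers and then cut into windows of at most max_sent_len tokens. The helper name `chunk_token_stream` and the `<bos>`/`<eos>` markers are illustrative only, not the library's API. Under that assumption the instance count depends on the total number of tokens rather than the number of lines. In the log above, 459418 tokens over 9376 instances is about 49 tokens per instance, which is in line with "avg len: 50.0".

```python
from typing import List


def chunk_token_stream(lines: List[str], max_sent_len: int) -> List[List[str]]:
    """Hypothetical helper: flatten all lines into one token stream,
    then cut the stream into windows of at most max_sent_len tokens."""
    stream: List[str] = []
    for line in lines:
        stream.append("<bos>")          # assumed sentence-start marker
        stream.extend(line.split())     # whitespace-separated tokens
        stream.append("<eos>")          # assumed sentence-end marker
    return [stream[i:i + max_sent_len] for i in range(0, len(stream), max_sent_len)]


# Toy corpus: 10 lines of 3 tokens each -> 10 * (3 + 2) = 50 tokens in the stream.
corpus = ["a b c"] * 10
instances = chunk_token_stream(corpus, max_sent_len=20)
print(len(corpus), "lines ->", len(instances), "instances")  # 10 lines -> 3 instances
```

If the training code works roughly like this, 64,629 short lines ending up as 9,376 instances of about 50 tokens each would be expected packing rather than lost data.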

Jan 28 '19 06:01 OswaldoBornemann

@tsungruihon, how long did the data preprocessing take before training started, and did it use a lot of memory? My training command is:
python -m elmoformanylangs.biLM train \
--train_path comm_data/seg_word.all \
--config_path configs/cnn_50_100_512_4096_sample.json \
--model comment_output/v1 \
--optimizer adam \
--lr 0.001 \
--lr_decay 0.8 \
--max_epoch 10 \
--max_sent_len 50 \
--max_vocab_size 200000 \
--min_count 10 \
--gpu 3 \
--batch_size 64
I specified a GPU, but when I check the graphics card usage, the GPU is not being used. Did you train on a GPU?

Mar 19 '19 03:03 xuehui0725

Yes, I trained on a GPU.

Mar 19 '19 06:03 OswaldoBornemann

@xuehui0725

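"How long did the data preprocessing take"

On a CPU clocked at 2.4 GHz, the 1B Word Benchmark takes roughly an hour and a half to preprocess, and it uses a lot of memory. The GPU is only used once preprocessing has finished…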

Mar 19 '19 16:03 Oneplus