How to train a new model?
The tool provides an "NSFC Subject Classifier" whose input is Chinese. How can I train the same kind of model for English? Questions: A) What is the training data format? B) How are the levels defined? C) What tool is used for training?
Thanks in advance for your reply.
Hi, let me answer your questions.
(A)
We train the model on paper abstracts. The labels are subject codes from the three-level classification of the National Natural Science Foundation of China (NSFC).
For example, a training line looks like "A04 abstract_paper" when predicting the level-1 subject. To predict the level-2 subject, use a line like "A040412 abstract_paper". We trained three models in order to predict the three levels of disciplines.
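To make the "label, then abstract" line format concrete, here is a minimal sketch of how such training lines could be written. It is not taken from the repository: the helper names and the `label_prefix` parameter are assumptions for illustration. The example in this thread shows the bare code with no prefix, while fastText by default expects labels to start with `__label__`, so the prefix is left configurable.

```python
def to_training_line(code: str, abstract: str, label_prefix: str = "") -> str:
    """Return one training example as 'label text' on a single line."""
    # Collapse newlines, tabs and repeated spaces so each example stays on one line.
    text = " ".join(abstract.split())
    return f"{label_prefix}{code.strip()} {text}"


def write_training_file(examples, path, label_prefix=""):
    """Write (code, abstract) pairs to a text file, one example per line."""
    with open(path, "w", encoding="utf-8") as f:
        for code, abstract in examples:
            f.write(to_training_line(code, abstract, label_prefix) + "\n")


# Hypothetical example; the abstract text is made up for illustration.
print(to_training_line("A04", "We study  the dynamics\nof thin films."))
# A04 We study the dynamics of thin films.
```

One file per level would be written this way, each using the subject code of that level as the label.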
(B)
The three levels follow the subject classification of the National Natural Science Foundation of China (NSFC).
(C)
In this case, we use a fastText model as the classifier.
You can also replace it with another model that suits your work.
In fact, the answers to your second and third questions (B and C) are in prediction_api/src/classifier.py; see line 4 and line 25.
@wengenihaoshuai Do you mean that a model is trained for each level separately?
Not one model, but three. We trained three models separately, one for each level. The code at lines 25-27 of prediction_api/src/classifier.py loads the three models.
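As a rough sketch of the "one model per level" setup with the fastText Python package (this is not the code from classifier.py; the file names, directory layout and function names below are hypothetical), training and loading three separate models could look like this. It assumes the labels in each training file carry fastText's default `__label__` prefix.

```python
from pathlib import Path

LEVELS = (1, 2, 3)


def train_path(data_dir, level):
    """Training file for one level (hypothetical naming scheme)."""
    return Path(data_dir) / f"level{level}.train.txt"


def model_path(model_dir, level):
    """Saved model file for one level (hypothetical naming scheme)."""
    return Path(model_dir) / f"level{level}.bin"


def train_all(data_dir, model_dir):
    """Train and save one fastText classifier per level."""
    import fasttext  # third-party package: pip install fasttext

    for level in LEVELS:
        # Default hyperparameters; labels are expected to start with __label__.
        model = fasttext.train_supervised(input=str(train_path(data_dir, level)))
        model.save_model(str(model_path(model_dir, level)))


def predict_all(abstract, model_dir):
    """Return {level: (label, probability)} using the three saved models."""
    import fasttext

    # fastText predicts on a single line, so newlines are replaced first.
    text = abstract.replace("\n", " ")
    results = {}
    for level in LEVELS:
        model = fasttext.load_model(str(model_path(model_dir, level)))
        labels, probs = model.predict(text, k=1)
        results[level] = (labels[0], float(probs[0]))
    return results
```

In a long-running service the three models would normally be loaded once at startup rather than on every call; they are loaded inside `predict_all` here only to keep the sketch short.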
@wengenihaoshuai Okay, thanks a lot!
@wengenihaoshuai
We train the model on paper abstracts. The labels are subject codes from the three-level classification of the National Natural Science Foundation of China (NSFC).
For example, a training line looks like "A04 abstract_paper" when predicting the level-1 subject. To predict the level-2 subject, use a line like "A040412 abstract_paper". We trained three models in order to predict the three levels of disciplines.
I downloaded the AMiner paper data, which has no subject-category labels. How do I map the papers to NSFC labels?