How to select only those whose frequency is above a threshold?

Open softhuafei opened this issue 7 years ago • 0 comments

Hi, in your paper:

We consider all n-gram prefixes and suffixes of words in our training corpus, and select only those whose frequency is above a threshold, T, as frequent prefixes and suffixes should be more likely to behave like true morphemes of a language.

I have been reading your build_data code, but I can't find any code that filters affixes by frequency.

May I ask how you perform the affix filtering?
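
To make the question concrete, here is a minimal sketch of the kind of filtering I expected to find. It is only my reading of the sentence quoted above, not code from this repository: the function name `frequent_affixes`, the parameters `n` and `threshold`, counting over word tokens, and the strict `>` comparison are all my own assumptions.

```python
from collections import Counter


def frequent_affixes(words, n=3, threshold=1):
    """Hypothetical helper: return the n-gram prefixes and suffixes whose
    frequency in `words` is strictly above `threshold`."""
    prefix_counts = Counter()
    suffix_counts = Counter()
    for word in words:
        # Assumption: words shorter than n have no n-gram affix and are skipped.
        if len(word) < n:
            continue
        prefix_counts[word[:n]] += 1
        suffix_counts[word[-n:]] += 1
    # Keep only affixes seen more than `threshold` times (T in the paper).
    prefixes = {p for p, c in prefix_counts.items() if c > threshold}
    suffixes = {s for s, c in suffix_counts.items() if c > threshold}
    return prefixes, suffixes


# Toy corpus, made up purely for illustration.
corpus = ["playing", "running", "player", "played", "unhappy", "undone"]
prefixes, suffixes = frequent_affixes(corpus, n=3, threshold=1)
print(prefixes)  # {'pla'}
print(suffixes)  # {'ing'}
```

Is something like this done somewhere in the code, and if so, what values of n and T were used?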

Looking forward to your reply. :D

Apr 08 '19 12:04 softhuafei