mbc1990
Results: 2 issues of mbc1990
WordTokenizer, WordPunctTokenizer, and TreebankWordTokenizer all have similar unusual behavior on accented (tilde-ed?) characters:

```
> var tokenizer = new natural.WordPunctTokenizer();
> tokenizer.tokenize('São Paulo');
[ 'S', 'ã', 'o', 'Paulo' ]
>...
```
Help/Questions
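For anyone trying to reproduce the first report without the library: a minimal sketch of one possible cause, assuming the tokenizer matches runs of ASCII-only word characters (`\w`) and treats everything else as punctuation. This is not confirmed against natural's source; the function names `asciiWordPunctTokenize` and `unicodeWordPunctTokenize` are hypothetical and exist only for this illustration.

```javascript
// Hypothetical illustration only; NOT the natural library's actual implementation.
// Assumption: word tokens are runs of ASCII word characters (\w), and any other
// non-whitespace character becomes its own token.
function asciiWordPunctTokenize(text) {
  return text.match(/\w+|[^\w\s]/g) || [];
}

// Unicode-aware variant using property escapes (requires the `u` flag):
// \p{L} = letters, \p{M} = combining marks, \p{N} = numbers.
function unicodeWordPunctTokenize(text) {
  return text.match(/[\p{L}\p{M}\p{N}_]+|[^\p{L}\p{M}\p{N}_\s]/gu) || [];
}

// "São Paulo" written with a precomposed ã (U+00E3).
const input = 'S\u00e3o Paulo';

console.log(asciiWordPunctTokenize(input));   // [ 'S', 'ã', 'o', 'Paulo' ]
console.log(unicodeWordPunctTokenize(input)); // [ 'São', 'Paulo' ]
```

In JavaScript, `\w` only covers `[A-Za-z0-9_]`, so `ã` falls out of the word class and gets split off, which matches the output quoted in the issue. Whether the library's tokenizers actually work this way would need to be checked in their source.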
Not sure if this is intentional or not:

```
var tokenizer = new natural.WordPunctTokenizer();
console.log(tokenizer.tokenize("Example sentence (with parenthetical expression)."));
```

outputs:

```
[ 'Example', 'sentence', ' (', 'with', 'parenthetical', 'expression',...
```
Help/Questions
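If the leading space in `' ('` turns out to be unwanted, a possible workaround is to post-process the token array. The sketch below assumes the tokenizer returns a plain array of strings; `cleanTokens` is a hypothetical helper, not part of natural, and the sample array is only the visible portion of the output quoted above.

```javascript
// Visible portion of the output quoted in the issue (the rest is truncated there).
const tokens = ['Example', 'sentence', ' (', 'with', 'parenthetical', 'expression'];

// Hypothetical helper: strip surrounding whitespace from each token
// and drop any token that ends up empty.
function cleanTokens(list) {
  return list.map((t) => t.trim()).filter((t) => t.length > 0);
}

console.log(cleanTokens(tokens));
// [ 'Example', 'sentence', '(', 'with', 'parenthetical', 'expression' ]
```

This only hides the symptom; it says nothing about whether the original behavior is intentional.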