learning-nlp

The Viterbi algorithm in the Chapter 3 HMM

Open Miaotxy opened this issue 7 years ago • 2 comments

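In word segmentation, the final state of a sequence can only be S or E, so why does the source code still allow states such as M at the last position?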

Miaotxy, Apr 23 '19 14:04

(prob, state) = max((V[len(obs) - 1][y], y) for y in 'ES')

Shouldn't it be written like this instead?

https://github.com/ustcdane/annotated_jieba/blob/master/jieba/finalseg/init.py
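To make the difference concrete, here is a minimal sketch of a Viterbi decoder whose termination step can be restricted to E and S. It is not the book's `hmm.py` and not jieba's code: the function signature, the `end_states` parameter, and the toy probabilities in `START`, `TRANS`, and `EMIT` are all made up for illustration.

```python
STATES = "BMES"

# Hypothetical toy parameters (raw probabilities), chosen only for this example.
START = {"B": 0.5, "S": 0.5}
TRANS = {
    "B": {"M": 0.5, "E": 0.5},
    "M": {"M": 0.5, "E": 0.5},
    "E": {"B": 0.5, "S": 0.5},
    "S": {"B": 0.5, "S": 0.5},
}
EMIT = {
    "B": {"x": 0.9, "y": 0.1},
    "M": {"x": 0.5, "y": 0.5},
    "E": {"x": 0.1, "y": 0.9},
    "S": {"x": 0.5, "y": 0.5},
}


def viterbi(obs, start_p, trans_p, emit_p, end_states=STATES):
    """Return (prob, path) for the best state sequence ending in end_states."""
    # Initialisation: start probability times emission of the first symbol.
    V = [{y: start_p.get(y, 0.0) * emit_p[y].get(obs[0], 0.0) for y in STATES}]
    path = {y: [y] for y in STATES}

    # Recursion: extend the best path into each state y at time t.
    for t in range(1, len(obs)):
        V.append({})
        new_path = {}
        for y in STATES:
            prob, prev = max(
                (V[t - 1][y0] * trans_p[y0].get(y, 0.0) * emit_p[y].get(obs[t], 0.0), y0)
                for y0 in STATES
            )
            V[t][y] = prob
            new_path[y] = path[prev] + [y]
        path = new_path

    # Termination: only states in end_states may close the sequence.
    prob, state = max((V[len(obs) - 1][y], y) for y in end_states)
    return prob, path[state]


free_prob, free_path = viterbi("yx", START, TRANS, EMIT)
end_prob, end_path = viterbi("yx", START, TRANS, EMIT, end_states="ES")
print(free_path)  # ['S', 'B']
print(end_path)   # ['S', 'S']
```

With these toy numbers the unrestricted decoder ends in B, which leaves a word open at the end of the text, while restricting the last step to `'ES'` yields a sequence that closes every word.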

Miaotxy, Apr 23 '19 14:04

Segmenting a short sentence works fine, but as soon as the input is a whole paragraph, it fails:

Traceback (most recent call last):
  File "D:/code/python_test/test/start.py", line 11, in <module>
    print(str(list(res)))
  File "D:/code/python_test/test\hmm.py", line 150, in cut
    prob, pos_list = self.viterbi(text, self.state_list, self.Pi_dic, self.A_dic, self.B_dic)
  File "D:/code/python_test/test\hmm.py", line 134, in viterbi
    for y0 in states if V[t - 1][y0] > 0])
ValueError: max() arg is an empty sequence
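One plausible cause, judging only from the `if V[t - 1][y0] > 0` filter in the traceback: if the scores are raw probabilities, their product shrinks at every step and eventually underflows to `0.0` on long input, so the filter removes every candidate and `max()` receives an empty sequence. A character with zero emission probability in every state would have the same effect. The sketch below reproduces the underflow case with a hypothetical uniform emission probability `EMIT_P` and toy transitions, then shows the usual remedy of summing log probabilities. It tracks only the best score, not the path, and is not the book's code.

```python
import math

STATES = "BMES"
NEG_INF = float("-inf")

# Hypothetical toy model: every character has emission probability 1e-4.
START = {"B": 0.5, "S": 0.5}
TRANS = {
    "B": {"M": 0.5, "E": 0.5},
    "M": {"M": 0.5, "E": 0.5},
    "E": {"B": 0.5, "S": 0.5},
    "S": {"B": 0.5, "S": 0.5},
}
EMIT_P = 1e-4


def viterbi_raw(obs):
    """Best score with raw probabilities and the '> 0' filter from the traceback."""
    V = [{y: START.get(y, 0.0) * EMIT_P for y in STATES}]
    for t in range(1, len(obs)):
        V.append({})
        for y in STATES:
            # Once every V[t - 1][y0] has underflowed to 0.0 this list is empty.
            V[t][y] = max([V[t - 1][y0] * TRANS[y0].get(y, 0.0) * EMIT_P
                           for y0 in STATES if V[t - 1][y0] > 0])
    return max(V[-1].values())


def log(p):
    """Natural log that maps probability 0 to -inf instead of raising."""
    return math.log(p) if p > 0 else NEG_INF


def viterbi_log(obs):
    """Best score in log space: products become sums, so nothing underflows."""
    V = [{y: log(START.get(y, 0.0)) + log(EMIT_P) for y in STATES}]
    for t in range(1, len(obs)):
        V.append({y: max(V[t - 1][y0] + log(TRANS[y0].get(y, 0.0)) + log(EMIT_P)
                         for y0 in STATES)
                  for y in STATES})
    return max(V[-1].values())


text = "x" * 200  # stands in for a paragraph-length input
try:
    viterbi_raw(text)
    raw_failed = False
except ValueError:
    raw_failed = True

log_score = viterbi_log(text)
print(raw_failed, math.isfinite(log_score))
```

In log space an impossible transition is simply `-inf`, so no filter is needed and `max()` always has four candidates to compare.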

Lukasjame, Apr 24 '19 15:04