
Trouble opening info after loading the data from model

Open Alexis-benoist opened this issue 11 years ago • 4 comments

Hello,

I think the library is great, but I have a problem reading the model info. I have already trained the CRF and I want to open the model to inspect its features. I do:

import pycrfsuite

tagger = pycrfsuite.Tagger()
tagger.open('crf')  # 'crf' is the path to the trained model file
info = tagger.info()

And I got the following traceback:

Traceback (most recent call last):
  File "/Applications/PyCharm CE.app/helpers/pydev/pydevd.py", line 1733, in <module>
    debugger.run(setup['file'], None, None)
  File "/Applications/PyCharm CE.app/helpers/pydev/pydevd.py", line 1226, in run
    pydev_imports.execfile(file, globals, locals)  # execute the script
  File "/Users/alexisbenoist/Documents/python/papyrus/learn.py", line 197, in <module>
    learn()
  File "/Users/alexisbenoist/Documents/python/papyrus/learn.py", line 169, in learn
    crf.print_most_common_features()
  File "/Users/alexisbenoist/Documents/python/papyrus/extraction/extractor.py", line 219, in print_most_common_features
    for (attr, label), weight in self.most_common_features.most_common(nb):
  File "/Users/alexisbenoist/Documents/python/papyrus/extraction/extractor.py", line 215, in most_common_features
    info = tagger.info()
  File "pycrfsuite/_pycrfsuite.pyx", line 704, in pycrfsuite._pycrfsuite.Tagger.info (pycrfsuite/_pycrfsuite.cpp:8649)
  File "pycrfsuite/_pycrfsuite.pyx", line 706, in pycrfsuite._pycrfsuite.Tagger.info (pycrfsuite/_pycrfsuite.cpp:8590)
  File "/Users/alexisbenoist/Documents/python/papyrus/env/lib/python2.7/site-packages/pycrfsuite/_dumpparser.py", line 61, in feed
    getattr(self, 'parse_%s' % self.state)(line)
  File "/Users/alexisbenoist/Documents/python/papyrus/env/lib/python2.7/site-packages/pycrfsuite/_dumpparser.py", line 69, in parse_LABELS
    self.result.labels[m.group(2)] = m.group(1)
AttributeError: 'NoneType' object has no attribute 'group'

I'm on Python 2.7 on a Mac.

Do you have any idea what I'm doing wrong?

Any input is welcome. Thanks,

Alexis.

Alexis-benoist, Dec 02 '14 17:12

The .info() method is a hack: it parses the logging output of the crfsuite C++ library, and I suspect some label/observation values could break the parsing. A random guess: do you use empty labels or something like that?

kmike, Dec 02 '14 17:12

Yes, I do use empty labels. I will try to change that tomorrow!

Thanks for being so fast!

Alexis-benoist, Dec 02 '14 18:12

You may try changing .+ to .* here and here - it could fix an issue with empty labels. Pull requests (with tests) are welcome :)

kmike, Dec 02 '14 18:12
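
For readers following along, here is a minimal sketch of the failure mode suspected above. It is an illustration only: the two patterns, the `parse_label` helper, and the sample dump lines are assumptions made up for this example, not code copied from `pycrfsuite/_dumpparser.py`. The point is that when a regex does not match, `re.match` returns `None`, and calling `.group()` on that result raises the `AttributeError` seen in the traceback.

```python
import re

# Hypothetical patterns (assumed for illustration, not the library's actual source).
STRICT = re.compile(r"(\d+): (.+)")   # the label must contain at least one character
LENIENT = re.compile(r"(\d+): (.*)")  # the label may be empty


def parse_label(pattern, line):
    """Return (label, label_id), or None when the line does not match.

    Calling m.group(...) without the None check would raise
    AttributeError: 'NoneType' object has no attribute 'group'.
    """
    m = pattern.match(line)
    if m is None:
        return None
    return m.group(2), m.group(1)


# Assumed shapes of label lines in the dump output.
normal_line = "0: B-NP"
empty_label_line = "1: "   # empty label, trailing space kept
stripped_line = "1:"       # empty label, trailing whitespace stripped

print(parse_label(STRICT, normal_line))        # matches
print(parse_label(STRICT, empty_label_line))   # no match: .+ needs a character
print(parse_label(LENIENT, empty_label_line))  # matches with an empty label
print(parse_label(LENIENT, stripped_line))     # no match: the space after ':' is gone
```

Under these assumptions, relaxing `.+` to `.*` only helps if the space after the colon survives; if the line is stripped before matching, both patterns fail on an empty label. Whether the library strips the line that way is not confirmed in this thread.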

Removing the underscores did it! Thanks.

Changing the library didn't work out, though.

Alexis-benoist, Dec 03 '14 08:12