negate_sequence() function is buggy.
The problems that I see are in:
if any(neg in word for neg in ["not", "n't", "no"]):
    negation = not negation
1st problem: negate_sequence() negates the words that follow any word containing the substring "no", "not", or "n't".
For example,
negate_sequence("I know it's going to be nice today")
returns:
['i', 'know', 'i know', "not_it's", "know not_it's", "i know not_it's", 'not_nice', "not_it's not_nice", "know not_it's not_nice", 'not_today', 'not_nice not_today', "not_it's not_nice not_today"]
because "know" contains the substring "no". Matching on "n't" is not really a problem because it usually comes at the end of a word, but matching on "not" causes issues as well (for example, "nothing" and "another" both contain "not").
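To illustrate the first problem, here is a minimal sketch that compares the substring check from the snippet above with a whole-token check. The helper names substring_match and is_negation_token are hypothetical, not part of the project, and the token rule (exactly "no" or "not", or a word ending in "n't") is only one possible fix.

```python
NEGATION_SUBSTRINGS = ["not", "n't", "no"]  # same list as in the snippet above


def substring_match(word):
    """Current behaviour: fires if any marker appears anywhere in the word."""
    return any(neg in word for neg in NEGATION_SUBSTRINGS)


def is_negation_token(word):
    """Hypothetical stricter check: whole-word "no"/"not", or an "n't" suffix."""
    return word in ("no", "not") or word.endswith("n't")


for word in ["know", "another", "nothing", "not", "don't"]:
    print(word, substring_match(word), is_negation_token(word))
```

With this check, "know", "another", and "nothing" no longer toggle negation, while "not" and "don't" still do.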
2nd problem: You should be comparing with the .lower() version of the word.
For example,
negate_sequence("I DON'T like this movie")
will not negate anything because it only checks for "n't", not "N'T" or any other case variation. The same goes for a text like "No one with half a brain would watch this movie more than once", because "No" doesn't match "no".
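For the second problem, a minimal sketch of a case-insensitive check follows. It is a simplified, unigram-only stand-in and not the project's negate_sequence(): the function name negate_unigrams is hypothetical, it skips the bigram/trigram output, and it assumes the whole-token rule ("no", "not", or an "n't" suffix) in place of substring matching.

```python
def negate_unigrams(text):
    """Illustrative only: prefix words with "not_" after a negation word.

    The negation test runs on the lowercased token, so "DON'T" and "No"
    are treated the same as "don't" and "no".
    """
    negation = False
    result = []
    for word in text.split():
        token = word.lower()  # lowercase BEFORE testing for negation
        result.append("not_" + token if negation else token)
        # Assumed rule: whole-word "no"/"not", or a word ending in "n't".
        if token in ("no", "not") or token.endswith("n't"):
            negation = not negation
    return result


print(negate_unigrams("I DON'T like this movie"))
# ['i', "don't", 'not_like', 'not_this', 'not_movie']
```

The same function leaves "I know it's going to be nice today" without any negated words, since "know" is no longer treated as a negation.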
@Sm1th Can you submit a pull request for these problems? Thanks for the heads up on problem #2. For now I'll lowercase my input to the service.
@bfdill @vivekn Any chance this pull request will get merged?
@sm1th Any idea how to generate the pickle files: trained, countdata, and reduceddata.pickle?
Please help me run this code. :pray:
Kindly guide me.
@mitend I'm sorry I can't help you. Consider opening another issue.