How to turn whole sentence into singular?

Open binarykitchen opened this issue 11 years ago • 7 comments

Hello again

I'd like to turn all words of a sentence into singular.

For example, "my dog has lots of flees" should become [ 'my', 'dog', 'has', 'lots', 'of', 'flee' ].

Here is the code:

    var natural = require('natural');

    var sentence = 'my dog has lots of flees';
    var tokenizer = new natural.WordTokenizer();
    var words = tokenizer.tokenize(sentence);
    var nounInflector = new natural.NounInflector();

    // Singularize every token, whatever its part of speech.
    words = words.map(function (word) {
        return nounInflector.singularize(word);
    });

    console.log(words);

which outputs:

[ 'my', 'dog', 'ha', 'lots', 'of', 'flee' ]

Almost correct. But why ha? And not has?

binarykitchen, Mar 10 '14 11:03

PS: I encounter similar issues when I try to turn a whole sentence in past tense into present with the PresentVerbInflector.

binarykitchen, Mar 10 '14 11:03

I think the problem is that "has" is a verb and you are using the noun inflector. You could probably use wordnet to get the POS (we don't have our own tagger yet, see #117).

The inflectors work off of a set of rules in most cases, so if you give the noun inflector a verb it will likely treat it as a noun and apply the rules (getting "ha" instead of "has").
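
To make the point about rules concrete, here is a minimal sketch of a rule-based singularizer. It is an illustration only: `naiveSingularize` is a hypothetical one-rule stand-in, not natural's actual rule set, so its output for other words (such as "lots") differs from what the library printed above. The `NOT_NOUNS` list is likewise a made-up guard standing in for a real part-of-speech check.

```javascript
// Toy singularizer: a hypothetical stand-in for a rule-based inflector.
// It is NOT natural's rule set; its single rule strips a trailing "s".
function naiveSingularize(word) {
    return word.length > 1 && word.endsWith('s') ? word.slice(0, -1) : word;
}

// Hypothetical, hand-written list of words that are not plural nouns.
// A real solution would ask a part-of-speech source instead.
const NOT_NOUNS = new Set(['has', 'is', 'was', 'does']);

// Apply the rule only to words that are not on the guard list.
function singularizeNounsOnly(words) {
    return words.map(function (word) {
        return NOT_NOUNS.has(word) ? word : naiveSingularize(word);
    });
}

const words = ['my', 'dog', 'has', 'lots', 'of', 'flees'];

// The rule fires on the verb too, because it only looks at spelling.
console.log(words.map(function (word) { return naiveSingularize(word); }));
// → [ 'my', 'dog', 'ha', 'lot', 'of', 'flee' ]

// With the guard in place the verb is left alone.
console.log(singularizeNounsOnly(words));
// → [ 'my', 'dog', 'has', 'lot', 'of', 'flee' ]
```

The rule has no notion of what kind of word it is looking at, which is why the fix has to come from outside the inflector.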

kkoch986, Mar 17 '14 20:03

Thanks @kkoch986

Hmmm.... what are wordnet and POS?

Is there a function which tells me if the word is a noun or a verb?

binarykitchen, Mar 18 '14 06:03

Sorry, I could have been clearer. POS stands for part of speech, i.e. whether a word is a verb, noun, adjective, etc.

WordNet is a database of English words that contains a lot of useful information about them; see http://wordnet.princeton.edu/ and here for more on that.

Once you've configured natural to use wordnet (as per the second link above) you can get the parts of speech by doing something like this:

    var wordnet = new natural.WordNet();
    wordnet.lookup('has', function(results) {
        results.forEach(function(result) {
            console.log(result.pos);
        });
    });
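
As a rough sketch of how that lookup could answer "is this word a noun or a verb?", the helper below collects the distinct `pos` values reported for a word. It takes the lookup function as a parameter so it can run without any WordNet data installed. `partsOfSpeech`, `stubLookup` and `FAKE_ENTRIES` are hypothetical names with invented data, and the codes 'n' and 'v' are assumed to mean noun and verb.

```javascript
// Collect the distinct part-of-speech codes reported for a word.
// `lookup` is any function shaped like lookup(word, callback(results)),
// i.e. the same shape as wordnet.lookup in the snippet above.
function partsOfSpeech(lookup, word, callback) {
    lookup(word, function (results) {
        const seen = new Set();
        results.forEach(function (result) {
            seen.add(result.pos);
        });
        callback(Array.from(seen));
    });
}

// Hypothetical stand-in for a configured WordNet instance.
// The entries are invented for illustration, not real WordNet output.
const FAKE_ENTRIES = {
    dog: [{ pos: 'n' }, { pos: 'n' }, { pos: 'v' }],
    has: [{ pos: 'v' }]
};

function stubLookup(word, callback) {
    callback(FAKE_ENTRIES[word] || []);
}

partsOfSpeech(stubLookup, 'dog', function (tags) {
    console.log('dog:', tags);
});
partsOfSpeech(stubLookup, 'has', function (tags) {
    console.log('has:', tags);
});
```

With the real library, something like `wordnet.lookup.bind(wordnet)` could presumably be passed in place of `stubLookup`. A word can come back with several parts of speech, so the caller still has to decide what to do when both 'n' and 'v' are present.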

Let me know if that helps. I think a good POS tagger (something that takes a word and returns its part of speech) and a good sentence parser (something that takes a sentence and gives some information about which words make up which parts of the sentence structure) would be important additions to natural. Hopefully both will be coming soon.

-Ken

kkoch986, Mar 18 '14 13:03

Thanks @kkoch986 - I will give this a try but first, let me ask you a couple of questions:

  • It would be cool if the above lookup method also accepted an array of words, i.e. a whole sentence!
  • Also, it would be awesome if the results returned in the callback also came with the correct inflectors! (without the need to bloat up the code with if-verb-then-create-verb-inflector etc.)
  • Basically, my goal is to be able to translate a whole sentence into present, singular etc.
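
One way the "correct inflector per word" idea could look, without a chain of if-verb-then checks, is a table keyed by part of speech. This is only a sketch of the requested feature, not something natural provides: `NORMALIZERS`, `normalizeSentence` and the `{ word, pos }` token shape are hypothetical, the tags are written by hand rather than produced by a tagger, and both transformations are toy placeholders.

```javascript
// Map each part-of-speech code to the transformation that applies to it.
// Both functions are toy placeholders, not natural's inflectors.
const NORMALIZERS = new Map([
    // Nouns: crude singularization by stripping a trailing "s".
    ['n', function (word) { return word.endsWith('s') ? word.slice(0, -1) : word; }],
    // Verbs: left untouched in this sketch; a tense converter would go here.
    ['v', function (word) { return word; }]
]);

// `tagged` is an array of { word, pos } objects from a hypothetical tagger.
// Words whose tag has no entry in the table pass through unchanged.
function normalizeSentence(tagged) {
    return tagged.map(function (token) {
        const normalize = NORMALIZERS.get(token.pos);
        return normalize ? normalize(token.word) : token.word;
    });
}

// Hand-tagged input; 'other' is a made-up tag for everything else.
const tagged = [
    { word: 'my', pos: 'other' },
    { word: 'dog', pos: 'n' },
    { word: 'has', pos: 'v' },
    { word: 'lots', pos: 'n' },
    { word: 'of', pos: 'other' },
    { word: 'flees', pos: 'n' }
];

console.log(normalizeSentence(tagged));
// → [ 'my', 'dog', 'has', 'lot', 'of', 'flee' ]
```

Supporting another part of speech then means adding one entry to the table rather than another branch in the loop.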

I need all that for a mad experiment: interpreting English sentences into sign language ;)

binarykitchen, Mar 22 '14 01:03

@binarykitchen I love both the lookup feature enhancement and the "mad experiment". Keep me posted on your progress and let me know if I can help in some way. I'll try to get to working on that lookup feature, but it's not quite at the top of my list right now. I would be more than happy to merge it in if you arrive at the solution before I do.

-Ken

kkoch986, Mar 24 '14 14:03

@kkoch986 Thanks Ken! Go ahead, there is no rush. Whenever you have enhanced the code, I will continue with my mad experiment.

binarykitchen, Mar 24 '14 23:03