Step 2, "li" ending - should check if the ending is in R1

Open neilt1700 opened this issue 6 years ago • 0 comments

I ran the code on all 29418 stemming examples given at https://snowballstem.org/algorithms/english/stemmer.html (sample English vocabulary and its stemmed equivalent). It failed on only one :-) "freely" was stemmed to "free", but it should stem to "freeli".

In step 2 there should be a check that the ending is in R1 when the ending is "li" (as well as for all the other endings). So:

    if (self::hasEnding($word, 'li')) {
      if (strlen($word) > 4 && self::validLi(self::charAt(-3, $word))) {
        $word = self::removeEnding($word, 'li');
      }
    }

should be replaced with:

    if (self::inR1($word, "li") && self::hasEnding($word, 'li')) {
      if (strlen($word) > 4 && self::validLi(self::charAt(-3, $word))) {
        $word = self::removeEnding($word, 'li');
      }
    }
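To illustrate why the R1 check matters for this word, here is a minimal standalone sketch. The helpers `r1Start` and `endsInR1` are hypothetical names made up for this example and are not the library's own `inR1`. The sketch also simplifies Porter2: it treats every `y` as a vowel and ignores the special R1 handling for words beginning with "gener", "commun" and "arsen". In "freeli" (the form "freely" takes after step 1c), R1 is just the final "i", so the "li" ending starts before R1 and must not be removed.

```php
<?php
// Hypothetical helpers for illustration only; not part of the library.

// Return the offset where R1 begins: the position just after the first
// non-vowel that follows a vowel, or strlen($word) if there is none.
// Simplification: ignores Porter2's special prefixes (gener, commun, arsen)
// and the rule that marks some y's as consonants.
function r1Start(string $word): int
{
    if (preg_match('/[aeiouy][^aeiouy]/', $word, $m, PREG_OFFSET_CAPTURE)) {
        // $m[0][1] is the offset of the vowel; R1 starts two characters on.
        return $m[0][1] + 2;
    }
    return strlen($word);
}

// True when $word ends with $suffix and the whole suffix lies inside R1.
function endsInR1(string $word, string $suffix): bool
{
    $suffixStart = strlen($word) - strlen($suffix);
    return $suffixStart >= r1Start($word)
        && substr($word, $suffixStart) === $suffix;
}

var_dump(endsInR1('freeli', 'li'));   // bool(false): R1 is "i", so "li" is not in R1
var_dump(endsInR1('quickli', 'li'));  // bool(true):  R1 is "kli", so "li" is in R1
```

With a check like this in place, "freeli" is left alone in step 2, while a word such as "quickli" still loses its "li" ending.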

neilt1700 • Jan 13 '20 17:01