
What does score_partial_token of HotwordsScorer do?

Open bruno-hays opened this issue 3 years ago • 0 comments

I'm looking at modifying the base HotwordsScorer to boost short sentences instead of just individual words, but I don't understand what the score_partial_token function does. Its docstring seems to have been copy-pasted from the score function and does not help:

def score(self, text: str) -> float:
    """Get total hotword score for input text."""
    return self._weight * len(self._match_ptn.findall(text))

def score_partial_token(self, token: str) -> float:
    """Get total hotword score for input text."""
    if token in self:
        # find shortest unigram starting with the given partial token
        min_len = len(next(self._char_trie.iterkeys(token, shallow=True)))
        # scale score by length of unigram matched so far
        score = self._weight * len(token) / min_len
    else:
        score = 0.0
    return score
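
For reference, here is my current reading of score_partial_token as a standalone sketch. It is only a guess from the code above, not a confirmed description of the library: I assume the trie holds individual hotword unigrams (as the inline comment suggests), I replace the trie lookup with a plain list scan, and the names partial_hotword_score, hotwords and the weight of 10.0 are made up for illustration.

```python
from typing import Iterable


def partial_hotword_score(token: str, hotwords: Iterable[str], weight: float = 10.0) -> float:
    """Score a partial word by the fraction of the shortest matching hotword it covers."""
    # Hotwords the partial token could still grow into (stand-in for the trie lookup).
    candidates = [word for word in hotwords if word.startswith(token)]
    if not candidates:
        # The token is not a prefix of any hotword, so it gets no boost.
        return 0.0
    # Length of the shortest hotword that starts with the partial token.
    min_len = min(len(word) for word in candidates)
    # Scale the full weight by how much of that hotword has been typed so far.
    return weight * len(token) / min_len


# Hypothetical hotword list, chosen so that one hotword is a prefix of another.
hotwords = ["london", "londonderry"]
for partial in ["l", "lon", "london", "lox"]:
    print(partial, partial_hotword_score(partial, hotwords))
```

If this reading is right, the function gives an unfinished word a share of the hotword weight proportional to how much of the shortest matching hotword is already there, and the full weight once the word is complete. What I still don't see is how that would extend to a multi-word phrase.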

bruno-hays, Jun 06 '22 14:06