More flexibility in handling out-of-vocabulary words for lexical norms extractors

Open rbroc opened this issue 6 years ago • 0 comments

Probably low priority but for the record:

Dictionaries tend not to include inflected forms of verbs or nouns. For some lexical norms, it might make sense to assume that inflected and non-inflected forms have a similar score.

We might consider making the handling of words that are not found in the dictionary more flexible, e.g. by letting the user choose whether to preserve NAs, to look up a score for the lemmatised version of the word, or to match the word with the most orthographically similar dictionary entry (if we can think of a meaningful metric for that).
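
To make the three options concrete, here is a rough sketch of what a configurable lookup could look like. Everything in it is an assumption for illustration: `lookup_score`, the `strategy` names, `toy_lemmatize`, and the scores in `norms` are all made up, and this is not the existing extractor API. A real version would plug in a proper lemmatiser; for the orthographic match the sketch uses `difflib.get_close_matches` from the standard library, which is only one possible similarity metric.

```python
import difflib
from typing import Callable, Mapping, Optional


def lookup_score(
    word: str,
    norms: Mapping[str, float],
    strategy: str = "na",
    lemmatize: Optional[Callable[[str], str]] = None,
    cutoff: float = 0.8,
) -> Optional[float]:
    """Return the norm score for `word`, or None (NA) if no score is found.

    Hypothetical strategies for out-of-vocabulary words:
      "na"      - preserve the missing value
      "lemma"   - look up the lemmatised form instead
      "closest" - use the most orthographically similar dictionary entry
    """
    if word in norms:
        return norms[word]
    if strategy == "na":
        return None
    if strategy == "lemma":
        if lemmatize is None:
            raise ValueError("strategy='lemma' requires a lemmatize callable")
        return norms.get(lemmatize(word))
    if strategy == "closest":
        # difflib ranks candidates by SequenceMatcher ratio; entries
        # scoring below `cutoff` are ignored.
        matches = difflib.get_close_matches(word, list(norms), n=1, cutoff=cutoff)
        return norms[matches[0]] if matches else None
    raise ValueError(f"unknown strategy: {strategy!r}")


def toy_lemmatize(word: str) -> str:
    """Crude suffix stripper, a stand-in for a real lemmatiser."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word


# Made-up scores, for illustration only.
norms = {"walk": 3.2, "dog": 4.9}

print(lookup_score("walked", norms))                                   # None
print(lookup_score("walked", norms, "lemma", lemmatize=toy_lemmatize)) # 3.2
print(lookup_score("dogs", norms, "closest"))                          # 4.9
```

Keeping `"na"` as the default would leave current behaviour unchanged, and the `cutoff` parameter gives some control over how aggressive the orthographic fallback is.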

rbroc, Jan 29 '20 20:01