[MODULE] - Lexical diversity

Open LeonardPuettmannKern opened this issue 2 years ago • 0 comments

Please describe the module you would like to add to bricks

Lexical diversity is very easy to compute but a great indicator of the quality of a text. It can also be used for Cognition.

Do you already have an implementation?

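def lexical_diversity(text):
    words = text.split()  # naive whitespace tokenization; a plain string would otherwise be counted per character
    word_count = len(words)
    vocab_size = len(set(words))
    if vocab_size == 0:
        return 0.0  # empty text: avoid division by zero
    return word_count / vocab_size  # this is the diversity score (average uses per distinct word)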

Additional context

Found here: https://btw.informatik.uni-rostock.de/download/workshopband/C2-5.pdf

The actual implementation in the paper is not correct. The correct implementation and many more useful snippets can be found in the book "Natural Language Processing with Python".
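
For illustration, here is a minimal sketch of how the score behaves on a repetitive and a varied sentence. The ratio can be read in either direction, so both are shown. The function names `tokens_per_type` and `type_token_ratio`, the lowercasing, and the whitespace tokenization are assumptions made for this example and are not taken from the paper or the book.

```python
def tokens_per_type(text: str) -> float:
    """Average number of occurrences per distinct word (word_count / vocab_size)."""
    # Assumption: lowercase and split on whitespace; a real module may tokenize differently.
    words = text.lower().split()
    if not words:
        return 0.0  # empty text: avoid division by zero
    return len(words) / len(set(words))


def type_token_ratio(text: str) -> float:
    """Share of distinct words among all words (vocab_size / word_count)."""
    words = text.lower().split()
    if not words:
        return 0.0
    return len(set(words)) / len(words)


repetitive = "the cat and the cat and the cat"
varied = "a quick brown fox jumps over lazy dogs"

for sample in (repetitive, varied):
    print(sample, tokens_per_type(sample), type_token_ratio(sample))
```

With `tokens_per_type`, higher values mean more repetition; with `type_token_ratio`, higher values mean a richer vocabulary. Both depend on text length, so scores are probably best compared between texts of similar size.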

LeonardPuettmannKern, Nov 19 '23 12:11