[MODULE] - Lexical diversity
Please describe the module you would like to add to bricks
Lexical diversity is super easy to compute but a great indicator of the quality of a text. It can also be used for Cognition.
Do you already have an implementation?
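def lexical_diversity(text):
    # `text` must be a sequence of word tokens; for a raw string, len() would count characters
    word_count = len(text)
    vocab_size = len(set(text))
    return word_count / vocab_size  # diversity score: average occurrences per distinct word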
Additional context
Found here: https://btw.informatik.uni-rostock.de/download/workshopband/C2-5.pdf
The implementation given in the paper is not correct. The correct implementation, and many more useful snippets, can be found in the book "Natural Language Processing with Python".
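Since a bricks module would probably receive a raw string rather than a token list, here is a minimal sketch of how that could look. The regex tokenizer, the lowercasing, and the empty-input fallback are my own assumptions, not something taken from the paper or the book. Note that this variant returns distinct words divided by total words (the type-token ratio), which is the inverse of the snippet above, so a higher value means a more diverse text.

```python
import re


def lexical_diversity(text: str) -> float:
    """Return the share of distinct words among all words in `text`."""
    # Assumption: lowercase the text and treat runs of letters, digits and
    # apostrophes as words. Swap in a proper tokenizer if needed.
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    if not tokens:
        # Assumption: an empty text gets a score of 0.0 instead of raising.
        return 0.0
    return len(set(tokens)) / len(tokens)


# 5 distinct words out of 6 in total
print(lexical_diversity("the cat sat on the mat"))
```

Depending on which direction of the score is preferred, the ratio could simply be flipped to match the original snippet.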