Ch 8: applying preprocessor to DataFrame raises TypeError

Open DanyaLearning opened this issue 1 year ago • 1 comment

df['review'] = df['review'].apply(preprocessor)

This line raises a TypeError: expected string or bytes-like object, got 'float'
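The error can be reproduced without the dataset. The following is a minimal sketch, assuming the review column contains at least one missing value, which pandas stores as the float nan. That cause is an assumption; the traceback only says a float reached re.sub. The names sample_reviews and strip_tags are made up for illustration.

```python
import re

# Assumption: a missing review appears as float('nan'),
# which is how pandas represents empty cells in a text column.
sample_reviews = ["<p>Great movie :)</p>", float("nan")]

def strip_tags(text):
    # Same regex as the first step of the preprocessor, but without str()
    return re.sub('<[^>]*>', '', text)

failed_types = []
for review in sample_reviews:
    try:
        strip_tags(review)
    except TypeError:
        # re.sub accepts only str or bytes-like input
        failed_types.append(type(review).__name__)

print(failed_types)  # ['float']
```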

With a small modification I got the code to work: wrapping the input in str() ensures the text is really a string. I'm not sure this is the proper fix, but it works for me.

Here is the modified preprocessor, which runs without errors:

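import re

def preprocessor(text):
    # str() guards against non-string input (e.g. a float)
    text = re.sub('<[^>]*>', '', str(text))
    # Collect emoticons such as :) ;-) =D before punctuation is stripped
    emoticons = re.findall(r'(?::|;|=)(?:-)?(?:\)|\(|D|P)',
                           text)
    # Lowercase, collapse non-word characters, append emoticons without noses
    text = (re.sub(r'[\W]+', ' ', text.lower()) +
            ' '.join(emoticons).replace('-', ''))
    return text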
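One caveat with str(): a missing value is converted to the literal text 'nan', which then ends up in the review column as if it were a word. A possible alternative, sketched here under the assumption that the non-string values are missing entries, is to map anything that is not a string to an empty string. The name preprocessor_safe is hypothetical and not from the book.

```python
import re

def preprocessor_safe(text):
    # Hypothetical variant: treat non-string input (e.g. NaN) as an empty review
    if not isinstance(text, str):
        text = ''
    text = re.sub('<[^>]*>', '', text)
    emoticons = re.findall(r'(?::|;|=)(?:-)?(?:\)|\(|D|P)', text)
    return (re.sub(r'[\W]+', ' ', text.lower()) +
            ' '.join(emoticons).replace('-', ''))

print(repr(preprocessor_safe(float('nan'))))            # ''
print(preprocessor_safe('</a>This :) is :( a test :-)!'))  # this is a test :) :( :)
```

With pandas, a similar effect might be had by calling df['review'].fillna('') before apply, or by dropping the affected rows with dropna; which is appropriate depends on whether empty reviews should stay in the data.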

DanyaLearning, Jan 09 '25 14:01