machine-learning-book
Ch 8: TypeError when applying preprocessor to the DataFrame
df['review'] = df['review'].apply(preprocessor)
This line raises: TypeError: expected string or bytes-like object, got 'float'
With a small modification I got the code to work: wrapping the input in str() guarantees that re.sub receives a string. I'm not sure this is the proper way of doing it, but it works for me.
Here is the modified preprocessor, which runs without errors:
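import re

def preprocessor(text):
    # Strip HTML tags; str() coerces non-string input (e.g. a float) to text
    text = re.sub('<[^>]*>', '', str(text))  # here I use the str() function
    # Collect emoticons such as :) ;-( =D before punctuation is removed
    emoticons = re.findall(r'(?::|;|=)(?:-)?(?:\)|\(|D|P)',
                           text)
    # Lowercase, replace non-word characters with spaces, then append
    # the emoticons with their '-' noses removed
    text = (re.sub(r'[\W]+', ' ', text.lower()) +
            ' '.join(emoticons).replace('-', ''))
    return text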
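
A possible alternative, in case it helps anyone hitting the same error: my guess (an assumption, I have not confirmed it against the CSV) is that the float values are NaN entries from empty review cells, since pandas stores missing values as float NaN. With str(), such a cell becomes the literal text 'nan', which then ends up in the vocabulary. The sketch below instead returns an empty string for anything that is not a str. The `samples` list is made up for illustration and stands in for the 'review' column.

```python
import re

def preprocessor(text):
    # Assumption: a non-string value (e.g. float NaN from an empty cell)
    # should be treated as empty text rather than converted to 'nan'.
    if not isinstance(text, str):
        return ''
    # Strip HTML tags
    text = re.sub(r'<[^>]*>', '', text)
    # Collect emoticons before punctuation is removed
    emoticons = re.findall(r'(?::|;|=)(?:-)?(?:\)|\(|D|P)', text)
    # Lowercase, replace non-word characters with spaces, append emoticons
    text = (re.sub(r'[\W]+', ' ', text.lower()) +
            ' '.join(emoticons).replace('-', ''))
    return text

# Hypothetical stand-in for df['review']: one normal review, one missing value
samples = ['<a>This :) is a :( test :-)!', float('nan')]
cleaned = [preprocessor(s) for s in samples]
print(cleaned)
```

The function can be applied to the DataFrame exactly as before with df['review'].apply(preprocessor). Another option would be to drop the affected rows first with df.dropna(subset=['review']), which keeps the original preprocessor unchanged.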