"'float' object has no attribute encode" when trying to get 6 most frequent words for a cluster

Open lena2813 opened this issue 9 years ago • 1 comments

Hi Brandon, thank you so much for this tutorial. It is helping me a lot. I'd like to ask you about the following line of code: print(' %s' % frame.ix[terms[ind].split(' ')].values.tolist()[0][0].encode('utf-8', 'ignore'), end=',') When I run it, Python raises this error: "AttributeError: 'float' object has no attribute 'encode'."

I'm working with Python 2.7, by the way. My tokenized list of words looks like this: norms = [u'jamie', u'johnson', u'sword', u'middle']. The frame is built like this: dic = {'id': ids, 'norm': norms, 'cause': causes, 'cluster': clusters}; frame = pd.DataFrame(dic, index=[clusters], columns=['id', 'norm', 'cause'])

I tried this line: << frame.ix[terms[ind].split(' ')].values.tolist()[0][0], end='' >> (i.e. without the encoding part), but it gives me NaN for each of the 6 most frequent words. I also tried << frame.ix[terms[ind].split(' ')].values.tolist()[0][1] >>, converting the term to str with << frame.ix[str(terms[ind]).split(' ')].values >>, and << import sys; reload(sys); sys.setdefaultencoding("utf-8") >>. These were probably pointless things to do, since the value coming out of << frame.ix[terms[ind].split(' ')].values >> is a float. I don't understand this line. Do you know, by any chance, a good pandas tutorial that explains indexing and sorting on clusters, or how to deal with this "'float' object has no attribute 'encode'" situation?

Thank you so much for your reply! And have a great day.

Aug 10 '16 14:08 lena2813

Never mind! I figured it out! Have a great day and thanks so much for this awesome tutorial.

Aug 11 '16 03:08 lena2813
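
For later readers: the fix was not posted in this thread, so the following is only a guess at one likely cause, not the poster's actual solution. The frame in the question is indexed by cluster number, while the failing line looks rows up by term. A label that is missing from the index comes back as NaN, which is a float and has no encode method. The sketch below reproduces that with reindex, because .ix has been removed from recent pandas. The cluster values and the vocab_frame name are hypothetical; only norms is taken from the question.

```python
import pandas as pd

# `norms` is from the question; `clusters` is made-up stand-in data.
# One distinct cluster id per row keeps the index unique, because
# reindex refuses to work on an index with duplicate labels.
norms = [u'jamie', u'johnson', u'sword', u'middle']
clusters = [0, 1, 2, 3]

# Frame indexed by cluster number, similar to the one in the question.
frame = pd.DataFrame({'norm': norms}, index=clusters)

# 'sword' is not a label in an index of cluster numbers, so the
# lookup yields a row filled with NaN instead of a string.
missing = frame.reindex(['sword']).values.tolist()[0][0]
print(type(missing))  # a float, hence no .encode attribute

# Hypothetical alternative: index the frame by the terms themselves,
# so that looking up a term by label finds a real row.
vocab_frame = pd.DataFrame({'words': norms}, index=norms)
found = vocab_frame.loc[['sword']].values.tolist()[0][0]
print(found)
```

If this is the cause, the thing to check is which values the frame's index actually holds before looking anything up by label, for example with frame.index.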