How to resolve "'numpy.float64' object has no attribute 'encode'" in Python 3

Asked: 2017-07-04 05:53:37

Tags: python numpy tweepy sentiment-analysis

I am trying to run sentiment analysis on tweets about different car brands, using Python 3. When I run the code I get the following exception:

Traceback (most recent call last):
  File "C:\Users\Jeet Chatterjee\NLP\Maruti_Toyota_Marcedes_Brand_analysis.py", line 55, in <module>
    x = str(x.encode('utf-8','ignore'),errors ='ignore')
AttributeError: 'numpy.float64' object has no attribute 'encode'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\Jeet Chatterjee\NLP\Maruti_Toyota_Marcedes_Brand_analysis.py", line 62, in <module>
    tweets.set_value(idx,column,'')
  File "C:\Program Files (x86)\Python36-32\lib\site-packages\pandas\core\frame.py", line 1856, in set_value
    engine.set_value(series._values, index, value)
  File "pandas\_libs\index.pyx", line 116, in pandas._libs.index.IndexEngine.set_value (pandas\_libs\index.c:4690)
  File "pandas\_libs\index.pyx", line 130, in pandas._libs.index.IndexEngine.set_value (pandas\_libs\index.c:4578)
  File "pandas\_libs\src\util.pxd", line 101, in util.set_value_at (pandas\_libs\index.c:21043)
  File "pandas\_libs\src\util.pxd", line 93, in util.set_value_at_unsafe (pandas\_libs\index.c:20964)
ValueError: could not convert string to float:

I don't know how to handle encoding in Python 3. Here is my code:

from tweepy.streaming import StreamListener
from tweepy import OAuthHandler
from tweepy import Stream
from textblob import TextBlob
import json
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
#regular expression in python
import re

#data corpus
tweets_data_path = 'carData.txt'
tweets_data = []
tweets_file = open(tweets_data_path, "r")
for line in tweets_file:
    try:
        tweet = json.loads(line)
        tweets_data.append(tweet)
    except:
        continue
#creating panda dataset
tweets = pd.DataFrame()
index = 0
for num, line in enumerate(tweets_data):
    try:
        print (num,line['text'])

        tweets.loc[index,'text'] = line['text']
        index = index + 1
    except:
        print(num, "line not parsed")
        continue

def brand_in_tweet(brand, tweet):
    brand = brand.lower()
    tweet = tweet.lower()
    match = re.search(brand, tweet)
    if match:
        print ('Match Found')
        return brand
    else:
        print ('Match not found')
        return 'none'

for index, row in tweets.iterrows():
    temp = TextBlob(row['text'])
    tweets.loc[index,'sentscore'] = temp.sentiment.polarity

for column in tweets.columns:
    for idx in tweets[column].index:
        x = tweets.get_value(idx,column)
        try:
            x = str(x.encode('utf-8','ignore'),errors ='ignore')
            if type(x) == unicode:
                str(str(x),errors='ignore')
            else:
                df.set_value(idx,column,x)
        except Exception:
            print ('encoding error: {0} {1}'.format(idx,column))
            tweets.set_value(idx,column,'')
            continue
tweets.to_csv('tweets_export.csv')

if __name__=='__main__':
    brand_in_tweet()

I have posted the complete code. I have no clue what causes this error or how to fix it. Please help, and thanks in advance.

2 answers:

Answer 0 (score: 1)

The problem is in this line:

 x = str(x.encode('utf-8','ignore'),errors ='ignore')  

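x is a numpy.float64. The code tries to encode it as UTF-8 first and then convert it to a string. That is the wrong order, because only strings can be encoded. Convert it to a string first, then encode the string: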

 x = str(x).encode('utf-8','ignore')
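
One thing to keep in mind: in Python 3, str(x).encode(...) returns a bytes object, not a str. If the goal is clean text rather than bytes, a round trip through UTF-8 may be closer to what the question intended. The sketch below is only an illustration under that assumption; the helper name to_clean_text is hypothetical and is not part of the original code.

```python
import numpy as np

def to_clean_text(value):
    """Return `value` as a str that can be encoded as UTF-8 (hypothetical helper)."""
    if isinstance(value, bytes):
        # Already-encoded data: decode it instead of calling str() on it.
        return value.decode("utf-8", "ignore")
    # str() works for any object, including numpy.float64, which has no .encode().
    text = str(value)
    # Round-trip through UTF-8 to drop characters that cannot be encoded
    # (for example lone surrogates); the result is a str again, not bytes.
    return text.encode("utf-8", "ignore").decode("utf-8")

score = np.float64(0.5)
as_bytes = str(score).encode("utf-8", "ignore")  # the fix above: yields bytes
as_text = to_clean_text(score)                   # yields str
print(type(as_bytes).__name__, type(as_text).__name__)
```

The second exception in the traceback most likely comes from writing '' into the numeric sentscore column. Applying such a helper only to the text column, for example with tweets['text'] = tweets['text'].map(to_clean_text), would leave the float column untouched.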

Answer 1 (score: 0)

Problem:

TypeError: can only concatenate str (not "numpy.float64") to str

Solution:

For example: print("Accuracy" + str(accuracy_score(y_test, y_pred)))

Wrap the numeric value (int64 or float64) in str() before concatenating it with a string.
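
As a small illustration of the same idea, the sketch below builds the message in two ways. The value of accuracy is a made-up stand-in for the result of accuracy_score(y_test, y_pred), which is not computed here.

```python
import numpy as np

# Hypothetical stand-in for accuracy_score(y_test, y_pred).
accuracy = np.float64(0.9125)

# Option 1: convert explicitly with str() before concatenating.
with_str = "Accuracy: " + str(accuracy)

# Option 2: let an f-string do the conversion and control the precision.
with_fstring = f"Accuracy: {accuracy:.2f}"

print(with_str)
print(with_fstring)
```

An f-string avoids the concatenation altogether, so the type of the value no longer matters as long as it supports formatting.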