UnicodeDecodeError: 'ascii' - how to fix it

Time: 2019-06-29 19:15:16

Tags: python-3.x

I am getting this error:

---------------------------------------------------------------------------
UnicodeEncodeError                        Traceback (most recent call last)
<ipython-input-34-8cf38df798b5> in <module>()
      1 x =  df[['text']]
----> 2 x['subjectivity'] = df.text.apply(lambda x: TextBlob(str(unicode(df[['text']]))).sentiment.subjectivity)
      3 df.head()

/Users/keenek1/anaconda3/lib/python2.7/site-packages/pandas/core/series.pyc in apply(self, func, convert_dtype, args, **kwds)
   3589             else:
   3590                 values = self.astype(object).values
-> 3591                 mapped = lib.map_infer(values, f, convert=convert_dtype)
   3592 
   3593         if len(mapped) and isinstance(mapped[0], Series):

pandas/_libs/lib.pyx in pandas._libs.lib.map_infer()

<ipython-input-34-8cf38df798b5> in <lambda>(x)
      1 x =  df[['text']]
----> 2 x['subjectivity'] = df.text.apply(lambda x: TextBlob(str(unicode(df[['text']]))).sentiment.subjectivity)
      3 df.head()

UnicodeEncodeError: 'ascii' codec can't encode character u'\u2026' in position 551: ordinal not in range(128)
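
A note on the last line of the traceback: the `python2.7` path and the call to `unicode(...)` suggest the notebook kernel is Python 2, despite the python-3.x tag. In Python 2, `str()` applied to a `unicode` object encodes it with the ASCII codec, and U+2026 (the horizontal ellipsis) has no ASCII representation. The sketch below reproduces the same kind of failure in Python 3 by asking for ASCII explicitly. The sample string is made up for illustration and is not taken from Tweets.csv.

```python
# Python 3 sketch: U+2026 (horizontal ellipsis) cannot be encoded as ASCII.
text = "wait for it\u2026"  # made-up sample, not from the real data

error_message = None
try:
    text.encode("ascii")  # roughly what Python 2's str(unicode_obj) does implicitly
except UnicodeEncodeError as exc:
    error_message = str(exc)

# UTF-8 can represent the character, and the round trip is lossless.
utf8_bytes = text.encode("utf-8")
round_tripped = utf8_bytes.decode("utf-8")

print(error_message)
print(round_tripped == text)
```

In Python 3, `str` is already a Unicode type, so calling `str()` on text does not trigger this implicit ASCII step.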

I have tried to solve this by specifying different encodings, and even by writing the DataFrame back to CSV with UTF-8 encoding.

What should I do?

from textblob import TextBlob
import pandas as pd
path = 'Tweets.csv'
df = pd.read_csv(path, delimiter=',', header='infer')
# Write the data back out as UTF-8, then read it in again as UTF-8.
df.to_csv('tweets_encoded.csv', encoding='utf-8')
df = pd.read_csv('tweets_encoded.csv', delimiter=',', header='infer', encoding='utf-8')

I tried using chardet to detect the encoding:

import chardet
rawdata = open('Tweets.csv', "rb").read()  # read raw bytes for detection
chardet.detect(rawdata)
{'confidence': 0.5471323391929904,
 'encoding': 'Windows-1254',
 'language': 'Turkish'}
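
A confidence of about 0.55 is a fairly weak guess, so it may be worth checking which encodings can decode the bytes at all. The following is a minimal sketch using only the standard library. The function name `first_working_encoding`, the candidate list, and the sample byte strings are assumptions for illustration. Note that `latin-1` accepts any byte sequence, so a successful decode alone does not prove the encoding is right.

```python
def first_working_encoding(raw_bytes, candidates=("utf-8", "cp1254")):
    """Return the first candidate that decodes raw_bytes without error, else None."""
    for name in candidates:
        try:
            raw_bytes.decode(name)
        except UnicodeDecodeError:
            continue  # this candidate cannot decode the data, try the next one
        return name
    return None


# Made-up samples: valid UTF-8 text, and a byte string that is not valid UTF-8.
utf8_sample = "caf\u00e9 \u2026".encode("utf-8")
legacy_sample = b"caf\xe9"  # a lone 0xE9 byte is invalid in UTF-8

utf8_guess = first_working_encoding(utf8_sample)
legacy_guess = first_working_encoding(legacy_sample)
print(utf8_guess, legacy_guess)
```

With the real file, the bytes would come from `open('Tweets.csv', 'rb').read()`, and the resulting name could be tried as the `encoding` argument of `pd.read_csv`.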

The error occurs when I run:

x =  df[['text']]
x['subjectivity'] = df.text.apply(lambda x: TextBlob(str(df[['text']])).sentiment.subjectivity)
df.head()
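
One thing that stands out in the failing line: the lambda ignores its argument `x` and converts the whole `df[['text']]` frame on every call, which may be why the error position (551) is so far into the string. Below is a minimal sketch of a per-row version, assuming Python 3 and a `text` column. The names `add_subjectivity` and `scorer` are hypothetical, and `len` stands in for the TextBlob call so the block runs without TextBlob installed.

```python
import pandas as pd


def add_subjectivity(df, scorer):
    """Return a copy of the text column plus a score computed for each row."""
    out = df[["text"]].copy()  # copy to avoid writing into a view of df
    # Score each individual value, not the whole DataFrame.
    out["subjectivity"] = out["text"].apply(lambda value: scorer(str(value)))
    return out


# Made-up frame containing the character from the traceback (U+2026).
frame = pd.DataFrame({"text": ["ok\u2026", "hi"]})
demo = add_subjectivity(frame, len)  # len is a stand-in scorer for illustration
print(demo)
```

With TextBlob, the scorer would be something like `lambda s: TextBlob(s).sentiment.subjectivity`, as in the original line.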

0 answers:

No answers yet