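For a university project, I am trying to download tweets and analyse them. Unfortunately, whenever I run the code on a large number of tweets, I get the following error: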
File "C:\Users\Pasca\Anaconda3\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u039f' in position 54: character maps to <undefined>
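The message can be reproduced without Twitter at all. The following is a minimal sketch, assuming only that the text is being encoded with the Windows cp1252 code page (which the path in the traceback suggests). '\u039f' is the Greek capital letter omicron, and cp1252 has no mapping for it. The variable names are made up for illustration.

```python
# '\u039f' is GREEK CAPITAL LETTER OMICRON; cp1252 cannot represent it.
omicron = "\u039f"

try:
    omicron.encode("cp1252")
    error_message = None
except UnicodeEncodeError as exc:
    # Same 'charmap' codec error as in the traceback, just at position 0.
    error_message = str(exc)

# UTF-8 can represent every Unicode code point, so this succeeds.
utf8_bytes = omicron.encode("utf-8")

print(error_message)
print(utf8_bytes)
```

This would also explain why the error only shows up on large runs: it takes just one tweet (or one source field) containing a character outside cp1252 to trigger it.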
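I have tried all sorts of things, but I always get the same error message. This is my first time using Python, so I am not much of an expert yet.
Thanks a lot, everyone!
Here is the code I am using: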
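import re

import tweepy
from nltk.stem import PorterStemmer
from nltk.tokenize import word_tokenize
from textblob import TextBlob

# api, stop_words, csvFile, csvWriter, nt, neut, pt and alltweets are defined earlier (not shown).
ps = PorterStemmer()

for tweet in tweepy.Cursor(api.search, q="#thepurge-filter:retweets", count=1000,
                           lang="en",
                           since="2017-04-03").items(1000):
    # Strip URLs and @mentions, then lowercase.
    result = re.sub(r"http\S+", "", tweet.text)
    result = re.sub(r"@\S+", "", result)
    result = result.lower()
    # Tokenize, drop stop words and short tokens, stem the rest.
    word_tokens = word_tokenize(result)
    result = [ps.stem(word) for word in word_tokens if word not in stop_words and len(word) > 3]
    result = ' '.join(result)
    # Classify sentiment on the raw tweet text.
    senttweet = TextBlob(tweet.text)
    if senttweet.sentiment.polarity < 0:
        sentiment = "negative"
        nt.append(result)
    elif senttweet.sentiment.polarity == 0:
        sentiment = "neutral"
        neut.append(result)
    else:
        sentiment = "positive"
        pt.append(result)
    # Write each distinct cleaned tweet only once.
    if result not in alltweets:
        # Note: under Python 3, .encode() returns bytes, which csv writes out as b'...' text.
        csvWriter.writerow([tweet.id, tweet.created_at, tweet.source, tweet.favorite_count, tweet.retweet_count, result.encode('UTF-8',errors='ignore'), tweet.text.encode('UTF-8',errors='ignore'), sentiment])
        alltweets.append(result)
csvFile.close()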
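The traceback ends in cp1252.py, which suggests that the CSV file was opened with the Windows default encoding, so any field containing a character outside cp1252 fails at write time. The call that opens csvFile is not shown above, so the following is only a sketch under that assumption: it opens the file explicitly as UTF-8 and writes plain str values. The file name, the column names, and the sample rows are hypothetical stand-ins for the tweet data.

```python
import csv
import os
import tempfile

# Hypothetical rows standing in for tweet fields; the second contains a Greek omicron.
rows = [
    [1, "plain ascii text", "neutral"],
    [2, "\u039f non-cp1252 character", "positive"],
]

# Hypothetical output path; the original file name is not shown in the post.
path = os.path.join(tempfile.mkdtemp(), "tweets.csv")

# encoding="utf-8" overrides the platform default (cp1252 on many Windows setups);
# newline="" is what the csv module documentation recommends for file objects.
with open(path, "w", encoding="utf-8", newline="") as csv_file:
    csv_writer = csv.writer(csv_file)
    csv_writer.writerow(["id", "text", "sentiment"])
    csv_writer.writerows(rows)  # plain str values, no .encode() needed

# Read the file back to confirm the text survived the round trip.
with open(path, encoding="utf-8", newline="") as csv_file:
    read_back = list(csv.reader(csv_file))

print(read_back)
```

With the file opened this way, the .encode('UTF-8', errors='ignore') calls in the writerow line should no longer be needed, and using the with block also removes the need for a separate csvFile.close().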