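I am using tweepy and Python to collect tweets matching certain keywords, and then writing those status updates (tweets) to a CSV file. I don't consider myself a programmer, and I am really lost.
Here is the error: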
Traceback (most recent call last):
  File "./combined-tweepy.py", line 58, in <module>
    sapi.filter(track=[topics])
  File "/usr/local/lib/python2.7/dist-packages/tweepy/streaming.py", line 286, in filter
    encoded_track = [s.encode(encoding) for s in track]
AttributeError: 'tuple' object has no attribute 'encode'
Here is the script:
#!/usr/bin/python
import sys
import re
import tweepy
import codecs
import datetime

consumer_key = ""
consumer_secret = ""
access_key = ""
access_secret = ""

auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_key, access_secret)
api = tweepy.API(auth)

# Create a list of topics
with open('termList.txt', 'r') as f:
    topics = [line.strip() for line in f]

stamp = datetime.datetime.now().strftime('%Y-%m-%d-%H%M%S')
topicFile = open(stamp + '.csv', 'w+')
sapi = tweepy.streaming.Stream(auth, CustomStreamListener(topicFile))
sapi.filter(track=[topics])

class CustomStreamListener(tweepy.StreamListener):
    def __init__(self, output_file, api=None):
        super(CustomStreamListener, self).__init__()
        self.num_tweets = 0
        self.output_file = output_file

    def on_status(self, status):
        ### Writes one tweet per line in the CSV file
        cleaned = status.text.replace('\'','').replace('&','').replace('>','').replace(',','').replace("\n",'')
        self.num_tweets = self.num_tweets + 1
        if self.num_tweets < 500:
            self.output_file.write(status.user.location.encode("UTF-8") + ',' + cleaned.encode("UTF-8") + "\n")
            print ("capturing tweet from list")
            # print status.user.location
            return True
        else:
            return False
            sys.exit("terminating")

    def on_error(self, status_code):
        print >> sys.stderr, 'Encountered error with status code:', status_code
        return True # Don't kill the stream

    def on_timeout(self):
        print >> sys.stderr, 'Timeout...'
        return True # Don't kill the stream

f.close()
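Answer 0 (score: 1)
The Python documentation gives the definition of a tuple. It looks like one of the items in topics is a tuple.
I see a few other small mistakes. First, given the way your code is written, you should only call things after you have defined them. For example, these two lines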
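To make the error message concrete, here is a minimal sketch that does not use tweepy at all. It reuses the list comprehension shown in the traceback, which calls .encode on every element of track, so any element that is not a string fails. The helper names encode_track and find_non_strings are hypothetical, and the isinstance check assumes Python 3 (on Python 2.7 you would test against basestring instead).

```python
def encode_track(track, encoding="utf-8"):
    # Same comprehension as the line quoted in the traceback.
    return [s.encode(encoding) for s in track]


def find_non_strings(track):
    # Hypothetical helper: report the position and type name of every
    # element that is not a string, so the offending item is easy to spot.
    return [(i, type(item).__name__)
            for i, item in enumerate(track)
            if not isinstance(item, str)]


good = ["python", "tweepy"]
bad = ["python", ("tweepy", "twitter")]

print(encode_track(good))
print(find_non_strings(bad))

try:
    encode_track(bad)
except AttributeError as exc:
    print(exc)
```

Running find_non_strings on your own topics list before calling filter should show whether a tuple (or a nested list) has slipped in.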
sapi = tweepy.streaming.Stream(auth, CustomStreamListener(topicFile))
sapi.filter(track=[topics])
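should come after you have defined the class and all of its methods, that is, after the whole block that begins with
class CustomStreamListener(tweepy.StreamListener):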
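The ordering problem can be shown in isolation. This is only a sketch: Listener is a hypothetical stand-in for CustomStreamListener, and run_script is a hypothetical helper that executes a small script body and reports the NameError it raises, if any.

```python
def run_script(source):
    """Run a script body in a fresh namespace; return the NameError text or None."""
    try:
        exec(source, {})
    except NameError as exc:
        return str(exc)
    return None


# The class is used before Python has executed its definition.
used_too_early = """
listener = Listener()

class Listener(object):
    pass
"""

# The class is defined first, then used.
defined_first = """
class Listener(object):
    pass

listener = Listener()
"""

print(run_script(used_too_early))
print(run_script(defined_first))
```

Python executes a script from top to bottom, so a class statement has to run before any line that instantiates the class.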
Also, there is no need to wrap topics in square brackets
sapi.filter(track=[topics])
because it is already a list, according to this line
topics = [line.strip() for line in f]
Can you show us the contents of termList.txt?
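As a rough illustration of the bracket point, the sketch below loads terms from a file and compares the flat list with the wrapped one. The file contents are made up, since the real termList.txt has not been posted, and the helper load_topics is hypothetical. Skipping blank lines is an extra assumption that is not in the original script.

```python
import os
import tempfile


def load_topics(path):
    """Read one search term per line, skipping blank lines (an added assumption)."""
    with open(path, "r") as f:
        return [line.strip() for line in f if line.strip()]


# Stand-in for termList.txt; the real contents are unknown.
tmp_dir = tempfile.mkdtemp()
term_path = os.path.join(tmp_dir, "termList.txt")
with open(term_path, "w") as f:
    f.write("python\ntweepy\n\ndata science\n")

topics = load_topics(term_path)  # flat list of strings
nested = [topics]                # what track=[topics] passes: a list holding one list

print(topics)   # ['python', 'tweepy', 'data science']
print(nested)   # [['python', 'tweepy', 'data science']]
```

With the flat list, the call from the question would be written as sapi.filter(track=topics), so that each element handed to .encode is a string rather than a whole list.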