This is probably a really stupid mistake, but I just can't see what's wrong.
class listener(tweepy.streaming.StreamListener):
    def on_data(self, data):
        tweet = data.split(',"text":"')[1].split('","source')[0]
        screen_name = data.split(',"screen_name":"')[1].split('","location":')[0]
        print tweet
        print data
        return True

    def on_error(self, status):
        print status


def main():
    twitterStream = tweepy.Stream(auth, listener())
    twitterStream.userstream()


if __name__ == "__main__":
    main()
The error is:
Traceback (most recent call last):
  File "C:\Rex\702_EH\new 1.py", line 35, in <module>
    main()
  File "C:\Rex\702_EH\new 1.py", line 32, in main
    twitterStream.userstream()
  File "build\bdist.win32\egg\tweepy\streaming.py", line 394, in userstream
  File "build\bdist.win32\egg\tweepy\streaming.py", line 361, in _start
  File "build\bdist.win32\egg\tweepy\streaming.py", line 294, in _run
IndexError: list index out of range
Can anyone help me with this?
Answer 0 (score: 0)
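The tweets you receive are in JSON format. You should take advantage of that instead of trying to parse them as plain text: the attributes will be much easier to extract, and your code will be more readable.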
import json

import tweepy


class listener(tweepy.streaming.StreamListener):
    def on_data(self, data):
        # Decode the raw JSON string into a dict instead of splitting it.
        decoded = json.loads(data)
        tweet = decoded['text']
        screen_name = decoded['user']['screen_name']
        print tweet
        print data
        return True

    def on_error(self, status):
        print status


def main():
    # `auth` is the authentication handler set up elsewhere in your script.
    twitterStream = tweepy.Stream(auth, listener())
    twitterStream.userstream()


if __name__ == "__main__":
    main()
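One more thing to watch for: as far as I know, a user stream also delivers messages that are not tweets (for example, the friends list sent when the connection opens), and those have no text field. That is probably why the split in the question fails. Below is a minimal sketch of a defensive extraction step you could call from on_data. The helper name extract_tweet and the sample payloads are made up for illustration and are not part of Tweepy.

```python
import json


def extract_tweet(raw):
    """Return (screen_name, text) for a tweet payload, or None otherwise."""
    try:
        decoded = json.loads(raw)
    except ValueError:
        # Not valid JSON, for example a blank keep-alive line.
        return None
    if not isinstance(decoded, dict):
        return None
    # Only treat the message as a tweet if it has both fields we need.
    if 'text' not in decoded or 'user' not in decoded:
        return None
    return decoded['user']['screen_name'], decoded['text']


# Hypothetical sample payloads, heavily trimmed for illustration.
friends_message = '{"friends": [1, 2, 3]}'
tweet_message = '{"text": "hello", "user": {"screen_name": "example_user"}}'

print(extract_tweet(friends_message))
print(extract_tweet(tweet_message))
```

Inside on_data you would call extract_tweet(data), skip the message when it returns None, and still return True so the stream stays open.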
As a side note, I suggest switching to Python 3; handling Unicode in Python 2 can be a real nightmare.
Answer 1 (score: 0)
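The output of a Tweepy response is JSON. Since JSON is a standard format for exchanging data between applications, you should follow that standard and use Python's json library. So you need to load the Tweepy response like this: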
tweet = json.loads(data)
username = tweet['user']['screen_name']
language = tweet['user']['lang']
......
.....
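To illustrate why decoding is more robust than splitting the raw string, here is a small comparison sketch. The two payloads are hypothetical and heavily trimmed, and the helper names text_by_split and text_by_json are made up for this example. The split version mirrors the approach from the question.

```python
import json

# Hypothetical payloads: a tweet whose text contains escaped quotes,
# and a message that has no "text" field at all.
raw_tweet = '{"id":1,"text":"He said \\"hi\\"","source":"web"}'
raw_other = '{"friends":[1,2,3]}'


def text_by_split(raw):
    # String splitting, as in the question.
    return raw.split(',"text":"')[1].split('","source')[0]


def text_by_json(raw):
    # JSON decoding; .get returns None when the key is missing.
    return json.loads(raw).get('text')


split_text = text_by_split(raw_tweet)  # still contains the JSON escapes
json_text = text_by_json(raw_tweet)    # escapes are decoded properly
print(split_text)
print(json_text)

try:
    text_by_split(raw_other)
    split_failed = False
except IndexError:
    # The delimiter is absent, so split() returns one element and [1] fails.
    split_failed = True
print(split_failed)
print(text_by_json(raw_other))
```

The split version leaves the backslash escapes in the text and raises IndexError as soon as a message lacks the expected delimiter, while the decoded version handles both cases.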