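I've been using Tweepy to collect tweets from a given area through the streaming API. So far I've only been collecting each tweet's latitude/longitude, but I'd like to add more fields and I'm not sure exactly how. I'm using this code to get the lat/long values: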
import json, tweepy
from html.parser import HTMLParser

consumer_key = ""
consumer_secret = ""
access_token = ""
access_secret = ""
count = 0

class StdOutListener(tweepy.StreamListener):
    def on_data(self, data):
        global count
        # Each streamed message arrives as a raw JSON string
        decoded = json.loads(HTMLParser().unescape(data))
        # Only keep tweets that carry an exact point location
        if decoded.get('coordinates',None) is not None:
            coordinates = decoded.get('coordinates','').get('coordinates','')
            name = decoded.get('name','')
            with open("C:\\Users\\gchre\\Desktop\\Tweets.txt", "a") as text_file:
                print(decoded['coordinates'], file=text_file)
            print(decoded['coordinates'])
            count += 1
        return True

    def on_error(self, status):
        print(status)

l = StdOutListener()
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_secret)
stream = tweepy.Stream(auth, l)
while count < 1000000:
    # Bounding box: SW longitude, SW latitude, NE longitude, NE latitude
    stream.filter(locations=[-88.853859,41.220047,-86.953073,42.758134])
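I'd like this to also write the user's handle (@handle) and the time the tweet was created to the text file. I'm not sure whether I should do this inside the if decoded.get('coordinates',None) is not None: block.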
Answer 0 (score: 2)
For anyone interested, I figured it out. Inside the if decoded.get() block, I added the following:
user = decoded.get('user','').get('screen_name','')
date = decoded.get('created_at','')
Then I added the values to the print line:
print((decoded['coordinates'], user, date), file=text_file)
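For anyone who wants to try this without a live stream, here is a minimal sketch of the same extraction as a standalone function. The name extract_record and the sample payload are made up for illustration and are not part of Tweepy; the payload is trimmed to the three fields used above.

```python
import json


def extract_record(decoded):
    """Return (point, screen_name, created_at), or None if the tweet has no coordinates.

    extract_record is a hypothetical helper name, not a Tweepy function.
    """
    geo = decoded.get('coordinates')
    if geo is None:
        return None
    point = geo.get('coordinates')  # GeoJSON order: [longitude, latitude]
    # Fall back to an empty dict so a missing 'user' key does not raise
    user = (decoded.get('user') or {}).get('screen_name', '')
    date = decoded.get('created_at', '')
    return point, user, date


# Made-up payload containing only the fields read above
sample = json.loads('''{
  "created_at": "Wed Oct 10 20:19:24 +0000 2018",
  "user": {"screen_name": "example_user"},
  "coordinates": {"type": "Point", "coordinates": [-87.6298, 41.8781]}
}''')

record = extract_record(sample)
print(record)
```

Inside on_data, the returned tuple could be passed straight to print(..., file=text_file) in place of the tuple built by hand.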
Answer 1 (score: 1)
I think you need to read the documentation on Twitter Dev to understand the data structure of a tweet.
Thanks.
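As a rough illustration of that structure, the sketch below shows where the fields from this question sit in a tweet payload and how the created_at string might be turned into a datetime. The payload values are made up, and parse_created_at is a hypothetical helper. The format string assumes the timestamp layout shown in the sample and an English locale for the day and month names.

```python
from datetime import datetime

# Assumed layout of created_at, e.g. "Wed Oct 10 20:19:24 +0000 2018"
TWITTER_TIME_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def parse_created_at(value):
    """Parse a created_at string into a timezone-aware datetime (hypothetical helper)."""
    return datetime.strptime(value, TWITTER_TIME_FORMAT)


# Made-up payload showing only the fields discussed in this thread
tweet = {
    "created_at": "Wed Oct 10 20:19:24 +0000 2018",      # top-level string
    "user": {"screen_name": "example_user"},              # nested user object
    "coordinates": {"type": "Point",                      # null when the tweet has no exact location
                    "coordinates": [-87.6298, 41.8781]},  # [longitude, latitude]
}

created = parse_created_at(tweet["created_at"])
print(created.isoformat())
```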