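I am using Tweepy to stream the Twitter API and collect all tweets from a specific region. Later I analyze these tweets for specific content. This works well for extended tweets, regular tweets, and quote tweets. However, I noticed that I cannot analyze retweets. Looking further, I found that my file contains no tweet whose tweet['retweeted'] field is true.
I would like to know the following: a) Is it normal that the JSON file created by streaming the Twitter API with Tweepy contains no retweets? I know that when streaming, the retweet counter and the favorite counter are always 0, because that is the state of the tweet at the moment it is posted and delivered by the stream, and it is never updated afterwards. Is something similar going on with the retweets I seem to be missing?
If it is not normal to get no retweets when streaming the Twitter API, then b) am I streaming the Twitter API correctly, or am I doing something that excludes retweets from the file?
And c) am I correctly checking the JSON file for retweets with the code below? Perhaps I am accessing the retweet fields the wrong way in my code?
For question b), this is the code I use to stream the Twitter API: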
import tweepy
from credentials import *  # expected to provide ckey, csecret, atoken, asecret

auth = tweepy.OAuthHandler(ckey, csecret)
auth.set_access_token(atoken, asecret)

# Creating an interface to the RESTful API
api = tweepy.API(auth, wait_on_rate_limit=True)

# Listener that collects tweets
class save_results(tweepy.StreamListener):
    def on_data(self, data):
        # Append each raw JSON payload to a file
        with open('TweetFile.txt', 'a') as f:
            f.write(data)
        return True

    def on_error(self, status_code):
        # Disconnect if the limit on attempted connections has been exceeded
        if status_code == 420:
            return False
        else:
            return True

    def on_exception(self, exception):
        print(exception)
        return True

twitter_stream = tweepy.Stream(auth, save_results())
#twitter_stream.filter(track=['Trump'])
# Bounding box as [west_lon, south_lat, east_lon, north_lat]
USCan = [-165.5, 26.1, -52.7, 72.9]
# `async` is a reserved word from Python 3.7 on; Tweepy 3.7+ names this parameter is_async
twitter_stream.filter(locations=USCan, is_async=True)
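As a side note on the locations filter: as far as I understand the streaming API, a bounding box is given as longitude/latitude pairs with the south-west corner first. Below is a minimal sketch for sanity-checking a box against a known point. The helper name in_bounding_box and the sample coordinates are my own illustration and not part of Tweepy.

```python
def in_bounding_box(lon, lat, box):
    """Return True if (lon, lat) lies inside box = [west, south, east, north]."""
    west, south, east, north = box
    return west <= lon <= east and south <= lat <= north


# Same box as in the streaming code above
USCan = [-165.5, 26.1, -52.7, 72.9]

# Approximate coordinates, chosen only for illustration
print(in_bounding_box(-77.0, 38.9, USCan))   # roughly Washington, DC
print(in_bounding_box(2.35, 48.85, USCan))   # roughly Paris
```

This only checks the box itself. It says nothing about which tweets the API decides to match against it.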
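For question c), this is how I check the content of retweets. Note that the same approach works well for extended tweets, regular tweets, and quote tweets: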
import json

# keyword1 and keyword2 are lists of lowercase search terms defined elsewhere
total_tweets_US = 0
extended_retweets = 0
retweets = 0

with open('TweetFile.txt') as f:
    for line in f:
        try:
            tweet = json.loads(line)
        except json.decoder.JSONDecodeError:
            continue  # skip blank or malformed lines instead of reusing the previous tweet
        # Only consider tweets that carry text and a US place
        if 'text' not in tweet or not tweet.get('place'):
            continue
        if tweet['place'].get('country_code') != 'US':
            continue
        total_tweets_US += 1
        # A retweet carries the original tweet under 'retweeted_status'
        original = tweet.get('retweeted_status')
        if original is None:
            continue
        extended = original.get('extended_tweet')
        if extended and extended.get('full_text') is not None:
            # Extended retweet: the full text sits under 'extended_tweet'
            text = extended['full_text'].lower()
            if any(s in text for s in keyword1) and any(t in text for t in keyword2):
                extended_retweets += 1
                print("Of " + str(total_tweets_US) + " tweets, " + str(extended_retweets) + " were extended retweets")
        elif original.get('text') is not None:
            text = original['text'].lower()
            if any(s in text for s in keyword1) and any(t in text for t in keyword2):
                retweets += 1
                print("Of " + str(total_tweets_US) + " tweets, " + str(retweets) + " were retweets")
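To make the field access above easier to test in isolation, here is a minimal sketch of the same lookup as two small functions. It assumes the stream payload nests the original tweet under retweeted_status and, for long tweets, puts the full text under extended_tweet.full_text. The function names retweet_text and matches and the sample dictionaries are hypothetical, hand-written for illustration, and not real tweets.

```python
def retweet_text(tweet):
    """Return the original tweet's text if `tweet` is a retweet, else None."""
    original = tweet.get("retweeted_status")
    if not original:
        return None
    # Prefer the untruncated text when an extended_tweet object is present
    extended = original.get("extended_tweet") or {}
    return extended.get("full_text") or original.get("text")


def matches(text, keyword1, keyword2):
    """True if text contains at least one term from each keyword list."""
    text = text.lower()
    return any(s in text for s in keyword1) and any(t in text for t in keyword2)


# Hand-written sample payloads (hypothetical data)
short_rt = {"text": "RT @a: hello world",
            "retweeted_status": {"text": "hello world"}}
long_rt = {"text": "RT @a: a long message",
           "retweeted_status": {
               "text": "a long message",
               "extended_tweet": {"full_text": "a long message about Python"}}}
plain = {"text": "just a tweet"}

for sample in (short_rt, long_rt, plain):
    print(retweet_text(sample))
```

Using .get() at each level avoids a KeyError when a retweet has no extended_tweet object, which is the common case for short tweets.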
Also for question c), this is how I further verify whether my JSON file contains any retweets:
import json

total_tweets_US = 0

with open('TweetFile.txt') as f:
    for line in f:
        try:
            tweet = json.loads(line)
        except json.decoder.JSONDecodeError:
            continue  # skip blank or malformed lines
        if 'text' in tweet and tweet.get('place'):
            if tweet['place'].get('country_code') == 'US':
                total_tweets_US += 1
                if tweet['text'] is not None:
                    # 'retweeted' is a JSON boolean, so json.loads yields a Python bool,
                    # never the string "True"
                    if tweet.get('retweeted') is True:
                        print("Retweeted!")
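As far as I understand the v1.1 tweet payload, the retweeted field is a JSON boolean describing whether the authenticating user has retweeted the tweet, while a retweet itself is marked by a nested retweeted_status object. Below is a minimal sketch of a check based on that assumption. The function name count_retweets and the sample lines are hypothetical and only stand in for the contents of TweetFile.txt.

```python
import json


def count_retweets(lines):
    """Count JSON lines that contain a 'retweeted_status' object."""
    count = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # the stream can emit blank keep-alive lines
        try:
            tweet = json.loads(line)
        except json.JSONDecodeError:
            continue  # ignore malformed lines
        if "retweeted_status" in tweet:
            count += 1
    return count


# Hand-written sample lines (hypothetical data)
sample_lines = [
    '{"text": "RT @a: hi", "retweeted": false, "retweeted_status": {"text": "hi"}}',
    '{"text": "hello", "retweeted": false}',
    '',
    'not json',
]

print(count_retweets(sample_lines))

# The parsed flag is a bool, so comparing it with a string never matches
flag = json.loads(sample_lines[0])["retweeted"]
print(type(flag).__name__, flag == "True")
```

With a real file, the same function can be called as count_retweets(open('TweetFile.txt')), since a file object iterates line by line.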