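I am using Tweepy's Stream Listener to retrieve tweets about the current political debate in the UK. Unfortunately, for retweets and replies I only get truncated tweets.
For example:
RT @ZaidJilani: Chuck Schumer (sponsor of the anti-BDS bill) said we should strangle Gaza. Jeremy Corbyn said oppressing them will...
when the full tweet should be:
Chuck Schumer (sponsor of the anti-BDS bill) said we should strangle Gaza. Jeremy Corbyn said that oppressing them will only radicalize people.
I have seen that the regular Twitter API offers a way to get the extended text through `tweet_mode`, but I cannot find anything similar for the Streaming API. Does anyone have a solution? My code is below: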
from tweepy import Stream
from tweepy import OAuthHandler
from tweepy.streaming import StreamListener
from redis import Redis
from rq import Queue
import requests
import time
import io
import os
import json
import threading
import multiprocessing
from datetime import datetime, timedelta
import _credentials
# twitter OAuth
ckey = _credentials.ckey
consumer_secret = _credentials.consumer_secret
access_token_key = _credentials.access_token_key
access_token_secret = _credentials.access_token_secret
# Listener class override
class listener(StreamListener):

    def __init__(self, start_time, time_limit):
        self.time = start_time
        self.limit = time_limit
        self.tweet_data = []

    def on_data(self, data):
        localtime = datetime.now().strftime("%Y-%b-%d--%H-%M-%S")
        print(localtime)
        while (time.time() - self.time) < self.limit:
            try:
                self.tweet_data.append(data)
                return True
            except BaseException:
                print('failed ondata')
                time.sleep(5)
                pass

        saveFile = io.open(('raw_tweets_{}.json').format(localtime), 'w', encoding='utf-8')
        saveFile.write(u'[\n')
        saveFile.write(','.join(self.tweet_data))
        saveFile.write(u'\n]')
        saveFile.close()
        exit()

    def on_error(self, status):
        print(status)

    def on_disconnect(self, notice):
        print('bye')
#Beginning of the specific code
keyword_list = ['Theresa May', 'Jeremy Corbyn', 'GE2017', 'Labour', 'Tory','Tories'] #track list
start_time=time.time()
auth = OAuthHandler(ckey, consumer_secret) #OAuth object
auth.set_access_token(access_token_key, access_token_secret)
twitterStream = Stream(auth, listener(start_time, time_limit=10)) #initialize Stream object with a time out limit
twitterStream.filter(track=keyword_list, languages=['en']) #call the filter method to run the Stream Listener
Answer 0 (score: 3)
Update: it seems that support for tweet_mode='extended' has been added.
# When creating the stream:
self.stream = Stream(auth=auth, listener=self, tweet_mode='extended')
# When handling each incoming message (for example in on_data):
tweet_data = json.loads(data)
if "extended_tweet" in tweet_data:
    tweet = tweet_data['extended_tweet']['full_text']
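The check above leaves `tweet` unset when a message has no `extended_tweet` key. A minimal sketch of one way to wrap it, assuming `data` is the raw JSON string passed to `on_data`. The helper name `text_from_raw` is hypothetical, and falling back to `text` rests on the assumption that tweets without `extended_tweet` already carry their complete text there:

```python
import json


def text_from_raw(data):
    """Return the longest available text for one raw streaming message, or None."""
    tweet_data = json.loads(data)
    # Long tweets carry the untruncated text under extended_tweet.full_text.
    if "extended_tweet" in tweet_data:
        return tweet_data["extended_tweet"]["full_text"]
    # Assumption: without extended_tweet, the text field is already complete.
    # Messages that are not tweets (for example limit notices) have neither
    # key, so .get() returns None for them instead of raising KeyError.
    return tweet_data.get("text")
```

Returning `None` rather than raising lets the caller skip messages that are not tweets.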
P.S. Please excuse the formatting, typos and so on. I am new to Stack Overflow and just hope this helps others facing the same problem.
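Answer 1 (score: 1)
Some time has passed now, and I believe the full text is supported.
See this link:
https://developer.twitter.com/en/docs/tweets/tweet-updates
Compatibility mode is the default. My (possibly ugly) code shows how I handle it: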
if 'extended_tweet' in raw_tweepy_data_object:
    if 'full_text' in raw_tweepy_data_object['extended_tweet']:
        text = raw_tweepy_data_object['extended_tweet']['full_text']
    else:
        pass  # I need to figure out what is possible here
elif 'text' in raw_tweepy_data_object:
    text = raw_tweepy_data_object['text']
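The question is specifically about retweets. As far as I understand the v1.1 streaming payload, a retweet nests the original tweet under `retweeted_status`, while its own top-level `text` is the shortened `RT @user: ...` form, so checking only the top level may still give a truncated string. The following is a sketch under that assumption, not a confirmed fix. The function name `full_text_of` is hypothetical, and it returns the original tweet's text without the `RT @user:` prefix:

```python
def full_text_of(tweet):
    """Return the fullest text found in a parsed tweet dict, or None."""
    # For a retweet, look at the nested original tweet instead of the
    # shortened top-level text.
    original = tweet.get("retweeted_status")
    if original is not None:
        return full_text_of(original)
    # Prefer the extended form when it is present and has full_text.
    extended = tweet.get("extended_tweet", {})
    if "full_text" in extended:
        return extended["full_text"]
    # Otherwise fall back to the plain text field (None if it is absent).
    return tweet.get("text")
```

Using `.get()` throughout also covers the open `else` branch above: if `extended_tweet` exists without `full_text`, the function falls back to `text` rather than leaving the variable unassigned.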