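I am trying to collect some tweets using Tweepy, but after a few hundred requests the connection crashes with the following error: tweepy.error.TweepError: Failed to send request: ('Connection aborted.', error("(104, 'ECONNRESET')",))
My code looks like this: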
for status in tweepy.Cursor(api.search,
                            q="",
                            count=100,
                            include_entities=True,
                            monitor_rate_limit=True,
                            wait_on_rate_limit=True,
                            wait_on_rate_limit_notify=True,
                            retry_count=5,  # retry 5 times
                            retry_delay=5,  # seconds to wait for retry
                            geocode="34.0207489,-118.6926066,100mi",  # los angeles
                            until=until_date,
                            lang="en").items():
    try:
        towrite = json.dumps(status._json)
        output.write(towrite + "\n")
    except Exception as e:
        log.error(e)
    c += 1
    if c % 10000 == 0:  # 100 requests, sleep
        time.sleep(900)  # sleep 15 min
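I can catch the error with try/except, but I cannot restart the cursor from the point where it crashed. Does anyone know how to fix this error, or how to restart the cursor from its last known state?
Thanks!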
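Answer 0 (score: 1)
The Tweepy documentation says the limit is 180 requests per 15-minute window (user authentication), but apparently sleeping for too long hurts connection reliability after a number of requests. If you instead issue one request every 5 seconds, everything seems to work fine: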
for status in tweepy.Cursor(api.search,
                            q="",
                            count=100,
                            include_entities=True,
                            monitor_rate_limit=True,
                            wait_on_rate_limit=True,
                            wait_on_rate_limit_notify=True,
                            retry_count=5,  # retry 5 times
                            retry_delay=5,  # seconds to wait for retry
                            geocode="34.0207489,-118.6926066,100mi",  # los angeles
                            until=until_date,
                            lang="en").items():
    try:
        towrite = json.dumps(status._json)
        output.write(towrite + "\n")
    except Exception as e:
        log.error(e)
    c += 1
    if c % 100 == 0:  # first request completed, sleep 5 sec
        time.sleep(5)
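The 5-second figure matches the quoted limit: 900 seconds / 180 requests = 5 seconds per request. Below is a minimal sketch of that pacing as a reusable generator. It assumes each request returns page_size items (count=100 above). The name paced and its parameters are hypothetical and not part of Tweepy.

```python
import time


def paced(items, page_size=100, window_seconds=900, max_requests=180,
          sleep=time.sleep):
    """Yield items unchanged, pausing after every page_size items.

    The pause is window_seconds / max_requests, which is 900 / 180 = 5
    seconds with the limits quoted above. All names here are illustrative.
    """
    interval = window_seconds / max_requests
    for count, item in enumerate(items, start=1):
        yield item
        if count % page_size == 0:
            # Roughly one request's worth of items has been consumed.
            sleep(interval)
```

It could wrap the cursor directly, for example: for status in paced(tweepy.Cursor(api.search, q="test", count=100).items()). The sleep argument is injectable only so the pacing can be checked without actually waiting.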
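Answer 1 (score: 1)
It seems to me that the tweepy call should be inside the try block. Also, the parameters you pass to api.search are not in the Tweepy API (http://docs.tweepy.org/en/v3.5.0/api.html#help-methods). Anyway, this worked for me: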
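from time import sleep

import tweepy

# `api` is assumed to be an already authenticated tweepy.API instance
backoff_counter = 1
while True:
    try:
        for my_item in tweepy.Cursor(api.search, q="test").items():
            # do something with my_item
            pass
        break  # the cursor finished without errors
    except tweepy.TweepError as e:
        print(e.reason)
        sleep(60 * backoff_counter)  # wait longer after each failure
        backoff_counter += 1
        continue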
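Basically, when you get the error, you sleep for a while and then try again. I use an incremental backoff to make sure the sleep is long enough for the connection to be re-established.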
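Note that retrying this way starts the search over from the beginning. To pick up from the last known state, one option is to remember the oldest tweet id seen so far and pass it back as max_id on the retry. Here is a minimal, library-independent sketch of that idea combined with the incremental backoff. fetch_page is a hypothetical callable standing in for one search request (for instance, a wrapper around api.search with max_id=...), and ConnectionError stands in for whatever exception your client raises.

```python
import time


def collect_with_resume(fetch_page, max_failures=5, base_delay=60,
                        sleep=time.sleep):
    """Collect items page by page, resuming from the last seen id on errors.

    fetch_page(max_id) is a hypothetical callable returning a list of dicts
    with an "id" key, limited to ids <= max_id (or the newest items when
    max_id is None). An empty list means nothing older is left.
    """
    collected = []
    max_id = None   # last known position; None means start from the newest
    failures = 0    # consecutive failures, drives the incremental backoff
    while True:
        try:
            page = fetch_page(max_id)
        except ConnectionError:
            failures += 1
            if failures > max_failures:
                raise
            # Incremental backoff: wait longer after each consecutive failure.
            sleep(base_delay * failures)
            continue  # retry with the same max_id, so nothing is refetched
        failures = 0
        if not page:
            return collected
        collected.extend(page)
        # Resume point: everything strictly older than the oldest item seen.
        max_id = min(item["id"] for item in page) - 1
```

Because max_id only advances after a page has been fetched successfully, a failed request is retried at the same position instead of restarting the whole search.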