Tweepy error 104: Connection aborted

Time: 2016-09-21 14:24:19

Tags: twitter connection tweepy

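I'm trying to scrape some tweets with Tweepy, but after a few hundred requests the connection crashes with the following error: tweepy.error.TweepError: Failed to send request: ('Connection aborted.', error("(104, 'ECONNRESET')",))

My code looks like this: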

    for status in tweepy.Cursor(api.search,
                                q="",
                                count=100,
                                include_entities=True,
                                monitor_rate_limit=True,
                                wait_on_rate_limit=True,
                                wait_on_rate_limit_notify=True,
                                retry_count=5,  # retry 5 times
                                retry_delay=5,  # seconds to wait between retries
                                geocode="34.0207489,-118.6926066,100mi",  # Los Angeles
                                until=until_date,
                                lang="en").items():
        try:
            towrite = json.dumps(status._json)
            output.write(towrite + "\n")
        except Exception as e:
            log.error(e)
        c += 1
        if c % 10000 == 0:  # 10,000 tweets = 100 requests of 100 tweets each
            time.sleep(900)  # sleep 15 min

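I can catch the error with try/except, but I can't restart the cursor from the point where it crashed. Does anyone know how to fix this error, or how to restart the cursor from its last known state?

Thanks!

2 Answers:

Answer 0 (score: 1)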

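The Tweepy documentation says the limit is 180 requests per 15-minute window (user auth), but apparently sleeping for too long hurts connection reliability (after a number of requests). If you make one request every 5 seconds instead, everything seems to work fine: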

    for status in tweepy.Cursor(api.search,
                                q="",
                                count=100,
                                include_entities=True,
                                monitor_rate_limit=True,
                                wait_on_rate_limit=True,
                                wait_on_rate_limit_notify=True,
                                retry_count=5,  # retry 5 times
                                retry_delay=5,  # seconds to wait between retries
                                geocode="34.0207489,-118.6926066,100mi",  # Los Angeles
                                until=until_date,
                                lang="en").items():
        try:
            towrite = json.dumps(status._json)
            output.write(towrite + "\n")
        except Exception as e:
            log.error(e)
        c += 1
        if c % 100 == 0:  # one request (100 tweets) completed, sleep 5 sec
            time.sleep(5)
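
The 5-second figure matches the quoted limit: 15 minutes is 900 seconds, and 900 / 180 = 5 seconds per request. Here is a minimal sketch of that arithmetic plus a small pacing wrapper. The names `min_delay` and `paced` are made up for illustration and are not part of Tweepy; the 180-requests-per-window budget is simply the number quoted above.

```python
import time

WINDOW_SECONDS = 15 * 60   # length of the rate-limit window quoted above
REQUESTS_PER_WINDOW = 180  # request budget per window quoted above


def min_delay(window_seconds=WINDOW_SECONDS, requests=REQUESTS_PER_WINDOW):
    """Return the even spacing (in seconds) that exactly uses the budget."""
    return window_seconds / float(requests)


def paced(items, per_request=100, delay=None, sleep=time.sleep):
    """Yield items, pausing once after every `per_request` items.

    `per_request` should match the `count` passed to the search call, so
    that one pause corresponds to roughly one HTTP request.
    """
    if delay is None:
        delay = min_delay()
    for count, item in enumerate(items, start=1):
        yield item
        if count % per_request == 0:
            sleep(delay)
```

Assuming `api` is set up as in the question, the loop could then be written as `for status in paced(tweepy.Cursor(api.search, q="test", count=100).items()):`, which keeps the pacing logic out of the loop body.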

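Answer 1 (score: 1)

It seems to me that the tweepy call should be inside the try block. Also, the arguments you pass to api.search are not in the Tweepy API (http://docs.tweepy.org/en/v3.5.0/api.html#help-methods). Anyway, this worked for me: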

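    from time import sleep

    import tweepy

    # `api` is assumed to be an authenticated tweepy.API instance
    backoff_counter = 1
    while True:
        try:
            for my_item in tweepy.Cursor(api.search, q="test").items():
                pass  # do something with my_item
            break  # finished without errors, leave the retry loop
        except tweepy.TweepError as e:
            print(e.reason)
            sleep(60 * backoff_counter)  # wait longer after each failure
            backoff_counter += 1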

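Basically, when you get the error, you sleep for a while and then try again. I use an incremental backoff to make sure the sleep is long enough for the connection to be re-established.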
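
Note that the loop above starts the search over from the beginning after each failure. If you also want to pick up from the last known position, one option is to remember the lowest tweet id seen so far and pass it back on the next attempt. The sketch below shows that idea in a library-independent way: `collect_with_backoff` and `fetch_page` are hypothetical names, and `fetch_page(max_id)` is assumed to return a list of dicts with an integer `"id"`, newest first, with an empty list meaning no more results.

```python
import time


def collect_with_backoff(fetch_page, handle, max_retries=5, base_delay=60,
                         sleep=time.sleep, retry_on=(Exception,)):
    """Page backwards through results, resuming from the last seen id.

    fetch_page(max_id) is a hypothetical callable: it returns items with
    id <= max_id (or the newest page when max_id is None), newest first.
    """
    max_id = None  # last known position; kept across failures
    failures = 0   # consecutive failed attempts
    total = 0
    while True:
        try:
            page = fetch_page(max_id)
        except retry_on:
            failures += 1
            if failures > max_retries:
                raise  # give up after too many consecutive failures
            sleep(base_delay * failures)  # incremental backoff: 60s, 120s, ...
            continue
        failures = 0  # a successful request resets the backoff
        if not page:
            return total
        for item in page:
            handle(item)
            total += 1
        # The next page starts just below the oldest id handled so far.
        max_id = page[-1]["id"] - 1
```

With Tweepy, `fetch_page` could presumably wrap something like `api.search(q="test", count=100, max_id=max_id)` and convert each status with `status._json`, with `retry_on=(tweepy.TweepError,)`. That wiring is an assumption on my part and untested here.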