I am extracting the followers of a Twitter user, using tweepy with Python 3.x. It works fine for users with fewer than 200,000 followers; above that, I get a timeout error.
import logging
import time
import tweepy
from tweepy import OAuthHandler

consumer_key = '....'
consumer_secret = '....'
access_token = '....'
access_secret = '....'
# Twitter API connection
auth = OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_secret)
api = tweepy.API(auth, wait_on_rate_limit=True, wait_on_rate_limit_notify=True)
user_ids = []
try:
    for page in tweepy.Cursor(api.followers_ids, id=twitter_account, count=5000).pages():
        user_ids.extend(page)
except tweepy.RateLimitError:
    logging.info("RateLimitError...waiting 1000 seconds to continue")
    time.sleep(1000)
    for page in tweepy.Cursor(api.followers_ids, id=twitter_account, count=5000).pages():
        user_ids.extend(page)
following = []
Error:
urllib3.exceptions.ReadTimeoutError: HTTPSConnectionPool(host='api.twitter.com', port=443): Read timed out.
I tried setting the API timeout, but that did not help.
api = tweepy.API(auth, timeout=200000, wait_on_rate_limit=True, wait_on_rate_limit_notify=True)
With the option wait_on_rate_limit=True set, the following message is written to the log five or six times:
Rate limit reached. Sleeping for: 8xx
Answer 0 (score: 0)
What you are actually doing is:
The correct approach: since you can send 15 requests every 15 minutes, leave a 1-minute interval between requests:
for page in tweepy.Cursor(api.followers_ids, id=twitter_account, count=5000).pages():
    user_ids.extend(page)
    time.sleep(60)  # time.sleep takes seconds; 60 seconds = 1 minute
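
Pacing the requests does not by itself handle the read timeout from the question. A minimal sketch of one way to combine the two, assuming you fetch pages through your own wrapper: collect_ids and fetch_page are hypothetical names, not part of tweepy. fetch_page(cursor) is assumed to return (ids, next_cursor) and to raise TimeoutError when a request times out. The cursor starts at -1 and a next_cursor of 0 marks the last page, which mirrors Twitter's cursor convention. On a timeout the same cursor is retried, so pages already collected are not fetched again.

```python
import time


def collect_ids(fetch_page, delay=60, max_retries=3, sleep=time.sleep):
    """Collect IDs page by page, pausing between requests and retrying timeouts.

    fetch_page is a hypothetical callable: fetch_page(cursor) -> (ids, next_cursor).
    It is assumed to raise TimeoutError when a request times out.
    """
    ids = []
    cursor = -1  # -1 requests the first page
    while cursor != 0:  # 0 means there are no more pages
        for attempt in range(1, max_retries + 1):
            try:
                page, next_cursor = fetch_page(cursor)
                break
            except TimeoutError:
                if attempt == max_retries:
                    raise  # give up after max_retries failed attempts
                sleep(delay)  # wait, then retry the same cursor
        ids.extend(page)
        cursor = next_cursor
        if cursor != 0:
            sleep(delay)  # 1 request per minute stays within 15 per 15 minutes
    return ids
```

With tweepy, fetch_page could be a small function that calls api.followers_ids with an explicit cursor argument and converts whatever timeout exception your tweepy version raises into TimeoutError. That wrapper is not shown here, and its details depend on the tweepy version you have installed.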