How do I build an effective for loop to work around twitteR's rate limit?

Asked: 2016-08-06 13:04:55

Tags: r twitter rate-limiting

I am new to both twitteR and the concept of for loops. I came across the code below for fetching followers and their profiles.

The following code works fine, although I am not entirely sure whether I should set retryOnRateLimit to such a large value.

#This extracts all or most followers. 
followers<-getUser("twitter_handle_here")$getFollowerIDs(retryOnRateLimit=9999999)

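The following code is the for loop that fetches the profiles.

However, I think there should be a way to use length(followers) and getCurRateLimitInfo() to structure the loop better.

My question: if length(followers) = 40000 and the rate limit = 180, how do I structure the loop so that it sleeps for the appropriate amount of time and still retrieves all 40000 Twitter profiles?

Any help is much appreciated.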

#This is the for loop to sleep for 5 seconds.
#Problem with this is it simply sleeps for X seconds
for (follower in followers){
  Sys.sleep(5)
  followers_info<-lookupUsers(followers)
  followers_full<-twListToDF(followers_info)
  }

1 Answer:

Answer 0: (score: 1)

Here is some code I wrote for a similar purpose. First you need to define this function, stall_rate_limit:

stall_rate_limit <- function(limit) {

  # Store the record of all the rate limits into rate
  rate = getCurRateLimitInfo()
  message("Checking Rate Limit")

  if(any(as.numeric(rate[,3]) == 0)) {

    # Get the locations of API Calls that are used up
    index = which(as.numeric(rate[,3]) == 0)

    # get the time till when rates limits Reset
    wait = as.POSIXct(min(rate[index,4]),     ## Reset times in the 4th col
                      origin = "1970-01-01",  ## Origin of Unix Time
                      tz = "US/Mountain")     ## Replace with your Timezone

    message(paste("Waiting until", wait,"for Godot to reset rate limit"))
    # Tell the computer to sleep until the rates reset
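    # Clamp at 0: Sys.sleep() errors on a negative value if the reset time has already passed
    Sys.sleep(max(0, as.numeric(difftime(wait, Sys.time(), units = "secs"))))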

    # Set J = to 0
    J = 0
    # Return J as a counter
    return(J)

    } else {

    # Count was off, Try again
    J = limit - 1
    return(J)

    }
}
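
The waiting logic in stall_rate_limit can be tried offline, without calling the Twitter API. The sketch below is a hypothetical helper (seconds_until_reset is not part of twitteR) that computes how long to sleep from a rate-limit table. It assumes the table has columns named remaining and reset, which is how I understand the output of getCurRateLimitInfo() to be laid out; the sample table is made up for illustration.

```r
# Hypothetical helper, not part of twitteR: seconds to wait until the
# earliest exhausted resource resets, or 0 if nothing is exhausted.
seconds_until_reset <- function(rate, now = Sys.time()) {
  exhausted <- as.numeric(rate$remaining) == 0
  if (!any(exhausted)) {
    return(0)
  }
  reset <- min(as.POSIXct(rate$reset[exhausted]))
  # Never return a negative wait, since Sys.sleep() rejects negative values
  max(0, as.numeric(difftime(reset, now, units = "secs")))
}

# Made-up rate-limit table: one resource used up, one still available
now <- as.POSIXct("2016-08-06 13:00:00", tz = "UTC")
rate <- data.frame(
  resource  = c("/users/lookup", "/followers/ids"),
  limit     = c("180", "15"),
  remaining = c("0", "15"),
  reset     = c(now + 300, now + 900),
  stringsAsFactors = FALSE
)

wait_secs <- seconds_until_reset(rate, now)
```

With a live session, something like Sys.sleep(seconds_until_reset(getCurRateLimitInfo())) would play the same role as the sleep inside stall_rate_limit, provided the column-name assumption holds.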

Then you can run code like this:

callsMade = 0    ## This is your counter to count how many calls were made
limit = 180      ## the Limit of how many calls you can make
for(i in seq_along(followers)){

  # Check to see if you have exceeded your limit
  if(callsMade >= limit){

    # If you have exceeded your limit, wait and set calls made to 0 
    callsMade = stall_rate_limit(limit)

  }

  ### Execute your Code Here ... ###

  callsMade = callsMade + 1  # or however many calls you have made
}
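
For the numbers in the question (40000 followers, limit of 180), it may help to batch the IDs rather than make one call per follower. The sketch below is a minimal illustration, assuming each lookup request can carry up to 100 user IDs (my understanding of Twitter's users/lookup endpoint) and that each fetch call costs exactly one request. fetch_in_batches, fetch, and wait are hypothetical names introduced here; the placeholder functions stand in for the real API calls so the loop can be run offline.

```r
# Hypothetical helper: process ids in batches, pausing via wait() whenever
# `limit` calls have been made since the last pause.
fetch_in_batches <- function(ids, fetch, wait, batch_size = 100, limit = 180) {
  batches <- split(ids, ceiling(seq_along(ids) / batch_size))
  results <- vector("list", length(batches))
  calls_made <- 0
  for (i in seq_along(batches)) {
    if (calls_made >= limit) {
      wait()            # e.g. sleep until the rate limit resets
      calls_made <- 0
    }
    results[[i]] <- fetch(batches[[i]])
    calls_made <- calls_made + 1
  }
  results
}

# Offline stand-ins for the real calls (assumptions for illustration only)
followers <- as.character(seq_len(40000))   # placeholder follower IDs
waits <- 0
fake_wait  <- function() waits <<- waits + 1
fake_fetch <- function(ids) length(ids)

results   <- fetch_in_batches(followers, fake_fetch, fake_wait)
n_calls   <- length(results)        # 40000 / 100 = 400 requests
n_windows <- ceiling(n_calls / 180) # 3 rate-limit windows, so 2 pauses
```

With twitteR, one would presumably pass fetch = function(ids) twListToDF(lookupUsers(ids)) and wait = function() stall_rate_limit(180), then combine the pieces with do.call(rbind, results). Collecting results per batch also avoids overwriting the data frame on every iteration.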