How to get around "You have exceeded the daily download limit" when scraping data

Time: 2017-10-01 17:18:47

Tags: r web-scraping proxy

I am building a forecasting tool that works on a database of historical stock prices. I have a problem downloading all the historical prices from https://stooq.pl.

My R code works fine, but I don't know how to avoid the download limit (the problem appears after roughly 40 downloads, and I need about 450). The code is below:

stock<-c("06n", "08n", "11b", "1at", "4fm", "aal", "aat", "aba", "abc", "abe", "abm", "abs", "acg", "acp", "act", "adv", "ago", "agt", "ahl", "alc", "ali", "all", "alm", "alr", "amb", "amc", "aml", "ape", "apl", "apn", "apr", "apt", "arc", "arh", "arr","06n", "08n", "11b", "1at", "4fm", "aal", "aat", "aba", "abc", "abe", "abm", "abs", "acg", "acp", "act", "adv", "ago", "agt", "ahl", "alc", "ali", "all", "alm", "alr", "amb", "amc", "aml", "ape", "apl", "apn", "apr", "apt", "arc", "arh", "arr","06n", "08n", "11b", "1at", "4fm", "aal", "aat", "aba", "abc", "abe", "abm", "abs", "acg", "acp", "act", "adv", "ago", "agt", "ahl", "alc", "ali", "all", "alm", "alr", "amb", "amc", "aml", "ape", "apl", "apn", "apr", "apt", "arc", "arh", "arr") #example
Dane <- list()   # one data frame of daily prices per ticker
i <- 1
for (c in stock) {
  # daily history CSV for ticker c from stooq.pl
  Dane[[i]] <- read.csv(url(paste("https://stooq.pl/q/d/l/?s=", c, "&i=d", sep = "")))
  i <- i + 1
}

After about 40 downloads I get this: [1] Przekroczony.dzienny.limit.wywolan (you have exceeded the daily download limit). It is not a real error: the script keeps scraping, but the files it saves contain no data, only this message.

Is there a way to avoid this error? I don't know of a different website (and I'm not sure one exists) from which I could download the data I need.
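A minimal sketch of one possible workaround, assuming the limit message comes back as the lone column name of an otherwise empty data frame (as the output above suggests); the 10-second pause is an arbitrary guess, not a value documented by stooq:

Dane <- list()
for (c in stock) {
  dat <- read.csv(url(paste0("https://stooq.pl/q/d/l/?s=", c, "&i=d")))
  # stop as soon as stooq starts returning the limit message instead of prices
  if (any(grepl("Przekroczony", names(dat)))) {
    message("Daily limit reached at ticker ", c, "; stopping here.")
    break
  }
  Dane[[length(Dane) + 1]] <- dat
  Sys.sleep(10)  # pause between requests; the delay length is a guess
}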

1 answer:

Answer 0 (score: 0)

I couldn't get your link to work. Anyway, take a look at this:

http://investexcel.net/multiple-stock-quote-downloader-for-excel/


Obviously it's Excel rather than R, but it does the job well. Alternatively, you could try something like this:

codes <- c("MSFT", "SBUX", "S", "AAPL", "ADT")
urls  <- paste0("https://www.google.com/finance/historical?q=", codes, "&output=csv")
paths <- paste0(codes, ".csv")

# only download tickers whose CSV is not already in the working directory
missing <- !file.exists(paths)
missing

# download one file, with simple error handling in case the request fails
downloadFile <- function(url, path, ...) {
  # remove the file if it already exists
  if (file.exists(path)) file.remove(path)
  # download the file; on error, clean up any partial file and report which one failed
  tryCatch(
    download.file(url, path, ...),
    error = function(e) {
      if (file.exists(path)) file.remove(path)
      e$message <- paste(path, "failed")
      message(e$message)
    }
  )
}

# Map() is a wrapper around mapply(); it pairs each URL with its target path
Map(downloadFile, urls[missing], paths[missing])
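Assuming the downloads succeed, the saved CSVs can then be read back into a list of data frames, similar to the Dane list in the question; a minimal sketch:

# read every CSV that actually made it to disk into a named list of data frames
downloaded <- paths[file.exists(paths)]
Dane <- setNames(lapply(downloaded, read.csv), sub("\\.csv$", "", downloaded))
str(Dane, max.level = 1)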