RCurl::getURL exceeds the maximum number of clients

Date: 2015-06-17 09:56:34

Tags: r curl ftp rcurl

I am trying to iteratively list the files hosted on the FTP server of the MODIS Global Evapotranspiration Project (MOD16).

## required package
library(RCurl)

## ftp server
ch_ftp <- "ftp://ftp.ntsg.umt.edu/pub/MODIS/NTSG_Products/MOD16/MOD16A2.105_MERRAGMAO/"

## list and reformat available subfolders
ch_fls <- getURL(ch_ftp, verbose = TRUE, dirlistonly = TRUE)

ls_fls <- strsplit(ch_fls, "\n")
ch_fls <- unlist(ls_fls)

## list files in current folder
for (i in ch_fls) {

  ch_hdf <- paste0(ch_ftp, i)
  getURL(ch_hdf, verbose = TRUE, dirlistonly = TRUE)
}

After a few iterations, RCurl::getURL throws the following error message.

< 530 Sorry, the maximum number of clients (5) from your host are already connected.
* Access denied: 530
* Closing connection 16
 Error in function (type, msg, asError = TRUE)  : Access denied: 530 

Apparently, RCurl::getURL opens a new connection to the FTP server on each iteration without closing the previous ones quickly enough. After a few minutes, the server becomes accessible again, but re-running the script just throws the same error after the first few iterations. Is there a way to manually close the connections established by RCurl::getURL as soon as the file listing has been retrieved?
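One approach worth sketching here (not from the original thread, and only a minimal sketch based on RCurl's documented curl argument): create a single handle with getCurlHandle() and pass it to every getURL() call. libcurl then reuses one connection to the server instead of opening a fresh one per request, and removing the handle followed by garbage collection releases the connection when done.

## reuse one curl handle across all requests so only a single
## FTP connection is opened against the server
con <- getCurlHandle()

for (i in ch_fls) {
  ch_hdf <- paste0(ch_ftp, i)
  getURL(ch_hdf, curl = con, verbose = TRUE, dirlistonly = TRUE)
}

## drop the handle and trigger garbage collection so the
## underlying connection is closed
rm(con); gc()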

1 Answer:

Answer 0 (score: 2):

I was running into the same problem.

Adding Sys.sleep(2) after each request fixed it for me.

## list files in current folder
for (i in ch_fls) {

  ch_hdf <- paste0(ch_ftp, i)
  getURL(ch_hdf, verbose = TRUE, dirlistonly = TRUE)
  Sys.sleep(2)
}
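A fixed two-second pause may still fall short when the server stays busy. A slightly more defensive variant (my own sketch, not part of the original answer) wraps the call in base R's tryCatch and backs off longer whenever a request still fails:

## list files in current folder, retrying failed requests
for (i in ch_fls) {

  ch_hdf <- paste0(ch_ftp, i)

  repeat {
    res <- tryCatch(
      getURL(ch_hdf, verbose = TRUE, dirlistonly = TRUE),
      error = function(e) e
    )

    ## on success, leave the retry loop
    if (!inherits(res, "error")) break

    ## wait longer before retrying when the server refuses the connection
    Sys.sleep(10)
  }

  Sys.sleep(2)
}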