I need to download some CSV files from "http://www.elections.state.md.us".
Here is my code:
library(XML)      # getHTMLLinks()
library(stringr)  # str_detect(), str_c()
library(plyr)     # l_ply()

url <- "http://www.elections.state.md.us/elections/2012/election_data/index.html"
# recognize the links
links <- getHTMLLinks(url)
filenames <- links[str_detect(links, "_General.csv")]
filenames_list <- as.list(filenames)
filenames
# create a function
downloadcsv <- function(filename, baseurl, folder) {
  dir.create(folder, showWarnings = FALSE)
  fileurl <- str_c(baseurl, filename)
  # skip files that have already been downloaded
  if (!file.exists(str_c(folder, "/", filename))) {
    download.file(fileurl,
                  destfile = str_c(folder, "/", filename))
    # 1 sec delay between files
    Sys.sleep(1)
  }
}
l_ply(filenames_list, downloadcsv,
      baseurl = "www.elections.state.md.us/elections/2012/election_data/",
      folder = "elec12_maryland")
This produces the following error:
Error in download.file(fileurl, destfile = str_c(folder, "/", filename)) : scheme not supported in URL 'www.elections.state.md.us/elections/2012/election_data/State_Congressional_Districts_2012_General.csv'
However, when I paste that URL into IE, it does work. So what is wrong with my code?
Any ideas would be helpful, thanks.
Answer 0 (score: 2)
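It turns out that the URL passed to download.file must start with a scheme such as http://, https://, ftp://, or file://. A browser adds http:// on its own when the scheme is missing, which is why the same address worked in IE. So in the final l_ply call, I changed the baseurl to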
l_ply(filenames_list, downloadcsv,
      baseurl = "http://www.elections.state.md.us/elections/2012/election_data/",
      folder = "elec12_maryland")
and it works.
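If you want to guard against this kind of mistake, here is a minimal sketch of two options, using base R only. Note that ensure_scheme is a hypothetical helper name of my own, not something from the XML, stringr, or plyr packages, and defaulting to http:// is an assumption that may not suit every site.

```r
# Hypothetical helper (not part of the original code): prepend a default
# scheme when the URL does not already start with one such as http:// or ftp://.
ensure_scheme <- function(u, default = "http://") {
  has_scheme <- grepl("^[A-Za-z][A-Za-z0-9+.-]*://", u)
  ifelse(has_scheme, u, paste0(default, u))
}

# A URL without a scheme gets the default prepended;
# a URL that already has one is returned unchanged.
fixed_url <- ensure_scheme("www.elections.state.md.us/elections/2012/election_data/")
same_url  <- ensure_scheme("http://www.elections.state.md.us/elections/2012/election_data/")

# Alternative: derive the base URL from the index page URL that was
# already used for scraping, by dropping everything after the last "/".
url <- "http://www.elections.state.md.us/elections/2012/election_data/index.html"
baseurl <- sub("[^/]*$", "", url)
```

Deriving baseurl from url has the advantage that the address is typed only once, so the scraping step and the download step cannot drift apart. You could also call ensure_scheme(fileurl) inside downloadcsv before download.file if the base URL comes from user input.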