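I am trying to write a function and call it on every row of my dataset using apply. The dataset contains URLs of zip files; each file should be downloaded and unzipped, and after unzipping the TXT and zip files should be removed from the working directory.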
head(data)
data URL
1 /files/market_valuation/ru/2017/val170502170509.zip http://www.kase.kz/files/market_valuation/ru/2017/val170502170509.zip
2 /files/market_valuation/ru/2017/val170424170430.zip http://www.kase.kz/files/market_valuation/ru/2017/val170424170430.zip
3 /files/market_valuation/ru/2017/val170417170423.zip http://www.kase.kz/files/market_valuation/ru/2017/val170417170423.zip
4 /files/market_valuation/ru/2017/val170410170416.zip http://www.kase.kz/files/market_valuation/ru/2017/val170410170416.zip
5 /files/market_valuation/ru/2017/val170403170409.zip http://www.kase.kz/files/market_valuation/ru/2017/val170403170409.zip
6 /files/market_valuation/ru/2017/val170327170402.zip http://www.kase.kz/files/market_valuation/ru/2017/val170327170402.zip
My function:
Price_KASE <- function(data){
URL = data[,2]
dir = basename(URL)
download.file(URL, dir)
unzip(dir)
TXT <- list.files(pattern = "*.TXT")
zip <- list.files(pattern = "*.zip")
file.remove(TXT, zip)
}
apply(data, 1, Price_KASE(data))
Error message:
Error in download.file(URL, dir) :
'url' must be a length-one character vector
Please explain what is wrong with my code and how I can fix it. Thank you.
An alternative using a for loop:
for (i in 1:length(data[,2])){
URL = data[i, 2]
dir = basename(URL)
download.file(URL, dir)
unzip(dir)
TXT <- list.files(pattern = "*.TXT")
zip <- list.files(pattern = "*.zip")
file.remove(TXT, zip)
}
It seems to work, but after the 4th or 5th file I get:
In download.file(URL, dir) :
cannot open URL 'http://www.kase.kz/files/market_valuation/ru/2017/val170410170416.zip': HTTP status was '503 Service Temporarily Unavailable'
Answer 0 (score: 0)
I think the URLs in your data frame are stored as a factor variable. Try:
data[,2] <- as.character(data[,2])
If you read the data from a .csv file or build the data frame yourself, consider setting stringsAsFactors = FALSE.
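A minimal sketch of the factor issue, assuming a one-row data frame named urls (a hypothetical name, not from the question). Before R 4.0.0, data.frame() and read.csv() converted strings to factors by default, and download.file() rejects a factor because it is not a character vector. The example sets stringsAsFactors = TRUE explicitly to reproduce that old default.

```r
# Hypothetical one-row data frame; stringsAsFactors = TRUE reproduces
# the pre-4.0.0 default so the column is stored as a factor.
urls <- data.frame(
  URL = "http://www.kase.kz/files/market_valuation/ru/2017/val170502170509.zip",
  stringsAsFactors = TRUE
)

was_factor <- is.factor(urls$URL)      # the column starts out as a factor

# Convert the column in place so it holds plain character strings.
urls$URL <- as.character(urls$URL)

is_char_now <- is.character(urls$URL)  # now safe to pass to download.file()
print(c(was_factor = was_factor, is_char_now = is_char_now))
```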
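Update:
I noticed one more thing about using 1 as the margin in apply: it passes each row to the function as a vector, not as a data frame. So you also have to change your function. The only change is the line that assigns URL, which now uses data[2] instead of data[,2]. Note also that apply must be given the function itself (Price_KASE), not the result of calling it (Price_KASE(data)). The full code below runs on a one-row example and produces the output shown.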
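The row-as-vector behaviour can be checked without downloading anything. This is only an illustrative sketch: toy, show_row, and the example.com URLs are made-up names and placeholders, not part of the question.

```r
# Toy data frame with the same two-column layout as in the question.
toy <- data.frame(
  data = c("/a/one.zip", "/b/two.zip"),
  URL  = c("http://example.com/a/one.zip", "http://example.com/b/two.zip"),
  stringsAsFactors = FALSE
)

# With MARGIN = 1, apply() converts the data frame to a character matrix
# and passes each row as a character vector, so the URL is element 2
# of that vector (row[2]), not column 2 of a data frame (row[, 2]).
show_row <- function(row) basename(row[2])

# Pass the function itself; apply() calls it once per row.
names_out <- apply(toy, 1, show_row)
print(names_out)

# Writing apply(toy, 1, show_row(toy)) instead would evaluate show_row on
# the whole data frame up front, rather than handing apply() a function.
```

In the question, Price_KASE(data) was evaluated this way with the full data frame, so data[,2] held all six URLs at once, which is what download.file complained about.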
data1 <- data.frame(a = "/files/market_valuation/ru/2017/val170502170509.zip",
b = "http://www.kase.kz/files/market_valuation/ru/2017/val170502170509.zip")
Price_KASE <- function(data){
URL = data[2]  # changed from data[,2]: apply passes each row as a vector
dir = basename(URL)
download.file(URL, dir)
unzip(dir)
TXT <- list.files(pattern = "*.TXT")
zip <- list.files(pattern = "*.zip")
file.remove(TXT, zip)
}
data1$b <- as.character(data1$b)
apply(data1, 1, Price_KASE)
# [,1]
#[1,] TRUE
#[2,] TRUE
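As for the 503 error in the for loop version, one possible cause is that the server throttles rapid requests; that is an assumption, not something the error message confirms. Under that assumption, a rough sketch is to pause between downloads and retry a failed one. The helper names retry and fetch_one and the wait times are hypothetical. The sketch also deletes only the files each call created, rather than every TXT and zip file in the working directory.

```r
# Hypothetical helper: call f(), and if it signals an error, wait and try
# again, up to `tries` attempts. The last error is re-signalled on failure.
retry <- function(f, tries = 3, wait = 5) {
  for (attempt in seq_len(tries)) {
    result <- tryCatch(f(), error = function(e) e)
    if (!inherits(result, "error")) return(result)
    if (attempt < tries) Sys.sleep(wait)
  }
  stop(result)
}

# Download one zip, extract it, then remove only the files this call created.
fetch_one <- function(url, pause = 2) {
  zip_path <- basename(url)
  retry(function() download.file(url, zip_path, mode = "wb"))
  extracted <- unzip(zip_path)  # unzip() returns the extracted file paths
  # ... read or process the extracted files here, before cleaning up ...
  file.remove(c(extracted, zip_path))
  Sys.sleep(pause)              # short pause before the next request
  invisible(extracted)
}

# Usage (not run here, since it needs network access):
# results <- lapply(as.character(data$URL), fetch_one)
```

Using lapply over the URL column passes one URL per call, which sidesteps the row-indexing question entirely.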