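I have an Excel file that contains company names and downloadable links to their .pdf files. My goal is to create a directory for each company name in the Excel column and download the PDF files into the newly created directories.
Here is my code: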
##Set the working directory
txtsrc<-"C:\\FirstAid"
setwd(txtsrc)
##make a vector file names and links
pdflist <- read.xlsx("Final results_6thjuly.xlsx",1)
colnames(pdflist)
##Check if docs folder exists
if (dir.exists("FirstAid_docs")=="FALSE"){
dir.create("FirstAid_docs")
}
##Change the working directory
newfolder<-c("FirstAid_docs")
newpath<-file.path(txtsrc,newfolder)
setwd(newpath)
##Check the present working directory
getwd()
## Create directories and download files
for( i in 1:length(pdflist[,c("ci_CompanyName")])){
##First delete the existing directories
if(dir.exists(pdflist[,c("ci_CompanyName")][i])=="TRUE"){
unlink(pdflist[,c("ci_CompanyName")][i], recursive = TRUE)
}
##Create a new directory
directoryname<-pdflist[,c("ci_CompanyName")][i]
dir.create(directoryname,recursive = FALSE, mode = "0777")
##Get the downloadable links
##Link like :www.xyz.com thus need to add https to it
link<-pdflist[,c("DocLink")][i]
vallink<-c("https://")
##Need to remove quotes from link
newlink<-paste0(vallink,link)
newlink<-noquote(newlink)
##Set paths for the downloadble file
destfile<-file.path(txtsrc,newfolder,directoryname)
##Download the file
download.file(newlink,destfile,method="auto")
##Next record
i<-i+1
}
Here is the error/result I get:
> colnames(pdflist)
[1] "ci_CompanyID" "ci_CompanyName" "ProgramScore" "ID_DI" "DocLink"
> download.file(newlink,destfile,method="auto")
Error in download.file(newlink, destfile, method = "auto") :
cannot open destfile 'C:\Users\skrishnan\Desktop\HR needed\text analysis proj\pdf\FirstAid/FirstAid_docs/Buckeye Partners, LP', reason 'Permission denied'
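Why does this error occur even though I set the mode (chmod) when creating the directory? I am using the CRAN RGui (64-bit) with R version 3.5.0 on a 64-bit Windows machine. Any help would be appreciated.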
Answer 0 (score: 2)
The destfile argument of download.file must be a specific file, not just a directory. For example,
'C:\Users\skrishnan\Desktop\HR needed\text analysis proj\pdf\FirstAid\FirstAid_docs\Buckeye Partners, LP\myFile.pdf'
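As a minimal sketch of that point, the destination can be built by joining the company folder with a file name. The helper name make_destfile, the fallback name document.pdf, and the example URL path are hypothetical and are not taken from the original post:

```r
# Hypothetical helper (not from the original post): build a full file path
# for download.file() from a target directory and the download URL.
make_destfile <- function(dir, url, fallback = "document.pdf") {
  # Take the last path component of the URL, dropping any query string or fragment
  name <- basename(sub("[?#].*$", "", url))
  # Fall back to a fixed name when the URL does not end in a .pdf file name
  if (!grepl("\\.pdf$", name, ignore.case = TRUE)) {
    name <- fallback
  }
  file.path(dir, name)
}

# Illustrative values only; the URL path is made up for this example
destfile <- make_destfile("FirstAid_docs/Buckeye Partners, LP",
                          "https://www.xyz.com/reports/first_aid.pdf")
print(destfile)
```

The resulting path names a file inside the company folder, which is the form download.file expects for destfile.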
Answer 1 (score: 0)
The final working code:
##Set the working directory
txtsrc<-"C:\\FirstAid"
setwd(txtsrc)

##make a vector file names and links
pdflist <- read.xlsx("Final results_6thjuly.xlsx",1)
colnames(pdflist)

##Check if docs folder exists
if (dir.exists("FirstAid_docs")=="FALSE"){
  dir.create("FirstAid_docs")
}

##Change the working directory
newfolder<-c("FirstAid_docs")
newpath<-file.path(txtsrc,newfolder)
setwd(newpath)

##Check the present working directory
getwd()

## Create directories and download files
for( i in 1:length(pdflist[,c("ci_CompanyName")])){

  ##First delete the existing directories
  if(dir.exists(pdflist[,c("ci_CompanyName")][i])=="TRUE"){
    unlink(pdflist[,c("ci_CompanyName")][i], recursive = TRUE)
  }

  ##Create a new directory
  directoryname<-pdflist[,c("ci_CompanyName")][i]
  dir.create(directoryname,recursive = FALSE, mode = "0777")

  ##Get the downloadable links
  ##Link like :www.xyz.com thus need to add https to it
  link<-pdflist[,c("DocLink")][i]
  vallink<-c("https://")

  ##Need to remove quotes from link
  newlink<-paste0(vallink,link)
  newlink<-noquote(newlink)

  ##Set paths for the downloadble file
  neway<-file.path(newpath,directoryname)
  destfile<-paste(neway,"my.pdf",sep="/")

  ##Download the file
  download.file(newlink,destfile,method="auto")

  ##Next record
  i<-i+1
}
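The same loop could also be wrapped in a function, sketched below under a few assumptions. The function name download_company_pdfs and its downloader argument are hypothetical; the column names ci_CompanyName and DocLink and the file name my.pdf come from the code above. The sketch passes mode = "wb" because download.file documents binary mode as important on Windows, and it uses tryCatch so that one failed link does not stop the remaining downloads:

```r
# Sketch only: download one PDF per company into its own folder.
# `download_company_pdfs` and `downloader` are hypothetical names.
download_company_pdfs <- function(pdflist, root,
                                  downloader = utils::download.file) {
  dir.create(root, recursive = TRUE, showWarnings = FALSE)
  results <- character(nrow(pdflist))
  for (i in seq_len(nrow(pdflist))) {
    company <- as.character(pdflist$ci_CompanyName[i])
    target_dir <- file.path(root, company)
    # Start from an empty folder, as the original loop does
    unlink(target_dir, recursive = TRUE)
    dir.create(target_dir)
    # Links in the sheet look like www.xyz.com, so prepend the scheme
    url <- paste0("https://", as.character(pdflist$DocLink[i]))
    # destfile must name a file, not the folder itself
    destfile <- file.path(target_dir, "my.pdf")
    results[i] <- tryCatch({
      # mode = "wb" writes the PDF as binary, which matters on Windows
      downloader(url, destfile, mode = "wb")
      destfile
    }, error = function(e) {
      message("Failed for ", company, ": ", conditionMessage(e))
      NA_character_
    })
  }
  # Paths of the downloaded files, NA where a download failed
  results
}

# Possible usage with the data frame read from the Excel file above:
# saved <- download_company_pdfs(pdflist, file.path(txtsrc, "FirstAid_docs"))
```

The return value lists the saved paths, with NA marking any company whose download raised an error, so failures can be reviewed after the loop finishes.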