R - Web scraping and downloading multiple zip files, saving them without overwriting

Date: 2017-10-19 21:03:20

Tags: r web-scraping rvest

I'm trying to download multiple zip files from a web link. With this approach the downloaded files get overwritten, because the file names are the same across years -

library(rvest)

url <- "https://download.open.fda.gov/"
page <- read_html(url)

# pull the <key> entries for drug-event files and turn them into full URLs
zips <- grep("\\/drug-event", html_nodes(page, "key"), value = TRUE)
zips_i <- gsub(".*\\/drug\\/", "drug/", zips)
zips_ii <- gsub("</key>", "", zips_i)
zips_iii <- paste0(url, zips_ii)

# basename() is the same across years, so each download overwrites the last
lapply(zips_iii, function(x) download.file(x, basename(x)))

Is there a way to keep the downloaded files from overwriting each other?
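One common fix, sketched here as my own suggestion rather than anything from the post: build a unique local name from each URL's full path instead of `basename()`, since the path (including the year/quarter folder) is unique even when the trailing file name repeats. The example URL below is only illustrative of openFDA's key layout.

```r
# Sketch: flatten the URL path into a collision-free local file name.
unique_name <- function(url, base = "https://download.open.fda.gov/") {
  path <- sub(base, "", url, fixed = TRUE)  # strip the host prefix
  gsub("/", "_", path)                      # flatten the remaining path into one name
}

u <- "https://download.open.fda.gov/drug/event/2017q1/drug-event-0001-of-0035.json.zip"
unique_name(u)
# "drug_event_2017q1_drug-event-0001-of-0035.json.zip"
```

With this helper the original loop becomes `lapply(zips_iii, function(x) download.file(x, unique_name(x), mode = "wb"))`, and no two downloads share a destination.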

1 Answer:

Answer 0 (score: 1)

Here's what I've got so far -

#load the library
library(rvest)

#link to get the data from
url <- "https://download.open.fda.gov/"
page <- read_html(url)

#clean the URLs
zips <- grep("\\/drug-event", html_nodes(page, "key"), value = TRUE)
zips_i <- gsub(".*\\/drug\\/", "drug/", zips)
zips_ii <- gsub("</key>", "", zips_i)
zips_iii <- paste0(url, zips_ii)

#destination vector: a unique numeric id per file prevents overwriting
id <- 1:length(zips_iii)
destination <- paste0("~/Projects/Projects/fad_ade/", id, ".zip")

#download each file to its own destination; mode = "wb" keeps the zip binaries intact
mapply(function(x, y) download.file(x, y, mode = "wb"), x = zips_iii, y = destination)