I would like help creating a list of links so that I can download multiple documents.
I am trying to download data for the electoral districts of the Czech Republic, available at http://data.cuzk.cz/kontroly-dat-isui/00-volebni-okrsky/CSV-2014-10-01/. However, the tables are organized hierarchically (region - county - district) and there are roughly 2,000 of them, so downloading them by hand is impractical.
I have already figured out how to collect the links on each individual page in the hierarchy, but it would be ideal to find code that works across all the pages at a given level.
library(RCurl)
library(XML)

#"scrape" links for regions
url <- "http://data.cuzk.cz/kontroly-dat-isui/00-volebni-okrsky/CSV-2014-10-01/"
webpage <- getURL(url, encoding = "UTF-8")
PARSED  <- htmlParse(webpage)
regions <- xpathSApply(PARSED, "//a", xmlValue)
links   <- paste(url, regions, "/", sep = "")
#"scrape" links for the counties in the first region (but I need to download links also in all other regions)
url_county <- "http://data.cuzk.cz/kontroly-dat-isui/00-volebni-okrsky/CSV-2014-10-01/Jihocesky_kraj/"
webpage_county <- getURL(url_county, encoding = "UTF-8")
PARSED_county  <- htmlParse(webpage_county)
county <- xpathSApply(PARSED_county, "//a", xmlValue)
links_counties <- paste(url_county, county, "/", sep = "")
#and finally links for the districts in the county (but I need to download links also in all other counties in all other regions)
url_district <- "http://data.cuzk.cz/kontroly-dat-isui/00-volebni-okrsky/CSV-2014-10-01/Jihocesky_kraj/Ceske_Budejovice/"
webpage_district <- getURL(url_district, encoding = "UTF-8")
PARSED_district  <- htmlParse(webpage_district)
district <- xpathSApply(PARSED_district, "//a", xmlValue)
links_districts <- paste(url_district, district, sep = "")
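The three blocks above repeat the same getURL/htmlParse/xpathSApply pattern, so as a sketch it could be factored into one helper (the function name `get_links` is my own, and this assumes each page is a plain directory listing whose `<a>` text matches the sub-directory name; a real listing may also contain a parent-directory entry that would need filtering out):

```r
library(RCurl)
library(XML)

# Fetch a page and return the full URLs of every link on it.
# Caller appends "/" where a level continues downward.
get_links <- function(url) {
  page   <- getURL(url, encoding = "UTF-8")
  parsed <- htmlParse(page)
  names  <- xpathSApply(parsed, "//a", xmlValue)
  paste0(url, names)
}
```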
I tried using a loop, but it does not work:
links_counties <- character(0)
for (i in 1:length(links)) {
  webpage_county <- getURL(links[i], encoding = "UTF-8")
  PARSED_county  <- htmlParse(webpage_county)
  # accumulate with c(); plain assignment would overwrite
  # links_counties on every pass through the loop
  links_counties <- c(links_counties,
                      xpathSApply(PARSED_county, "//a", xmlValue))
}
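What I am ultimately after could be sketched as two nested loops (regions, then counties), each reusing the same scraping step; this is only a sketch under my assumptions above, namely that every `<a>` on a listing page names a sub-directory, and `scrape_names` is a name I made up:

```r
library(RCurl)
library(XML)

base <- "http://data.cuzk.cz/kontroly-dat-isui/00-volebni-okrsky/CSV-2014-10-01/"

# Return the link texts of a directory-listing page.
scrape_names <- function(url) {
  parsed <- htmlParse(getURL(url, encoding = "UTF-8"))
  xpathSApply(parsed, "//a", xmlValue)
}

# Walk region -> county, collecting the district-level URLs.
all_districts <- character(0)
for (region in scrape_names(base)) {
  region_url <- paste0(base, region, "/")
  for (county in scrape_names(region_url)) {
    county_url <- paste0(region_url, county, "/")
    districts  <- scrape_names(county_url)
    all_districts <- c(all_districts, paste0(county_url, districts))
  }
}
```

With a list like `all_districts`, each file could then be fetched in a final loop with `download.file()`.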
Does anyone have suggestions on how to solve this?