I am trying to scrape the following web page: http://www.qedu.org.br/escola/112633-militar-de-salvador/enem?edition=2009. Searching the internet, I found the following script:
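library(magrittr)  # provides the %>% pipe used below
library(plyr)      # provides ldply()

# Fetch the ENEM score summary for one edition (year) of the school's page.
fetch_attendance <- function(year) {
  url <- paste0("http://www.qedu.org.br/escola/112633-colegio-militar-de-salvador/enem?edition=", year)
  date <- url %>%
    httr::GET() %>%
    httr::content('text', encoding = 'utf-8') %>%
    xml2::read_html() %>%
    rvest::html_nodes(xpath = '//div[@class="span4 score-ct"]/p[1]') %>%
    rvest::html_text() %>%
    gsub("^\\s+|\\s+$|score", "", .)  # trim surrounding whitespace and drop the word "score"
  date
}

# Call the function once per edition and bind the results into one data frame.
res <- ldply(2009:2016, fetch_attendance, .progress = "text")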
However, the result is as follows:
> res <- ldply(2009:2016, fetch_attendance, .progress = "text")
  |================================================================| 100%
> res
   V1     V2     V3     V4     V5     V6
1 97% 635pts 605pts 595pts 659pts 735pts
2 97% 635pts 605pts 595pts 659pts 735pts
3 97% 635pts 605pts 595pts 659pts 735pts
4 97% 635pts 605pts 595pts 659pts 735pts
5 97% 635pts 605pts 595pts 659pts 735pts
6 97% 635pts 605pts 595pts 659pts 735pts
7 97% 635pts 605pts 595pts 659pts 735pts
8 97% 635pts 605pts 595pts 659pts 735pts
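In other words, it returns only the 2009 results, repeated for every year. Can anyone help me? I would also like to get the values for 2010 through 2016.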