I am trying to scrape the name of each search result from this page using the following code:
url2 <- "http://www.truckandtrailer.ca/search.cfm?intIndustryID=2&searchtype=advanced&pageaction=showresults&bitNew=0&intCategoryID=30&intMakeID=0&intSelectProvinceID=&x=26&y=6"
results <- url2 %>%
html() %>%
html_nodes(".desc_title") %>%
html_text()
results
But it only returns:
character(0)
Any ideas on how to fix this? Thanks for the help!
Answer 0 (score: 5)
Here is a solution using RSelenium and rvest.
Note: for more on using RSelenium together with rvest, see my answer here.
library(RSelenium)
library(rvest)
startServer()
remDr <- remoteDriver(browserName = 'firefox')
remDr$open()
url2 <- "http://www.truckandtrailer.ca/search.cfm?intIndustryID=2&searchtype=advanced&pageaction=showresults&bitNew=0&intCategoryID=30&intMakeID=0&intSelectProvinceID=&x=26&y=6"
remDr$navigate(url2)
test.html <- read_html(remDr$getPageSource()[[1]])  # read_html() replaces the deprecated html()
results<-test.html %>%
html_nodes(".desc_title") %>%
html_text(trim=TRUE)
results
[1] "2009 FREIGHTLINER FLD 132 CLASSIC XL HIGHWAY TR..." "2014 FREIGHTLINER CASCADIA HIGHWAY TRACTOR"
[3] "2014 KENWORTH W900-L HIGHWAY TRACTOR" "2014 KENWORTH T660 HIGHWAY TRACTOR"
[5] "2013 FREIGHTLINER CASCADA HIGHWAY TRACTOR" "2013 FREIGHTLINER CASCADA HIGHWAY TRACTOR"
[7] "(5) 2013 FREIGHTLINER CASCADIA - 113 HIGHWAY TR..." "(2) 2013 INTERNATIONAL PROSTAR HIGHWAY TRACTOR"
[9] "2013 KENWORTH T660 HIGHWAY TRACTOR" "2013 KENWORTH W900B HIGHWAY TRACTOR"
[11] "(2) 2013 KENWORTH T700 HIGHWAY TRACTOR" "2013 KENWORTH W900 HIGHWAY TRACTOR"
[13] "2013 KENWORTH T660 HIGHWAY TRACTOR" "2013 KENWORTH W900L HIGHWAY TRACTOR"
[15] "2013 KENWORTH W900L HIGHWAY TRACTOR" "2013 KENWORTH W900 HIGHWAY TRACTOR"
[17] "2013 PETERBILT 388 HIGHWAY TRACTOR" "(5) 2013 PETERBILT 388 HIGHWAY TRACTOR"
[19] "2013 PETERBILT 389 HIGHWAY TRACTOR" "2013 PETERBILT 388 HIGHWAY TRACTOR"
[21] "2013 VOLVO VNL670 HIGHWAY TRACTOR" "2013 VOLVO VNL630 HIGHWAY TRACTOR"
[23] "(5) 2012 FREIGHTLINER CASCADIA HIGHWAY TRACTOR" "2012 FREIGHTLINER CA125 HIGHWAY TRACTOR"
[25] "(2) 2012 FREIGHTLINER CASCADIA HIGHWAY TRACTOR"
remDr$close()
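The parsing step above does not depend on the browser itself: once the page source is available as a string, rvest only needs that string. The following is a minimal sketch of that step on a hand-written HTML snippet. The markup and the variable names `page_source`, `titles`, and `no_match` are assumptions made for illustration and are not taken from the real page. It also shows that a selector matching nothing yields `character(0)`, the same value the question reports.

```r
library(rvest)

# Hypothetical stand-in for remDr$getPageSource()[[1]]; the markup below
# is invented for illustration and is not the real page's structure.
page_source <- '<html><body>
  <div class="desc_title">  2014 KENWORTH T660 HIGHWAY TRACTOR </div>
  <div class="desc_title">2013 VOLVO VNL670 HIGHWAY TRACTOR</div>
  <div class="other">not a title</div>
</body></html>'

# Parse the string, select nodes by CSS class, and extract trimmed text.
titles <- read_html(page_source) %>%
  html_nodes(".desc_title") %>%
  html_text(trim = TRUE)

# A selector with no matching nodes produces an empty character vector.
no_match <- read_html(page_source) %>%
  html_nodes(".no_such_class") %>%
  html_text()

print(titles)
print(no_match)
```

Checking `length(titles) > 0` before using the result is a cheap way to notice when the selector found nothing in the source that was actually received.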
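Another approach is to use PhantomJS (no need to use cmd, and no additional browser is required). The only requirement is to download the exe file from here and place it in the R working directory (you can also specify its path if you do not want to put it in the working directory).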
library(RSelenium)
library(rvest)
pJS <- phantom(extras = c('--ssl-protocol=tlsv1'))
remDr <- remoteDriver(browserName = "phantom")
remDr$open()
remDr$navigate("http://www.truckandtrailer.ca/search.cfm?intIndustryID=2&searchtype=advanced&pageaction=showresults&bitNew=0&intCategoryID=30&intMakeID=0&intSelectProvinceID=&x=26&y=6")
test.html <- read_html(remDr$getPageSource()[[1]])  # read_html() replaces the deprecated html()
results<-test.html %>%
html_nodes(".desc_title") %>%
html_text(trim=TRUE)
results
[1] "2009 FREIGHTLINER FLD 132 CLASSIC XL HIGHWAY TR..." "2014 FREIGHTLINER CASCADIA HIGHWAY TRACTOR"
[3] "2014 KENWORTH W900-L HIGHWAY TRACTOR" "2014 KENWORTH T660 HIGHWAY TRACTOR"
[5] "2013 FREIGHTLINER CASCADA HIGHWAY TRACTOR" "2013 FREIGHTLINER CASCADA HIGHWAY TRACTOR"
[7] "(5) 2013 FREIGHTLINER CASCADIA - 113 HIGHWAY TR..." "(2) 2013 INTERNATIONAL PROSTAR HIGHWAY TRACTOR"
[9] "2013 KENWORTH T660 HIGHWAY TRACTOR" "2013 KENWORTH W900B HIGHWAY TRACTOR"
[11] "(2) 2013 KENWORTH T700 HIGHWAY TRACTOR" "2013 KENWORTH W900 HIGHWAY TRACTOR"
[13] "2013 KENWORTH T660 HIGHWAY TRACTOR" "2013 KENWORTH W900L HIGHWAY TRACTOR"
[15] "2013 KENWORTH W900L HIGHWAY TRACTOR" "2013 KENWORTH W900 HIGHWAY TRACTOR"
[17] "2013 PETERBILT 388 HIGHWAY TRACTOR" "(5) 2013 PETERBILT 388 HIGHWAY TRACTOR"
[19] "2013 PETERBILT 389 HIGHWAY TRACTOR" "2013 PETERBILT 388 HIGHWAY TRACTOR"
[21] "2013 VOLVO VNL670 HIGHWAY TRACTOR" "2013 VOLVO VNL630 HIGHWAY TRACTOR"
remDr$close()
pJS$stop()
P.S. For more details, see the help files.