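I can't seem to find the right CSS selector to use with RSelenium to return any data. The site is: https://www.rbcroyalbank.com/investments/gic-rates.html
The data I need are the non-redeemable GIC rates with interest paid annually (second column): 1, 2, 3, 4, 5, 7, 10
Some failed attempts: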
library("RSelenium")
library("rvest")
library("httr")
library("tidyverse")
remDr$navigate("https://www.rbcroyalbank.com/investments/gic-rates.html")
webElem <- remDr$findElement(using = "css selector", value = "tr:nth-child(7) .text-center:nth-child(2) div")
# OR
pg <- remDr$getPageSource()[[1]]
df <- tibble(Rates = pg %>%
  read_html() %>%
  html_nodes(xpath = '//tr[(((count(preceding-sibling::*) + 1) = 6) and parent::*)]//*[contains(concat( " ", @class, " " ), concat( " ", "text-center", " " )) and (((count(preceding-sibling::*) + 1) = 2) and parent::*)]//div') %>%
  html_text())
Answer 0 (score: 1)
Here is one possible solution.
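# Library used to scrape the information; version 1.7.7 (mandatory)
library(RSelenium)
# Start a Selenium server and a Firefox session on port 4567.
driver <- rsDriver(browser = "firefox", port = 4567L)
# Defines the client part.
remote_driver <- driver[["client"]]
remote_driver$navigate("https://www.rbcroyalbank.com/investments/gic-rates.html")
# Click the element with id "gic-nrg" so that its table is displayed.
remote_driver$findElement(using = "css selector", value = "#gic-nrg")$clickElement()
x <- remote_driver$findElement(using = "css selector", value = "#guaranteed-return-1 > div:nth-child(1) > table:nth-child(1)")
# Put every space-separated token on its own line, so read.table() returns a
# single column whose name is the first token.
df <- read.table(text = gsub(' ', '\n', x$getElementText()), header = TRUE)
# Drop the first 46 rows and keep the remaining values.
df[-(1:46), ]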
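As an alternative to splitting the element text on spaces, the table could probably be parsed from the rendered page source with rvest, which keeps rows and columns intact. The following is only a minimal sketch: `page_source` is a hypothetical stand-in for `remote_driver$getPageSource()[[1]]`, the table layout is assumed, and the rate values are placeholders rather than real data from the site.

```r
library(rvest)

# Hypothetical stand-in for remote_driver$getPageSource()[[1]].
# The structure and the values below are placeholders, not real rates.
page_source <- '
<div id="guaranteed-return-1"><div>
<table>
<tr><th>Term</th><th>Annual</th></tr>
<tr><td>1 Year</td><td>1.00%</td></tr>
<tr><td>2 Years</td><td>2.00%</td></tr>
</table>
</div></div>'

# Parse the HTML and convert the first table under #guaranteed-return-1
# into a data frame; the <th> row is used as the header.
rates <- page_source %>%
  read_html() %>%
  html_element("#guaranteed-return-1 table") %>%
  html_table()

# Second column as numbers, with the percent sign removed.
rate_values <- as.numeric(sub("%", "", rates[[2]], fixed = TRUE))
```

With a live session, `page_source` would be replaced by the page source taken after the click on `#gic-nrg`, and the selector may need adjusting if the real markup differs from what is assumed here.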