I need to fill in the month and year fields on a page.
To do this, I wrote the following with RSelenium, and it works:
#library
library(RSelenium)
#browser parameters
mybrowser<-remoteDriver(browserName = "chrome")
mybrowser$open(silent = TRUE)
mybrowser$setTimeout(type = "page load", milliseconds =1000000)
mybrowser$setImplicitWaitTimeout(milliseconds = 1000000)
url<-paste("http://www.svs.cl/institucional/mercados/entidad.php?mercado=S&rut=99588060&grupo=&tipoentidad=CSVID&row=AABaHEAAaAAAB7uAAT&vig=VI&control=svs&pestania=3",sep="")
#start navigation
mybrowser$navigate(url)
webElem$clickElement() # note: webElem is never assigned in this snippet as posted
wxbox<-mybrowser$findElement(using="class","bordeInput2")
wxbox$sendKeysToElement(list("09"))
wxbox<-mybrowser$findElement(using="id","aa")
wxbox$sendKeysToElement(list("2016"))
wxbutton<-mybrowser$findElement('xpath',"//*[@id='fm']/div[2]/input")
wxbutton$clickElement()
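However, I would like to see a solution that uses rvest or RCurl. I have tried, but it does not work for me. I would appreciate it if someone could help.
My attempt was: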
library(RCurl)
library(XML)
form <- postForm("Http://www.svs.cl/institucional/mercados/entidad.php?mercado=S&rut=99588060&grupo=&tipoentidad=CSVID&row=AABaHEAAaAAAB7uAAT&vig=VI&control=svs&pestania=3", Year = 2010, Month = 2)
doc <- htmlParse(form)
pkids <- xpathSApply(doc, xmlAttrs)
pkids
data <- lapply(pkids)
tab <- readHTMLTable(data[[1]], which = 1)
Thanks in advance.
Answer 0 (score: 0)
You can POST to the URL as follows:
require(rvest)
require(httr)
a <- POST("http://www.svs.cl/institucional/mercados/entidad.php",
          # body = what you fill in the form (strings, so the leading zero in "09" is kept)
          body = list(mm = "09", aa = "2016"),
          # query = the long URL broken into its parameters
          query = list(mercado = "S",
                       rut = "99588060",
                       grupo = "",
                       tipoentidad = "CSVID",
                       row = "AABaHEAAaAAAB7uAAT",
                       vig = "VI",
                       control = "svs",
                       pestania = "3"))
read_html(a) %>% html_nodes("dd") %>% html_text %>%
setNames(c("Business name", "RUT"))
This gives you:
Business name RUT
"ACE SEGUROS DE VIDA S.A." "99588060-1"
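The attempt in the question posted fields named Year and Month, while the POST above uses mm and aa. One way to find such names is to read the name attribute of each input in the form. The sketch below does this with rvest on a small inline HTML string. That markup is a hypothetical, simplified stand-in written for illustration (only the ids fm and aa and the class bordeInput2 come from the question); the real page may be structured differently, so in practice you would pass the page URL to read_html() instead.

```r
library(rvest)

# Hypothetical, simplified stand-in for the search form; the real markup
# on the site may differ.
page <- read_html('<form id="fm" method="post">
  <input type="text" name="mm" class="bordeInput2">
  <input type="text" name="aa" id="aa">
  <div><input type="submit" value="Consultar"></div>
</form>')

# Collect the name attribute of every <input> inside the form.
field_names <- page %>%
  html_nodes("form#fm input") %>%
  html_attr("name")

# Inputs without a name attribute (the submit button here) come back as NA
# and are not submitted with the form, so drop them.
field_names <- field_names[!is.na(field_names)]
print(field_names)
```

The names that remain are the ones to use as keys in the body list passed to POST().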