I want to scrape some data from the following website: http://predstecajnenagodbe.fina.hr/pn-public-web/predmet/search but when I try to use rvest:
library(rvest)
session <- html_session("http://predstecajnenagodbe.fina.hr/pn-public-web/predmet/search")
form <- html_form(session)
form
it doesn't find the form, even though the form is there (as you can see on the page).
I have also tried the POST function from the httr package:
parameters <- list(since = "1.6.2018", until = "5.6.2018", `g-recaptcha-response` = "03AF6jDqXcBw1qmbrxWqadGqh9k8eHAzB9iPbYdnwzhEVSgCwO0Mi6DQDgckigpeMH1ikV70egOC0UppZsO7tO9hgdpEIaI04jTpG6JxGMR6wov27kEkLuVsEp1LhxZB4WFDRkDWdqcZeVN1YkiojUpje4k-swFG7tPyG2pJN86SdT290D9_0fyfrxlpfFNL2VUwE_c15vVthcBEdXIQ68V5qv7ZVooLiwrdTO2qLDLF1yUZWiu9IJoLuBWdFzJ_zdSP6fbuj5wTpfPdsYJ2n988Gcb3q2aYdn-2TVuWoQzqs1wbh7ya_Geo7_8gnDUL92l2nqTeV9CMY58fzppPPYDJcchdHFTTxadGwCGZyKC3WUSh81qiGZ5JhNDUpPnOO-MgSr5aPbA7tei7bbypHV9OOVjPGLLtqA9g")
httr::POST(
url,
body = parameters,
config = list(
add_headers(Referer = "http://predstecajnenagodbe.fina.hr"),
user_agent(get_header()),
accept_encoding = get_encoding(),
use_proxy("xxxx", port = 80,
username = "xxx", password = "xxxx"),
timeout(20L),
tcp_keepalive = FALSE
),
encode = "form",
verbose()
)
But it returns some JS code and the message:
Please enable JavaScript to view the page content. Your support ID is: 10544975822212666004
Can you explain why rvest doesn't recognize the form and why POST doesn't work?