rvest package does not recognize the form

Time: 2019-02-12 15:52:53

Tags: r post web-scraping rvest httr

I want to scrape some data from the following website: http://predstecajnenagodbe.fina.hr/pn-public-web/predmet/search but when I try to use rvest:

library(rvest)
session <- html_session("http://predstecajnenagodbe.fina.hr/pn-public-web/predmet/search")
form <- html_form(session)
form

it does not find the form, even though the form is there (as you can see on the page).
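
To illustrate the symptom, here is a minimal sketch that runs offline. It assumes (this is not confirmed for the site in question) that the HTML delivered to a non-browser client differs from what a browser renders after running JavaScript. Both HTML strings and the helper `count_forms()` are hypothetical stand-ins, not content taken from the real page.

```r
library(rvest)

# Hypothetical stand-in for a page that only carries a script and a notice,
# i.e. the form would be added later by JavaScript in a real browser.
challenge_html <- '<html><head><script>/* challenge script */</script></head>
<body><noscript>Please enable JavaScript to view the page content.</noscript></body></html>'

# Hypothetical stand-in for a page whose form is present in the static HTML.
static_html <- '<html><body><form action="/search" method="post">
<input type="text" name="since"><input type="text" name="until">
</form></body></html>'

# Count the <form> elements that rvest can see in a piece of static HTML.
count_forms <- function(html) {
  length(html_form(read_html(html)))
}

n_challenge <- count_forms(challenge_html)
n_static <- count_forms(static_html)
```

Since rvest parses only the markup it receives and does not execute scripts, a form that exists only after JavaScript has run would not show up in `html_form()`. Whether that is what happens on this site would need to be checked against the HTML that `html_session()` actually returns.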

I have also tried the POST function from the httr package:

parameters <- list(since = "1.6.2018", until = "5.6.2018", `g-recaptcha-response` = "03AF6jDqXcBw1qmbrxWqadGqh9k8eHAzB9iPbYdnwzhEVSgCwO0Mi6DQDgckigpeMH1ikV70egOC0UppZsO7tO9hgdpEIaI04jTpG6JxGMR6wov27kEkLuVsEp1LhxZB4WFDRkDWdqcZeVN1YkiojUpje4k-swFG7tPyG2pJN86SdT290D9_0fyfrxlpfFNL2VUwE_c15vVthcBEdXIQ68V5qv7ZVooLiwrdTO2qLDLF1yUZWiu9IJoLuBWdFzJ_zdSP6fbuj5wTpfPdsYJ2n988Gcb3q2aYdn-2TVuWoQzqs1wbh7ya_Geo7_8gnDUL92l2nqTeV9CMY58fzppPPYDJcchdHFTTxadGwCGZyKC3WUSh81qiGZ5JhNDUpPnOO-MgSr5aPbA7tei7bbypHV9OOVjPGLLtqA9g")

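library(httr)  # needed because add_headers(), user_agent(), etc. are called unqualified

# Note: `url`, get_header() and get_encoding() are not defined in this snippet
httr::POST(
  url,
  body = parameters,
  config = list(
    add_headers(Referer = "http://predstecajnenagodbe.fina.hr"),
    user_agent(get_header()),
    accept_encoding = get_encoding(),
    use_proxy("xxxx", port = 80,
              username = "xxx", password = "xxxx"),
    timeout(20L),
    tcp_keepalive = FALSE
  ),
  encode = "form",
  verbose()
)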

But it returns some JS code and the message:

Please enable JavaScript to view the page content. Your support ID is: 10544975822212666004
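
A response like this can be flagged programmatically before any parsing is attempted. The sketch below is a heuristic only: the function name `looks_like_js_challenge()`, the two text patterns it searches for, and both sample bodies are assumptions made for illustration, not a documented interface of the site or of httr.

```r
# Heuristic: does a response body look like a "JavaScript required" page
# rather than the real search page? Both patterns are assumptions.
looks_like_js_challenge <- function(body_text) {
  has_notice <- grepl("enable JavaScript", body_text, ignore.case = TRUE)
  has_form <- grepl("<form", body_text, ignore.case = TRUE)
  has_notice && !has_form
}

# Hypothetical response bodies, used here instead of a live request.
challenge_body <- paste0(
  "<html><head><script>/* ... */</script></head><body><noscript>",
  "Please enable JavaScript to view the page content.",
  "</noscript></body></html>"
)
normal_body <- "<html><body><form method='post'><input name='since'></form></body></html>"

is_challenge <- looks_like_js_challenge(challenge_body)
is_normal <- looks_like_js_challenge(normal_body)
```

With a real httr response object `resp`, the text to test could be obtained with `httr::content(resp, as = "text", encoding = "UTF-8")`.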

Can you explain why rvest does not recognize the form and why the POST request does not work?

0 answers:

No answers