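To retrieve some financial statements, I'm trying to get the list of document delivery protocol numbers.
The following URL contains links to all document categories for a given company.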
u1 <- "http://siteempresas.bovespa.com.br/consbov/ExibeTodosDocumentosCVM.asp?CCVM=22446&CNPJ=09.414.761/0001-64&TipoDoc=C"
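When I click DFP, I'm redirected to another page that contains the protocol numbers. The problem is that I can't get the same result in R.
I tried httr::POST without success.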
library(httr)
page <- GET(u1, encoding = "ISO-8859-1")
key <- cookies(page)
pgpost <- POST(u1,
               body = list(hdnCategoria = "IDI2",
                           action = "ExibeTodosDocumentosCVM.asp?CNPJ=09.414.761/0001-64&CCVM=22446&TipoDoc=C&QtLinks=10"),
               set_cookies(ASPSESSIONIDQATQCCSC = key$value[1],
                           TS01871345 = key$value[2],
                           ASPSESSIONIDSQQTABSC = key$value[3],
                           ASPSESSIONIDSCDSBADC = key$value[4]))
pgcont <- content(pgpost, "text", encoding = "ISO-8859-1")
pgcont <- strsplit(pgcont, "\r")[[1]]
pgcont <- gsub('[\n\t]', "", pgcont); pgcont
pgcont shows me the content from u1, not the page with the protocol numbers.
I also tried clicking the link with rvest:
library(rvest)
s <- html_session(u1)
s %>% follow_link("DFP")
but I end up with this error message:
[1] Navigating to javascript:fVisualizaDocumentos('C','IDI2')
Error in curl::curl_fetch_memory(url, handle = handle) :
Couldn't resolve host name
Any ideas on how to solve this? Thanks in advance!
Here is a picture of the information I'm looking for
Answer 0 (score: 0)
I don't think you need the session cookies:
library(httr)
library(rvest)
library(tidyverse)
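# POST the form fields directly; the company identifiers go in the query string
res <- httr::POST(
  url = "http://siteempresas.bovespa.com.br/consbov/ExibeTodosDocumentosCVM.asp",
  query = list(
    CNPJ = "09.414.761/0001-64",
    CCVM = "22446",
    TipoDoc = "C",
    QtLinks = "10"
  ),
  body = list(
    hdnCategoria = "IDI2",  # category code that appears in the DFP link
    hdnPagina = "",
    FechaI = "",
    FechaV = ""
  ),
  encode = "form"
)

# Parse the response as ISO-8859-1 HTML and list its tables
content(res, encoding = "ISO-8859-1") %>%
  html_nodes("table")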
## {xml_nodeset (21)}
## [1] <table width="640" border="0" cellspacing="0" cellpadding="0" align ...
## [2] <table width="95%" border="0" cellspacing="1" align="center" cellpa ...
## [3] <table width="95%" border="0" cellspacing="1" align="center" cellpa ...
## [4] <table width="95%" border="0" cellspacing="1" align="center" cellpa ...
## [5] <table width="95%" border="0" cellspacing="1" align="center" cellpa ...
## [6] <table width="95%" border="0" cellspacing="1" align="center" cellpa ...
## [7] <table width="95%" border="0" cellspacing="1" align="center" cellpa ...
## [8] <table width="95%" border="0" cellspacing="1" align="center" cellpa ...
## [9] <table width="95%" border="0" cellspacing="1" align="center" cellpa ...
## [10] <table width="95%" border="0" cellspacing="1" align="center" cellpa ...
## [11] <table width="95%" border="0" cellspacing="1" align="center" cellpa ...
## [12] <table width="95%" border="0" cellspacing="1" align="center" cellpa ...
## [13] <table width="95%" border="0" cellspacing="1" align="center" cellpa ...
## [14] <table width="95%" border="0" cellspacing="1" align="center" cellpa ...
## [15] <table width="95%" border="0" cellspacing="1" align="center" cellpa ...
## [16] <table width="95%" border="0" cellspacing="1" align="center" cellpa ...
## [17] <table width="95%" border="0" cellspacing="1" align="center" cellpa ...
## [18] <table width="95%" border="0" cellspacing="1" align="center" cellpa ...
## [19] <table width="95%" border="0" cellspacing="1" align="center" cellpa ...
## [20] <table width="95%" border="0" cellspacing="1" align="center" cellpa ...
## ...
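
As for the rvest attempt: judging from the error message, the DFP link's href is a javascript: call rather than a URL, so follow_link() has no host to resolve. The arguments of that call look like the values the form submits. Here is a minimal sketch of pulling them out of the href. The helper name js_args is hypothetical, and treating the second argument as the hdnCategoria value is an assumption based on the 'IDI2' seen in the error message, not something I have verified against the site's JavaScript.

```r
# Hypothetical helper: extract the single-quoted arguments from a javascript: href
js_args <- function(href) {
  quoted <- regmatches(href, gregexpr("'[^']*'", href))[[1]]
  gsub("'", "", quoted, fixed = TRUE)
}

# href exactly as it appears in the rvest error message
href <- "javascript:fVisualizaDocumentos('C','IDI2')"
args <- js_args(href)

# Assumption: the second argument is the category code the form expects
body <- list(
  hdnCategoria = args[2],
  hdnPagina = "",
  FechaI = "",
  FechaV = ""
)
```

The resulting body list could then be passed as the body argument of the POST call above, which may help if the same approach is needed for other document categories.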