RCurl - Submit a form and load the page

Time: 2015-02-14 19:14:26

Tags: xml r rcurl rvest

I am using the RCurl package to download some prices from a Brazilian website, but before the data will load I have to select a city from a form.

The website is: "http://www.muffatosupermercados.com.br/Home.aspx"

I want the prices for CURITIBA, id = 53.

I tried the solution given in this post: "How do I use cookies with RCurl?"

Here is my code:

    library("RCurl")
    library("XML")

    #Set your browsing links 
    loginurl = "http://www.muffatosupermercados.com.br"
    dataurl  = "http://www.muffatosupermercados.com.br/CategoriaProduto.aspx?Page=1&c=2"

    #Set user account data and agent
    pars=list(
            id = "53"
    )
    agent="Mozilla/5.0" #or whatever 

    #Set RCurl pars
    curl = getCurlHandle()
    curlSetOpt(cookiejar="cookies.txt",  useragent = agent, followlocation =TRUE, curl=curl)
    #Also if you do not need to read the cookies. 
    #curlSetOpt(  cookiejar="", useragent = agent, followlocation = TRUE, curl=curl)

    #Post login form
    html=postForm(loginurl, .params = pars, curl=curl)

    #Go wherever you want
    html=getURL(dataurl, curl=curl)
    C1 <- htmlParse(html, asText=TRUE, encoding="UTF-8") 
    #Extract the text of each price node with XML's XPath helper (rvest is not loaded above)
    Preco <- trimws(xpathSApply(C1, "//li[@class='preco']", xmlValue))

But when I run the code, I only get the page that sits behind the form, not the target page:

"http://www.muffatosupermercados.com.br/CategoriaProduto.aspx?Page=1&c=2"
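
One way to tell which page actually came back is to test the returned HTML for price nodes. The sketch below is only an illustration under stated assumptions: `sample_html` is made-up markup standing in for the output of `getURL()`, and `extract_prices` is a hypothetical helper, not a function from RCurl or XML. It reuses the `//li[@class='preco']` XPath from the code above.

```r
library(XML)

# Hypothetical sample standing in for the HTML returned by getURL();
# the real page's markup may differ.
sample_html <- '<html><body><ul>
  <li class="preco"> R$ 4,99 </li>
  <li class="preco">R$ 12,50</li>
  <li class="nome">Arroz</li>
</ul></body></html>'

# Return the trimmed text of every <li class="preco"> node,
# or a zero-length character vector if the page has none.
extract_prices <- function(html) {
  doc <- htmlParse(html, asText = TRUE, encoding = "UTF-8")
  nodes <- getNodeSet(doc, "//li[@class='preco']")
  vapply(nodes, function(n) trimws(xmlValue(n)), character(1))
}

prices <- extract_prices(sample_html)
print(prices)

# A page without price nodes (for example a city-selection page) yields length 0
no_prices <- extract_prices("<html><body><form></form></body></html>")
print(length(no_prices))
```

If the result has length zero for the real response, the server most likely returned the city-selection page rather than the product listing.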

I also tried playing with the cookies, but had no luck.
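
Apart from cookies, one thing that might matter: the site serves `.aspx` pages, so it may be an ASP.NET Web Forms application, and such pages commonly carry hidden inputs such as `__VIEWSTATE` and `__EVENTVALIDATION` that the server expects to receive back with the form. Whether this particular site requires them has not been verified. The sketch below shows how those hidden fields could be collected and merged with the city id before posting. Everything in it is an assumption made for illustration: `form_html` is invented markup, the field name `cidade` is hypothetical, and `hidden_fields` is a helper written here, not part of any package.

```r
library(XML)

# Hypothetical form markup; the real page's field names and values are unknown.
form_html <- '<html><body><form method="post" action="Home.aspx">
  <input type="hidden" name="__VIEWSTATE" value="abc123" />
  <input type="hidden" name="__EVENTVALIDATION" value="xyz789" />
  <select name="cidade"><option value="53">CURITIBA</option></select>
</form></body></html>'

# Collect every hidden <input> as a named list that can be passed
# to postForm() through its .params argument.
hidden_fields <- function(html) {
  doc <- htmlParse(html, asText = TRUE)
  nodes <- getNodeSet(doc, "//input[@type='hidden']")
  vals <- lapply(nodes, function(n) xmlGetAttr(n, "value", default = ""))
  names(vals) <- vapply(nodes, function(n) xmlGetAttr(n, "name", default = ""),
                        character(1))
  vals
}

# Merge the hidden fields with the (hypothetically named) city field
pars <- c(hidden_fields(form_html), list(cidade = "53"))
str(pars)

# With the real page, the request might then look like this (not run here):
# library(RCurl)
# curl <- getCurlHandle(cookiefile = "", followlocation = TRUE,
#                       useragent = "Mozilla/5.0")
# form_page <- getURL("http://www.muffatosupermercados.com.br/Home.aspx", curl = curl)
# pars <- c(hidden_fields(form_page), list(cidade = "53"))  # field name is a guess
# html <- postForm("http://www.muffatosupermercados.com.br/Home.aspx",
#                  .params = pars, style = "POST", curl = curl)
```

The actual name of the city field would have to be read from the page source or from the browser's network inspector while selecting the city by hand.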

Does anyone know how to submit this form and load the correct page?

Thanks in advance...

0 answers:

No answers yet