Unable to parse a name from a web page using POST requests

Time: 2017-08-12 19:48:30

Tags: vba web-scraping http-post

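I have written a macro in VBA to get a name from a website using POST requests. To reach the target page, two POST requests have to be sent. First, the page opens as in the first image below. Clicking the "Search by Address" button on the start page leads to another page with two boxes to fill in, as shown in the second image: one for the street number and one for the street name. After filling in the form and clicking the search button, it leads to the target page with the information I am after. I tested this with a MsgBox in my script to make sure I am on the right page. I am definitely on that page, as I can see its title is "HARRIS COUNTY APPRAISAL DISTRICT". However, I cannot parse anything from that page. I am after the name "LARA PEDRO A & MARIA G" from that page.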

Here is the macro I tried:

Sub httpPost()

    Dim http As New XMLHTTP60, html As New HTMLDocument
    Dim rec As HTMLHtmlElement
    Dim ArgStr As String, ArgStr_ano As String

    ArgStr = "search=addr"
    ArgStr_ano = "TaxYear=2017&stnum=15535&stname=CAMPDEN+HILL+RD"

    With http
        .Open "POST", "https://public.hcad.org/records/QuickSearch.asp", False
        .setRequestHeader "Content-type", "application/x-www-form-urlencoded"
        .setRequestHeader "User-Agent", "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.90 Safari/537.36"
        .setRequestHeader "Referer", "https://public.hcad.org/records/quicksearch.asp"
        .send ArgStr
    End With

    With http
        .Open "POST", "https://public.hcad.org/records/QuickRecord.asp", False
        .setRequestHeader "Content-type", "application/x-www-form-urlencoded"
        .setRequestHeader "User-Agent", "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.90 Safari/537.36"
        .setRequestHeader "Referer", "https://public.hcad.org/records/quicksearch.asp"
        .send ArgStr_ano
        html.body.innerHTML = .responseText
    End With

    MsgBox http.responseText
End Sub

Search values:

Street No: 15535
Street Name: CAMPDEN HILL RD
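
The second request body in the macro encodes these two values by hand. A minimal sketch of a helper that builds the same string, assuming the form fields are TaxYear, stnum and stname as in the macro and that only spaces need encoding (BuildSearchBody is a hypothetical name, not part of any library):

```vba
' Hypothetical helper: builds the form-encoded body sent to QuickRecord.asp.
Function BuildSearchBody(ByVal taxYear As String, _
                         ByVal stNum As String, _
                         ByVal stName As String) As String
    ' Form encoding represents a space as "+".
    ' Other reserved characters are not handled in this sketch.
    BuildSearchBody = "TaxYear=" & Trim$(taxYear) & _
                      "&stnum=" & Trim$(stNum) & _
                      "&stname=" & Replace(Trim$(stName), " ", "+")
End Function
```

Calling BuildSearchBody("2017", "15535", "CAMPDEN HILL RD") should give the same string that is assigned to ArgStr_ano in the macro.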

Here are images of the two pages that lead to the target page:

https://www.dropbox.com/s/e9on9zwqzmcboze/1Untitled.jpg?dl=0
https://www.dropbox.com/s/0lchpde8uq63jps/pics.jpg?dl=0

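I somehow got hold of the target URL using the Chrome developer tools, used it in the code below, and got the result I was after. But what is wrong with my POST requests? Why can't I get the same result with the approach above? For your consideration, here is another piece of code that uses the URL collected from Chrome dev tools to get the result I expected from sending the two POST requests in the code above: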

Sub Web_Data()
    Dim http As New XMLHTTP60, html As New HTMLDocument
    Dim post As Object

    With http
        .Open "GET", "https://public.hcad.org/records/details.asp?crypt=%94%9A%B0%94%BFg%85%8D%83%82og%8El%87tXvXQJXJzDTpHjEyr%D4%BE%C2%AF%AE%AA%9Fpk%88%5Do%5B%B8%96%A3%C0q%5E&bld=1&tab=", False
        .send
        html.body.innerHTML = .responseText
    End With

    For Each post In html.getElementsByClassName("data")(2).getElementsByTagName("th")
        i = i + 1: Cells(i, 1) = post.innerText
    Next post
End Sub

The result is:

LARA PEDRO A & MARIA G
15531 CAMPDEN HILL RD
HOUSTON TX 77053-3302

1 answer:

Answer 0 (score: 1)

Finally solved it myself. Here is the working code:

Sub httpPost()

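    ' Requires references to "Microsoft WinHTTP Services, version 5.1"
    ' and "Microsoft HTML Object Library".
    Dim http As New WinHttp.WinHttpRequest, html As New HTMLDocument
    Dim post As Object
    Dim i As Long   ' output row counter
    Dim ArgStr As String, ArgStr_ano As String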

    ArgStr = "search=addr"
    ArgStr_ano = "TaxYear=2017&stnum=15535&stname=CAMPDEN+HILL+RD"

    With http
        ' Option 6 is WinHttpRequestOption_EnableRedirects.
        .Option(6) = True
        .Open "POST", "https://public.hcad.org/records/QuickSearch.asp", False
        .setRequestHeader "Content-type", "application/x-www-form-urlencoded"
        .setRequestHeader "User-Agent", "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.90 Safari/537.36"
        .setRequestHeader "Referer", "https://public.hcad.org/records/quicksearch.asp"
        .send ArgStr
    End With

    With http
        .Option(6) = True
        .Open "POST", "https://public.hcad.org/records/QuickRecord.asp", False
        .setRequestHeader "Content-type", "application/x-www-form-urlencoded"
        .setRequestHeader "User-Agent", "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.90 Safari/537.36"
        .setRequestHeader "Referer", "https://public.hcad.org/records/quicksearch.asp"
        .send ArgStr_ano
        html.body.innerHTML = .responseText
    End With

    ' Write each <th> text of the third "data" element (index 2) to column A.
    For Each post In html.getElementsByClassName("data")(2).getElementsByTagName("th")
        i = i + 1: Cells(i, 1) = post.innerText
    Next post

End Sub
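
The macro above indexes the third "data" element directly, which raises a run-time error if the response is not the expected page. A minimal sketch of a guarded version of the parsing step, assuming the same "Microsoft HTML Object Library" reference and that the code runs in Excel (WriteOwnerBlock is a hypothetical name):

```vba
' Hypothetical helper: writes the <th> texts of the third "data" element
' to column A of the active sheet. Returns False if that element is missing.
Function WriteOwnerBlock(ByVal responseText As String) As Boolean
    Dim html As New HTMLDocument
    Dim blocks As Object, cell As Object
    Dim i As Long

    html.body.innerHTML = responseText
    Set blocks = html.getElementsByClassName("data")

    ' The macro reads index 2, so at least three matches are required.
    If blocks.Length < 3 Then Exit Function

    For Each cell In blocks(2).getElementsByTagName("th")
        i = i + 1
        Cells(i, 1) = cell.innerText
    Next cell

    WriteOwnerBlock = True
End Function
```

It could be called after the second request, for example as If http.Status = 200 Then ok = WriteOwnerBlock(http.responseText), where ok is a Boolean declared by the caller.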