How to automate an HTML search query and export the data to Excel

Time: 2012-11-13 16:02:28

Tags: asp.net excel-vba xhtml vba excel

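The web page I want to extract data from has a form with several search fields. I can enter data in any of these fields and then click the search button at the bottom of the form to see results for the information I searched for.

I need to look up many numbers (about 300). Rather than searching for each one by hand, is there a way to run the searches automatically and import the data for each number into an Excel worksheet?

Can this be done with an Excel macro?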

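1 answer:

Answer 0 (score: 1)

You can use the MSXML and MSHTML libraries. This code should help you get started. First, run this sub to add the two references (you only need to run it once):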

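Sub addReferences()
    ' One-time setup; needs "Trust access to the VBA project object model" enabled in the Trust Center.
    ' Microsoft HTML Object Library (MSHTML)
    ActiveWorkbook.VBProject.References.AddFromGuid "{3050F1C5-98B5-11CF-BB82-00AA00BDCE0B}", 4, 0
    ' Microsoft XML, v6.0 (MSXML2)
    ActiveWorkbook.VBProject.References.AddFromGuid "{F5078F18-C551-11D3-89B9-0000F81FE221}", 6, 0
End Sub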

Then edit the getCAGEValues sub so that it loads your own CAGE codes and saves the resulting data (along with anything else you want from the page):

Sub getCAGEValues()
    Dim oHTMLDoc As MSHTML.HTMLDocument
    Dim oSpan As MSHTML.IHTMLElement
    Dim CAGECodes() As Variant
    Dim CAGECode As Variant ' Loop variable; must be declared when Option Explicit is on
    CAGECodes = Array("12345", "12346") ' CAGECodes is an array of your codes
    For Each CAGECode In CAGECodes
        Set oHTMLDoc = getPage(CAGECode)
        Set oSpan = oHTMLDoc.getElementById("ctl00_cphMainPageBody_lblCompNameData") ' The id for the company name
        ' getElementById returns Nothing when the id is not on the page, so guard before reading it
        If Not oSpan Is Nothing Then MsgBox oSpan.innerText ' Save the value however you want to
    Next
End Sub

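Function getPage(CAGECode As Variant) As MSHTML.HTMLDocument
    ' Fetch the details page for one CAGE code and return it as a parsed HTML document.
    Dim oHttpRequest As MSXML2.XMLHTTP60
    Set oHttpRequest = New MSXML2.XMLHTTP60
    With oHttpRequest
        ' Third argument False = synchronous request; .send blocks until the response arrives
        .Open "GET", "http://www.logisticsinformationservice.dla.mil/BINCS/details.aspx?CAGE=" & CAGECode, False
        ' These headers ask for a fresh copy rather than a cached one
        .setRequestHeader "Cache-Control", "no-cache"
        .setRequestHeader "Pragma", "no-cache"
        .setRequestHeader "If-Modified-Since", "Sat, 1 Jan 2000 00:00:00 GMT"
        .send
    End With
    ' Load the response text into an in-memory document so it can be queried by element id
    Dim oHTMLDoc As MSHTML.HTMLDocument
    Set oHTMLDoc = New MSHTML.HTMLDocument
    oHTMLDoc.body.innerHTML = oHttpRequest.responseText
    Set getPage = oHTMLDoc
End Function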
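
Since the question is about roughly 300 codes and getting the results into a worksheet, here is a rough sketch of how the MsgBox step could write to cells instead. It is only an illustration under assumptions that are not part of the answer above: the sub name exportCAGEValues is made up, the codes are assumed to sit in column A of the active sheet starting at row 2, and the company name is written to column B. It reuses the getPage function and the same element id.

```vba
' Hypothetical helper (not from the original answer).
' Assumed layout: CAGE codes in column A from row 2 down; results go to column B.
Sub exportCAGEValues()
    Dim ws As Worksheet
    Dim lastRow As Long
    Dim r As Long
    Dim code As String
    Dim oHTMLDoc As MSHTML.HTMLDocument
    Dim oElem As MSHTML.IHTMLElement

    Set ws = ActiveSheet
    ' Last used row in column A
    lastRow = ws.Cells(ws.Rows.Count, "A").End(xlUp).Row

    For r = 2 To lastRow
        code = Trim$(CStr(ws.Cells(r, "A").Value))
        If Len(code) > 0 Then
            Set oHTMLDoc = getPage(code)
            Set oElem = oHTMLDoc.getElementById("ctl00_cphMainPageBody_lblCompNameData")
            If oElem Is Nothing Then
                ' The id was not found, e.g. an unknown code or a changed page layout
                ws.Cells(r, "B").Value = "(not found)"
            Else
                ws.Cells(r, "B").Value = oElem.innerText
            End If
        End If
    Next r
End Sub
```

If your codes have leading zeros, format column A as text before entering them, otherwise Excel may store them as numbers and drop the zeros. Further columns can be filled the same way by looking up other element ids on the page.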