将表格从网页导入到Excel

时间:2013-12-30 01:14:41

标签: excel vba excel-vba web

我只有excel的中高级别的excel和VBA的中间背景。我想要做的是从链接中显示的网页导入表格:http://www.admision.unmsm.edu.pe/res20130914/A/011/0.html

该网页显示39个链接,每个链接包含一个表格。 所以我想知道一种将所有这些表导入excel的自动方式。

2 个答案:

答案 0 :(得分:2)

此代码从所有链接获取数据。

Sub Extract_data()

    Dim url As String, links_count As Integer
    Dim i As Integer, j As Integer, row As Integer
    Dim XMLHTTP As Object, html As Object
    Dim tr_coll As Object, tr As Object
    Dim td_coll As Object, td As Object

    links_count = 39
    For i = 0 To links_count

        url = "http://www.admision.unmsm.edu.pe/res20130914/A/011/" & i & ".html"

        Set XMLHTTP = CreateObject("MSXML2.XMLHTTP")
        XMLHTTP.Open "GET", url, False
        XMLHTTP.send

        Set html = CreateObject("htmlfile")
        html.body.innerHTML = XMLHTTP.ResponseText

        Set tbl = html.getelementsbytagname("Table")

        Set tr_coll = tbl(0).getelementsbytagname("TR")

        For Each tr In tr_coll
            j = 1
            Set td_col = tr.getelementsbytagname("TD")

            For Each td In td_col
                Cells(row + 1, j).Value = td.innerText
                j = j + 1
            Next
            row = row + 1
        Next
    Next

    MsgBox "Done"
End Sub

答案 1 :(得分:0)

您可能会发现这更容易遵循,因为早期绑定使您可以使用智能感知。此外,这段代码还允许您有时间从get方法获取响应。

您需要同时引用Microsoft XML,v6.0和Microsoft HTML对象库

Sub main()

    Dim URL As String
        URL = "YOUR URL HERE"

    Dim xml As MSXML2.XMLHTTP
        Set xml = New MSXML2.XMLHTTP

    Dim html As HTMLDocument
        Set html = New HTMLDocument

    With xmlOne
        .Open "GET", URLOne
        .setRequestHeader "Content-Type", "application/x-www-form-urlencoded"
        .send
    End With

    With xml
        While Not .readyState = 4
            Application.Wait Now + TimeValue("0:00:01")
        Wend
        If .Status = 200 Then
                While InStr(1, .responseText, "Updating", 0) > 0
                    Application.Wait Now + TimeValue("0:00:01")
                Wend
                html.body.innerHTML = .responseText
            End If
        End With

    Dim tbl As MSHTML.HTMLTable
        Set tbl = html.getElementsByTagName("Table")(TABLE #)

    Dim tblCells As MSHTML.IHTMLElementCollection
        Set tblCells = tbl.getElementsByClassName("CAN BE BY TAG NAME TOO")

    Dim tableCell As MSHTML.HTMLTableCell

    For Each tableCell In tblCells
        Debug.Print tableCell.innerText
    Next tableCell
End Sub