Error when changing IE automation code to XML

Asked: 2017-06-28 00:42:33

Tags: excel vba excel-vba internet-explorer xml-parsing

I recently started using XML automation, and after changing some basic IE automation code I seem to be getting an error. Here is the HTML:

<tbody>
    <tr class="group-2 first">
    <td class="date-col">
        <a href="/stats/matches/mapstatsid/48606/teamone-vs-merciless">
            <div class="time" data-time-format="d/M/yy" data-unix="1498593600000">27/6/17</div>
        </a>
    </td>
    ......SOME MORE HTML HERE......
    </tr>
......SOME MORE HTML HERE......
</tbody>

This is the code I am using in Excel VBA:

Sub readData()

Dim XMLPage As New MSXML2.XMLHTTP60
Dim html As New MSHTML.HTMLDocument

XMLPage.Open "GET", "https://www.hltv.org/stats/matches", False
XMLPage.send

If XMLPage.Status <> 200 Then MsgBox XMLPage.statusText
html.body.innerHTML = XMLPage.responseText


For Each profile In html.getElementsByTagName("tbody")(0).Children
    Debug.Print profile.getElementsByClassName("date-col")(0).getElementsByTagName("a")(0).getAttribute("href") 'Run time error '438' here
Next

End Sub

I get run-time error '438' on the Debug.Print line. It seems to happen at the class lookup, but I am not sure why. If I use this instead, it works fine:

Debug.Print profile.innertext

4 answers:

Answer 0: (score: 4)

This works for me:

Sub readData()

    Dim XMLPage As New MSXML2.XMLHTTP60
    Dim html As New MSHTML.HTMLDocument, links, a, i

    XMLPage.Open "GET", "https://www.hltv.org/stats/matches", False
    XMLPage.send

    If XMLPage.Status <> 200 Then MsgBox XMLPage.statusText
    html.body.innerHTML = XMLPage.responseText

    Set links = html.querySelectorAll("td.date-col > a")
    Debug.Print links.Length

    For i = 0 To links.Length - 1
        Debug.Print links(i).href
    Next

    Set links = Nothing
    Set html = Nothing

End Sub

When I looped over the links collection with For Each, Excel crashed reliably, so I would stick with the indexed loop shown above.

Answer 1: (score: 2)

profile refers to a row, and profile.cells(0) refers to the first cell in that row. So try...

profile.cells(0).getElementsByTagName("a")(0).getAttribute("href")

Also, profile should be declared as HTMLTableRow.
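
A minimal sketch of that suggestion folded into the question's procedure, assuming references to Microsoft XML, v6.0 and the Microsoft HTML Object Library are set. The procedure name readDataByCells and the anchors variable are made up for illustration, and the early exit and empty-cell guard are additions that are not part of the original answer:

```vba
Sub readDataByCells()

    ' Assumed references: Microsoft XML, v6.0 and Microsoft HTML Object Library
    Dim XMLPage As New MSXML2.XMLHTTP60
    Dim html As New MSHTML.HTMLDocument
    Dim profile As MSHTML.HTMLTableRow   ' typed row, as the answer recommends
    Dim anchors As Object                ' hypothetical helper variable

    XMLPage.Open "GET", "https://www.hltv.org/stats/matches", False
    XMLPage.send

    ' Stop early if the request failed (an addition to the original code)
    If XMLPage.Status <> 200 Then
        MsgBox XMLPage.statusText
        Exit Sub
    End If
    html.body.innerHTML = XMLPage.responseText

    For Each profile In html.getElementsByTagName("tbody")(0).Children
        ' cells(0) is the first cell of the row, i.e. the date-col cell
        Set anchors = profile.cells(0).getElementsByTagName("a")
        ' Skip rows whose first cell contains no link
        If anchors.Length > 0 Then Debug.Print anchors(0).getAttribute("href")
    Next profile

End Sub
```

Because profile is early-bound here, the row's cells collection is resolved at compile time rather than through a late-bound call on an undeclared variable.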

Answer 2: (score: 2)

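The URL you are using does not serve valid XML, but it can be recovered with a few simple regular-expression replacements. Once we have valid XML, we can load it into a DOM document and use XPath to select the nodes we need: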

Option Explicit

'Add references to:
' - MSXML v3
' - Microsoft VBScript Regular Expressions 5.5

Sub test()

  Const START_MARKER As String = "<table class=""stats-table matches-table"">"
  Const END_MARKER As String = "</table>"

  With New MSXML2.XMLHTTP
    .Open "GET", "https://www.hltv.org/stats/matches", False
    .send
    If .Status = 200 Then

      'The HTML isn't valid XHTML, so we can't just use the request's responseXML DOMDocument
      'Let's extract the HTML table

      Dim tableStart As Long
      tableStart = InStr(.responseText, START_MARKER)

      Dim tableEnd As Long
      tableEnd = InStr(tableStart, .responseText, END_MARKER)

      Dim tableHTML As String
      tableHTML = Mid$(.responseText, tableStart, tableEnd - tableStart + Len(END_MARKER))

      'The HTML table has invalid img tags (let's add a closing tag with some regex)
      With New RegExp
        .Global = True
        .Pattern = "(\<img [\W\w]*?)"">"
        Dim tableXML As String
        tableXML = .Replace(tableHTML, "$1"" />")
      End With

      'And load an XML document from the cleaned up HTML fragment
      Dim doc As MSXML2.DOMDocument
      Set doc = New MSXML2.DOMDocument
      doc.LoadXML tableXML

    End If
  End With

  If Not doc Is Nothing Then

    'Use XPath to select the nodes we need
    Dim nodes As MSXML2.IXMLDOMSelection
    Set nodes = doc.SelectNodes("//td[@class='date-col']/a/@href")

   'Enumerate the URLs
    Dim node As IXMLDOMAttribute
    For Each node In nodes
      Debug.Print node.nodeTypedValue
    Next node

  End If

End Sub

Output:

/stats/matches/mapstatsid/48606/teamone-vs-merciless
/stats/matches/mapstatsid/48607/merciless-vs-teamone
/stats/matches/mapstatsid/48608/merciless-vs-teamone
/stats/matches/mapstatsid/48600/wysix-vs-fnatic-academy
/stats/matches/mapstatsid/48602/skitlite-vs-nexus
/stats/matches/mapstatsid/48604/extatus-vs-forcebuy
/stats/matches/mapstatsid/48605/extatus-vs-forcebuy
/stats/matches/mapstatsid/48599/planetkey-vs-gatekeepers
/stats/matches/mapstatsid/48603/gatekeepers-vs-planetkey
/stats/matches/mapstatsid/48595/wysix-vs-gambit
/stats/matches/mapstatsid/48596/kinguin-vs-playing-ducks
/stats/matches/mapstatsid/48597/spirit-academy-vs-tgfirestorm
/stats/matches/mapstatsid/48601/spirit-academy-vs-tgfirestorm
/stats/matches/mapstatsid/48593/fnatic-academy-vs-gambit
/stats/matches/mapstatsid/48594/alternate-attax-vs-nexus
/stats/matches/mapstatsid/48590/pro100-vs-playing-ducks
/stats/matches/mapstatsid/48583/extatus-vs-ex-fury
/stats/matches/mapstatsid/48589/extatus-vs-ex-fury
/stats/matches/mapstatsid/48584/onlinerol-vs-forcebuy
/stats/matches/mapstatsid/48591/forcebuy-vs-onlinerol
/stats/matches/mapstatsid/48581/epg-vs-veni-vidi-vici
/stats/matches/mapstatsid/48588/epg-vs-veni-vidi-vici
/stats/matches/mapstatsid/48592/veni-vidi-vici-vs-epg
/stats/matches/mapstatsid/48582/log-vs-gatekeepers
/stats/matches/mapstatsid/48586/gatekeepers-vs-log
/stats/matches/mapstatsid/48580/spraynpray-vs-epg
/stats/matches/mapstatsid/48579/quantum-bellator-fire-vs-spraynpray
/stats/matches/mapstatsid/48571/noxide-vs-masterminds
/stats/matches/mapstatsid/48572/athletico-vs-legacy
/stats/matches/mapstatsid/48578/node-vs-avant
/stats/matches/mapstatsid/48573/funky-monkeys-vs-grayhound
/stats/matches/mapstatsid/48574/grayhound-vs-funky-monkeys
/stats/matches/mapstatsid/48575/hegemonyperson-vs-eclipseo
/stats/matches/mapstatsid/48577/eclipseo-vs-hegemonyperson
/stats/matches/mapstatsid/48566/masterminds-vs-tainted-black
/stats/matches/mapstatsid/48562/grayhound-vs-legacy
/stats/matches/mapstatsid/48563/noxide-vs-riotous-raccoons
/stats/matches/mapstatsid/48564/avant-vs-dark-sided
/stats/matches/mapstatsid/48565/avant-vs-dark-sided
/stats/matches/mapstatsid/48567/eclipseo-vs-uya
/stats/matches/mapstatsid/48568/uya-vs-eclipseo
/stats/matches/mapstatsid/48560/uya-vs-new4
/stats/matches/mapstatsid/48561/new4-vs-uya
/stats/matches/mapstatsid/48559/jaguar-sa-vs-miami-flamingos
/stats/matches/mapstatsid/48558/spartak-vs-binary-dragons
/stats/matches/mapstatsid/48557/kungar-vs-spartak
/stats/matches/mapstatsid/48556/igamecom-vs-fragsters
/stats/matches/mapstatsid/48554/nordic-warthogs-vs-aligon
/stats/matches/mapstatsid/48555/binary-dragons-vs-kungar
/stats/matches/mapstatsid/48550/havu-vs-rogue-academy

Answer 3: (score: 0)

Looking at the MSHTML.HTMLDocument reference, there is no getElementsByClassName method.

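You will need to loop through each row in the tbody you selected, get the first td in that row, then get the first link in that td and read its href attribute. Alternatively, you could compare each td's class attribute, but since it is the first element in the row, that is not necessary.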
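
The traversal described above could be sketched roughly as follows, using only tag-name lookups and an indexed loop. This is an illustration under assumptions, not the answerer's own code: the procedure name readDataByRows and the variables rows, cell, and anchors are hypothetical, and the class comparison is included only to show the optional check mentioned in the answer.

```vba
Sub readDataByRows()

    ' Assumed references: Microsoft XML, v6.0 and Microsoft HTML Object Library
    Dim XMLPage As New MSXML2.XMLHTTP60
    Dim html As New MSHTML.HTMLDocument
    Dim rows As Object, cell As Object, anchors As Object
    Dim i As Long

    XMLPage.Open "GET", "https://www.hltv.org/stats/matches", False
    XMLPage.send

    If XMLPage.Status <> 200 Then
        MsgBox XMLPage.statusText
        Exit Sub
    End If
    html.body.innerHTML = XMLPage.responseText

    ' Every row inside the first tbody
    Set rows = html.getElementsByTagName("tbody")(0).getElementsByTagName("tr")

    For i = 0 To rows.Length - 1
        ' First td in the row; Nothing if the row has no td
        Set cell = rows.Item(i).getElementsByTagName("td").Item(0)
        If Not cell Is Nothing Then
            ' Optional check: the cell's class list contains date-col
            If InStr(" " & cell.className & " ", " date-col ") > 0 Then
                Set anchors = cell.getElementsByTagName("a")
                If anchors.Length > 0 Then Debug.Print anchors.Item(0).getAttribute("href")
            End If
        End If
    Next i

End Sub
```

Dropping the InStr check leaves the simpler form the answer describes, which relies on the date cell always being first in the row.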