VBA: Retrieve label values from HTML into a MsgBox

Time: 2015-09-01 14:58:18

Tags: excel vba excel-vba

I have the following HTML code, and I want to retrieve the data from it:

<div class="span4">
    <div>
       <label for="Game_type">Portal Games</label>
         XXX
    </div>
    <div>
        <label for="Game_Reference">Game reference</label>
         22130903
    </div>
    <div>
        <label for="Release_Date">Release Date</label>
         2015-07-13
    </div>
    <div>
        <label for="Prise">Prise</label>
         USD 90,00
    </div>
    <div>
        <label for="Game_Rank">Game Rank</label>
          4
    </div>
</div>

How can I get all of these label values, or at least one of them, into a MsgBox? (I will write them into Excel myself later.)

I tried to get the first value with the following code:

Dim IE As Object
Set IE = CreateObject("INTERNETEXPLORER.APPLICATION")
'page address is stated in code
IE.navigate "page name" 
IE.Visible = True

While IE.Busy
'Wait until IE is busy and loading page
DoEvents
Wend

Set gtype = IE.Document.getElementsByClassName("span4")(0).getElementsById("Game_type")
GtypeValue =  gtype.Value
MsgBox (GtypeValue)

End Sub

I get run-time error "91":

Object variable or With block variable not set.

150904: Hopefully one last question on this topic. The default code looks like this:

 strCont = objIE.Document.getElementsByClassName("span4")(0).innerHTML

But I would like to use a variable instead of "span4", for example Dim1 = "span4". I wrote it as follows:

strCont = "objIE.Document.getElementsByClassName(" & Chr(34) & Dim1 & Chr(34) & ")(0).innerHTML" 

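It does not work; the MsgBox shows an empty value. How can I make sure this string is treated as the actual code that is executed later, in this step: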

Set objMatches = .Execute(strCont)

1 answer:

Answer 0: (score: 0)

Why not try regex for parsing?

Sub MsgGameType()
    Dim objIE As Object
    Dim strCont As String
    Dim objMatches As Object
    Dim objMatch As Object

    Set objIE = CreateObject("InternetExplorer.Application")
    'page address is stated in code
    objIE.Navigate "page name"
    objIE.Visible = True

    Do While objIE.Busy Or Not objIE.readyState = 4
        DoEvents
    Loop
    Do Until objIE.document.readyState = "complete"
        DoEvents
    Loop

    strCont = objIE.document.getElementsByClassName("span4")(0).innerHtml
    With CreateObject("VBScript.RegExp")
        .Global = True
        .MultiLine = True
        .IgnoreCase = False
        .Pattern = "<div>\s*<label for="".*?"">(.*?)</label>\s*(.+?)\s*?</div>"
        Set objMatches = .Execute(strCont)
        For Each objMatch In objMatches
            MsgBox objMatch.SubMatches(0) & " = " & objMatch.SubMatches(1)
        Next
    End With
End Sub
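
To see what the pattern captures without opening a browser, the same RegExp settings can be run against a literal string. This is only a sketch: the function name ExtractLabelPairs and the demo Sub are hypothetical helpers that are not part of the answer, and the test string is a hand-typed copy of two blocks from the question's HTML, assuming the page returns innerHTML in that lowercase, quoted form.

```vba
' Hypothetical helper: applies the answer's pattern to any HTML string
' and returns the "label = value" pairs as a Collection of strings.
Function ExtractLabelPairs(ByVal strCont As String) As Collection
    Dim colPairs As New Collection
    Dim objMatch As Object

    With CreateObject("VBScript.RegExp")
        .Global = True
        .MultiLine = True
        .IgnoreCase = False
        .Pattern = "<div>\s*<label for="".*?"">(.*?)</label>\s*(.+?)\s*?</div>"
        For Each objMatch In .Execute(strCont)
            colPairs.Add objMatch.SubMatches(0) & " = " & objMatch.SubMatches(1)
        Next
    End With

    Set ExtractLabelPairs = colPairs
End Function

Sub DemoExtractLabelPairs()
    Dim strHtml As String
    Dim varPair As Variant

    ' Two blocks copied by hand from the HTML in the question
    strHtml = "<div>" & vbLf & _
              "   <label for=""Game_type"">Portal Games</label>" & vbLf & _
              "     XXX" & vbLf & _
              "</div>" & vbLf & _
              "<div>" & vbLf & _
              "   <label for=""Prise"">Prise</label>" & vbLf & _
              "     USD 90,00" & vbLf & _
              "</div>"

    For Each varPair In ExtractLabelPairs(strHtml)
        Debug.Print varPair   ' Portal Games = XXX, then Prise = USD 90,00
    Next
End Sub
```

Note that .Execute receives the text to search (here the HTML), not a line of code to run.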

See XHTML parsing with RegExp disclaimer.
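
Given that disclaimer, a DOM-based variant may be worth trying as well. The sketch below is not from the original answer. It assumes the page renders in a document mode where getElementsByClassName is available, and that each value is the text node directly following its label, as in the HTML shown in the question. The names strClass and CleanText are made up for illustration. The class name is held in a variable and passed straight to getElementsByClassName, which also covers the follow-up in the question: VBA does not evaluate a string as code, so the variable has to be used as the argument itself.

```vba
Sub MsgLabelValuesFromDom()
    Dim objIE As Object
    Dim objBlocks As Object
    Dim objLabel As Object
    Dim objNode As Object
    Dim strClass As String
    Dim strValue As String

    strClass = "span4"                       ' class name kept in a variable

    Set objIE = CreateObject("InternetExplorer.Application")
    objIE.Navigate "page name"               ' placeholder address, as in the code above
    Do While objIE.Busy Or objIE.readyState <> 4
        DoEvents
    Loop

    ' Pass the variable itself; do not build a string containing the code
    Set objBlocks = objIE.document.getElementsByClassName(strClass)
    If objBlocks.Length = 0 Then
        MsgBox "No element with class """ & strClass & """ was found."
    Else
        For Each objLabel In objBlocks.Item(0).getElementsByTagName("label")
            strValue = ""
            Set objNode = objLabel.nextSibling          ' assumed: text node holding the value
            If Not objNode Is Nothing Then
                If objNode.nodeType = 3 Then strValue = CleanText(objNode.nodeValue)
            End If
            MsgBox objLabel.innerText & " = " & strValue
        Next
    End If

    objIE.Quit
End Sub

' Hypothetical helper: replaces line breaks and tabs with spaces, then trims
Private Function CleanText(ByVal strText As String) As String
    strText = Replace(strText, vbCr, " ")
    strText = Replace(strText, vbLf, " ")
    strText = Replace(strText, vbTab, " ")
    CleanText = Trim$(strText)
End Function
```

The check on Length avoids calling a member on Nothing when the class is not present on the page, which is one way a run-time error 91 can arise.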