Question

I am trying to scrape a page using VBA. I know how to get elements by id, class, and tag name. But now I have come across this tag:

<!-- <b>IE CODE : 3407004044</b> -->

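After searching the internet, I know this is a comment in HTML, but what I can't find is the tag name of this element, if it qualifies as a tag at all. Should I use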

document.getElementsByTagName("!") ?

If not, how else can I extract these comments?

Edit: I have a bunch of td elements inside a tr element, and I want to extract IE Code : 3407004044. Here is a larger piece of the HTML:

<tr align="left">
    <td width="50%" class="subhead1">                                                           

    ' this is the part that I want to extract
    <!-- <b>IE CODE : 3108011111</b> -->                                
    </td>
    <td rowspan="9" valign="top">
    <span id="datalist1_ctl00_lbl_p"></span>
    </td>
</tr>

Thanks!

Answer 1

You can use XPath:

substring-before(substring-after(//tr//comment(), "<b>"), "</b>")

to get the required data.
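As a rough sketch of how this could be tried from VBA: MSXML's selectSingleNode returns nodes rather than evaluating XPath string functions, so the code below only selects the comment node with //tr//comment() and does the substring step in VBA instead. It assumes the markup is well-formed XML (real HTML pages often are not) and that MSXML 6.0 is available; the names CommentViaXPath and DemoCommentViaXPath are made up for illustration.

```vba
Option Explicit

' Hypothetical helper: returns the text between <b> and </b> inside the first
' comment found under a <tr>. Assumes xmlText is well-formed XML.
Public Function CommentViaXPath(ByVal xmlText As String) As String
    Dim doc As Object
    Dim node As Object
    Dim body As String
    Dim startPos As Long
    Dim endPos As Long

    Set doc = CreateObject("MSXML2.DOMDocument.6.0")
    doc.async = False
    If Not doc.loadXML(xmlText) Then Exit Function   ' not well-formed: return ""

    ' Select the first comment node below any <tr>
    Set node = doc.selectSingleNode("//tr//comment()")
    If node Is Nothing Then Exit Function

    ' Equivalent of substring-before(substring-after(..., "<b>"), "</b>")
    body = node.nodeValue
    startPos = InStr(body, "<b>")
    If startPos = 0 Then Exit Function
    startPos = startPos + Len("<b>")
    endPos = InStr(startPos, body, "</b>")
    If endPos = 0 Then Exit Function
    CommentViaXPath = Mid$(body, startPos, endPos - startPos)
End Function

Public Sub DemoCommentViaXPath()
    Dim html As String
    html = "<tr align=""left""><td width=""50%"" class=""subhead1"">" & _
           "<!-- <b>IE CODE : 3108011111</b> --></td></tr>"
    Debug.Print CommentViaXPath(html)
End Sub
```

If the page is not well-formed, loadXML fails and the function returns an empty string, in which case a string-based approach is probably the easier route.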

Answer 2

Try something like this; it will work if you refine it a bit further:

Option Explicit

Public Sub TestMe()

    Dim myString    As String
    Dim cnt         As Long
    Dim myArr       As Variant

    myString = "<!-- <b>IE CODE : Koj sega e</b> -->blas<hr>My Website " & _
                    "is here<B><B><B><!-- <b>IE CODE : nomer </b> -->" & _
                    "is here<B><B><B><!-- <b>IE CODE : 1? </b> -->"

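    ' Turn every closing marker into an opening one so a single Split suffices
    myString = Replace(myString, "-->", "<!--")
    myArr = Split(myString, "<!--")

    ' Odd indices hold the comment bodies; even indices hold the text between them
    For cnt = LBound(myArr) To UBound(myArr)
        If cnt Mod 2 = 1 Then Debug.Print myArr(cnt)
    Next cnt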

End Sub

This is what you get:

 <b>IE CODE : Koj sega e</b> 
 <b>IE CODE : nomer </b> 
 <b>IE CODE : 1? </b>

The idea is as follows:

Replace --> with <!--
Split by <!--
Take every second value from the array

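There are some cases where it will not work, for example if --> or <!-- appears somewhere in the text outside a comment, but in the general case it should be fine.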
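One possible refinement, sketched here under the assumption that comments are never nested: scan with InStr and pair each <!-- with the first --> that follows it, so a stray --> outside a comment no longer shifts the odd and even positions. A stray <!-- in the text would still be treated as the start of a comment. The names ExtractComments and DemoExtractComments are hypothetical.

```vba
Option Explicit

' Hypothetical helper: collects the body of every <!-- ... --> block in htmlText,
' pairing each opener with the first closer that follows it.
Public Function ExtractComments(ByVal htmlText As String) As Collection
    Const OPEN_TAG As String = "<!--"
    Const CLOSE_TAG As String = "-->"
    Dim result As New Collection
    Dim startPos As Long
    Dim endPos As Long

    startPos = InStr(1, htmlText, OPEN_TAG)
    Do While startPos > 0
        startPos = startPos + Len(OPEN_TAG)            ' first character of the body
        endPos = InStr(startPos, htmlText, CLOSE_TAG)
        If endPos = 0 Then Exit Do                     ' unterminated comment: stop
        result.Add Mid$(htmlText, startPos, endPos - startPos)
        startPos = InStr(endPos + Len(CLOSE_TAG), htmlText, OPEN_TAG)
    Loop

    Set ExtractComments = result
End Function

Public Sub DemoExtractComments()
    Dim item As Variant
    ' The stray "-->" before the comment is ignored by the scan
    For Each item In ExtractComments("a --> b <!-- <b>IE CODE : 1? </b> --> c")
        Debug.Print item
    Next item
End Sub
```

The function returns the comment bodies including their surrounding spaces; Trim$ and a further InStr/Mid$ step can strip the <b> tags if only the code itself is needed.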

How to extract something between <!-- --> using VBA?
