Question

I have a variable full of structured HTML content (a website's content), and I just want to get the content of a "div" named article. It looks like this:

     <article>
html stuff here html stuff here html stuff here html stuff here
html stuff here html stuff here html stuff here html stuff here

            </article>

I'm trying:

Dim url

url="myUrl"

Set objXML = CreateObject("MSXML2.ServerXMLHTTP")

Set myDiv = New RegExp
With myDiv
    .Pattern    = "<article>.*</article>"
    .IgnoreCase = True
    .Global     = false
End With

objXML.Open "GET", url, False
    objXML.Send("")
    html= objXML.responseText


    Set objMatch = myDiv.Execute(html)

    for each x in objMatch
        WScript.Echo objMatch.Item(0)
    next




or .Pattern    = "#<article>([^<]*)</article>#'"
or .Pattern    = "<article>([^<]*)</article>'"

No luck. Any suggestions?

Answer 1

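Use this regex:

<article>([\s\S]*?)</article>

VBScript's RegExp engine does not support lookbehind, so a pattern such as (?<=\<article\>)([\s\S]*)(?=\<\/article>) fails there; capture the content in a group instead. [\s\S] matches any character including line breaks, which . does not do in VBScript, and the lazy *? stops at the first closing tag.

Example (untested):

Set myDiv = New RegExp
With myDiv
    .Pattern    = "<article>([\s\S]*?)</article>"
    .IgnoreCase = True
    .Global     = False
End With

The content between the tags is then in SubMatches(0) of the first match returned by Execute.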
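
To show how the pieces might fit together, here is a minimal sketch that wraps the extraction in a function and runs it on a sample string. It assumes a capturing group rather than lookbehind, since VBScript's RegExp does not support lookbehind. The function name GetArticleContent and the sample variable are hypothetical, introduced only for illustration, and the pattern assumes the opening tag is a bare <article> with no attributes, as in the question.

```vbscript
Option Explicit

' Hypothetical helper: returns the text between the first
' <article>...</article> pair in html, or "" when none is found.
Function GetArticleContent(html)
    Dim re, matches
    Set re = New RegExp
    ' [\s\S] matches any character, including line breaks;
    ' *? is lazy, so matching stops at the first </article>.
    re.Pattern = "<article>([\s\S]*?)</article>"
    re.IgnoreCase = True
    re.Global = False

    Set matches = re.Execute(html)
    If matches.Count > 0 Then
        ' SubMatches(0) is the first capturing group, i.e. the tag content.
        GetArticleContent = matches(0).SubMatches(0)
    Else
        GetArticleContent = ""
    End If
End Function

' Sample input standing in for objXML.responseText from the question.
Dim sample
sample = "<html><body>" & vbCrLf & _
         "<article>" & vbCrLf & _
         "html stuff here" & vbCrLf & _
         "</article>" & vbCrLf & _
         "</body></html>"

WScript.Echo GetArticleContent(sample)
```

With a real page, the same function could be called on the response text after objXML.Send, for example WScript.Echo GetArticleContent(objXML.responseText). The returned string keeps the line breaks that surround the content inside the tags.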
