How to get the content between tags inside a variable in VBScript

Time: 2014-05-24 20:19:38

Tags: html regex vbscript

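I have a variable full of structured HTML content (the content of a website), and I am just trying to get the content of the "div" named article. It looks like this: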

     <article>
html stuff here html stuff here html stuff here html stuff here
html stuff here html stuff here html stuff here html stuff here

            </article>

This is what I am trying:

Dim url

url="myUrl"

Set objXML = CreateObject("MSXML2.ServerXMLHTTP")

Set myDiv = New RegExp
With myDiv
    .Pattern    = "<article>.*</article>"
    .IgnoreCase = True
    .Global     = false
End With

objXML.Open "GET", url, False
    objXML.Send("")
    html= objXML.responseText


    Set objMatch = myDiv.Execute(html)

    for each x in objMatch
        WScript.Echo objMatch.Item(0)
    next




or .Pattern    = "#<article>([^<]*)</article>#'"
or .Pattern    = "<article>([^<]*)</article>'"

No luck so far. Any suggestions?

1 answer:

Answer 0 (score: 1)

Use this regex:

(?<=\<article\>)([\s\S]*)(?=\<\/article>)

REGEX101

Example (untested):

Set myDiv = New RegExp
With myDiv
    .Pattern    = "(?<=\<article\>)([\s\S]*)(?=\<\/article>)"
    .IgnoreCase = True
    .Global     = false
End With
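
One caveat on the untested example above: as far as I know, the VBScript RegExp object supports lookahead but not lookbehind, so a pattern starting with (?<= is likely to fail with a regular expression syntax error when Execute runs. The original attempts fail for different reasons: "." does not match line breaks in VBScript, and [^<]* stops at the first nested tag. A minimal sketch of an alternative is to capture the inner text in a group and read it from SubMatches. The html string below is a hypothetical stand-in for objXML.responseText, and the names re and matches are only illustrative.

```vbscript
Option Explicit

' Hypothetical sample input standing in for objXML.responseText.
Dim html
html = "<html><body><article>" & vbCrLf & _
       "html stuff here <b>html stuff here</b>" & vbCrLf & _
       "</article></body></html>"

Dim re, matches
Set re = New RegExp
With re
    ' [\s\S] matches any character, including line breaks, which "." does not.
    ' *? is lazy, so the match ends at the first closing </article> tag.
    .Pattern    = "<article>([\s\S]*?)</article>"
    .IgnoreCase = True
    .Global     = False
End With

Set matches = re.Execute(html)
If matches.Count > 0 Then
    ' SubMatches(0) holds the text captured by the first pair of parentheses,
    ' that is, everything between the opening and closing tags.
    WScript.Echo matches(0).SubMatches(0)
Else
    WScript.Echo "No <article> element found"
End If
```

Saved as a .vbs file, this can be run with cscript. If the real page writes the opening tag with attributes, the literal <article> would need to be loosened, for example to <article[^>]*>, which is an assumption about the markup and not something shown in the question.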