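I have a variable full of structured HTML (the content of a web page), and I just want to get the content of one element, a "div" named article. It looks like this: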
<article>
html stuff here html stuff here html stuff here html stuff here
html stuff here html stuff here html stuff here html stuff here
</article>
This is what I am trying:
Dim url
url="myUrl"
Set objXML = CreateObject("MSXML2.ServerXMLHTTP")
Set myDiv = New RegExp
With myDiv
.Pattern = "<article>.*</article>"
.IgnoreCase = True
.Global = false
End With
objXML.Open "GET", url, False
objXML.Send("")
html= objXML.responseText
Set objMatch = myDiv.Execute(html)
for each x in objMatch
WScript.Echo objMatch.Item(0)
next
or .Pattern = "#<article>([^<]*)</article>#'"
or .Pattern = "<article>([^<]*)</article>'"
No luck. Any suggestions?
Answer 0 (score: 1)
Use this regex:
(?<=\<article\>)([\s\S]*)(?=\<\/article>)
REGEX101
Example (untested):
Set myDiv = New RegExp
With myDiv
.Pattern = "(?<=\<article\>)([\s\S]*)(?=\<\/article>)"
.IgnoreCase = True
.Global = false
End With
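One caveat about the untested example above: the VBScript RegExp object supports lookahead, (?=...), but not lookbehind, (?<=...), so that pattern may be rejected when Execute is called. A possible alternative, offered only as a sketch, is to capture the body in a group and read it from SubMatches. The patterns in the question likely fail because "." does not match line breaks in VBScript and [^<]* stops at the first nested tag. The sample html string and the variable names re and matches below are assumptions for illustration; in the real script html would come from objXML.responseText.

```vbscript
Option Explicit

' Hypothetical sample input standing in for objXML.responseText.
Dim html
html = "<html><body><article>" & vbCrLf & _
       "html stuff here <b>html stuff here</b>" & vbCrLf & _
       "</article></body></html>"

Dim re
Set re = New RegExp
With re
    ' [\s\S] matches any character, including line breaks ("." does not).
    ' *? is lazy, so the match ends at the first closing </article> tag.
    ' \b[^>]* allows for attributes on the opening tag (an assumption).
    .Pattern = "<article\b[^>]*>([\s\S]*?)</article>"
    .IgnoreCase = True
    .Global = False
End With

Dim matches
Set matches = re.Execute(html)
If matches.Count > 0 Then
    ' SubMatches(0) holds the text captured by the first pair of parentheses.
    WScript.Echo matches(0).SubMatches(0)
Else
    WScript.Echo "No <article> element found."
End If
```

With .Global = False only the first article is returned. If the page can contain several, setting .Global = True and looping over matches with For Each would be one way to collect them all. For heavily nested or malformed markup, an HTML parser is generally more reliable than a regex.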