I need to get the contents of the first p tag in a string (but without the actual tags).
Example:
<h1>I don't want the title</h1>
<p>This is the text I want</p>
<p>I don't want this</p>
<p>I also don't want this</p>
I think I need to match everything else and replace it with nothing? But how do I write the regular expression?
Answer 0: (score: 1)
Try something like this:
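' Read the file and parse it with the HTMLFile COM object
Set fso = CreateObject("Scripting.FileSystemObject")
Set html = CreateObject("HTMLFile")
html.write fso.OpenTextFile("C:\path\to\your.html").ReadAll
' Take the first <p> element and print its text without the tags
Set p = html.getElementsByTagName("p")
WScript.Echo p(0).innerText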
Answer 1: (score: 1)
Use this pattern to capture what you want:
^[\s\S]*?<p>([^<>]*?)<\/p>
^ # Start of string/line
[\s\S] # Character Class [\s\S]
*? # (zero or more)(lazy)
<p> # "<p>"
( # Capturing Group (1)
[^<>] # Any character except "<" and ">"
*? # (zero or more)(lazy)
) # End of Capturing Group (1)
<\/p> # "</p>"
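A minimal sketch of how the capturing pattern might be applied in VBScript, assuming the built-in `RegExp` object and that the paragraph contains no nested tags. The function name `FirstParagraphText` and the `IgnoreCase` setting are my own choices for illustration, not part of the pattern above.

```vbscript
' Returns the text of the first <p>...</p> pair, or "" when none is found.
' FirstParagraphText is a hypothetical helper name.
Function FirstParagraphText(html)
    Dim re, matches
    Set re = New RegExp
    re.Pattern = "^[\s\S]*?<p>([^<>]*?)<\/p>"
    re.Global = False      ' only the first match is needed
    re.IgnoreCase = True   ' assumption: <P> should be accepted as well
    Set matches = re.Execute(html)
    FirstParagraphText = ""
    If matches.Count > 0 Then
        ' SubMatches(0) is capturing group 1, the text between the tags
        FirstParagraphText = matches(0).SubMatches(0)
    End If
End Function

Dim sample
sample = "<h1>I don't want the title</h1>" & vbCrLf & _
         "<p>This is the text I want</p>" & vbCrLf & _
         "<p>I don't want this</p>"
WScript.Echo FirstParagraphText(sample)
```

Because the group is `[^<>]*?`, a first paragraph that contains inline markup such as `<b>` will not match; the DOM-based approach in Answer 0 handles that case better.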
Or use this pattern to match everything else and replace it with nothing:
^[\s\S]*?<p>|<\/p>[\s\S]*$
^ # Start of string/line
[\s\S] # Character Class [\s\S]
*? # (zero or more)(lazy)
<p> # "<p>"
| # OR
< # "<"
\/ # "/"
p> # "p>"
[\s\S] # Character Class [\s\S]
* # (zero or more)(greedy)
$ # End of string/line
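The replace-everything-else variant could be used along these lines. This is a sketch assuming the VBScript `RegExp` object; `KeepFirstParagraphText` is a hypothetical name. `Global` has to be `True` so that both alternatives (the part before the first `<p>` and the part from the first `</p>` onward) are removed, and `Multiline` should stay at its default of `False` so that `^` and `$` anchor to the whole string rather than to individual lines.

```vbscript
' Removes everything except the text of the first <p> element.
' KeepFirstParagraphText is a hypothetical helper name.
Function KeepFirstParagraphText(html)
    Dim re
    Set re = New RegExp
    re.Pattern = "^[\s\S]*?<p>|<\/p>[\s\S]*$"
    re.Global = True       ' replace both the leading and the trailing part
    re.Multiline = False   ' ^ and $ must mean start/end of the whole string
    KeepFirstParagraphText = re.Replace(html, "")
End Function

Dim sample
sample = "<h1>I don't want the title</h1>" & vbCrLf & _
         "<p>This is the text I want</p>" & vbCrLf & _
         "<p>I don't want this</p>" & vbCrLf & _
         "<p>I also don't want this</p>"
WScript.Echo KeepFirstParagraphText(sample)
```

Note that if the input contains no `<p>` at all, neither alternative matches and the string comes back unchanged, so the capturing version is easier to check for failure.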
Answer 2: (score: 0)
You can do this properly with an XPath expression:
//p[1]/text()
Adapted from Navigating XML nodes in VBScript, for a Dummy:
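' MSXML 6.0 uses XPath as its selection language by default
Set objDoc = CreateObject("MSXML2.DOMDocument.6.0")
objDoc.async = False
' Load only succeeds if the file is well-formed XML with a single root element
If Not objDoc.Load("C:\Temp\Test.xml") Then WScript.Quit 1
' Find a particular element using XPath:
Set objNode = objDoc.selectSingleNode("//p[1]/text()")
' A text node has no attributes, so read its content with nodeValue
MsgBox objNode.nodeValue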