I'll explain what I'm after with some sample code. My function GetDox
looks close, but it is still incomplete. Here is the test code.
'test begin...
'<dox>
' <member type="Public Sub" name="Increment" return="void">
' <param type="Integer" name="nBase" out="true" />
' <param type="Integer" name="nStep" out="false" />
' <purpose>
' purpose here...
' </purpose>
' </member>
' <member ... />
'</dox>
'other comments here...
Public Sub Increment(nBase, nStep) 'some example content
    nBase = nBase + nStep
End Sub
'<Unwonted_Item />
Dim source 'reading the same file just for simplification
With CreateObject("Scripting.FileSystemObject")
    With .OpenTextFile(WScript.ScriptFullName, 1, False)
        source = .ReadAll
    End With
End With
result = GetDox(source)
WScript.Echo result 'display our result
Function GetDox(sCode) 'unfinished function
    Dim regEx, Match, Matches, mVal, sEnd
    sEnd = "</dox>" & vbNewLine
    Set regEx = New RegExp
    regEx.Pattern = "('<dox>\n|'\s*<.*)" 'my ugly pattern
    regEx.IgnoreCase = True
    regEx.Global = True
    Set Matches = regEx.Execute(sCode)
    For Each Match In Matches
        mVal = Match.Value
        mVal = Replace(mVal, vbCr, vbNewLine)
        mVal = Right(mVal, Len(mVal) - 1)
        GetDox = GetDox & mVal
        If mVal = sEnd Then Exit For
    Next
End Function
This is what I get:
<dox>
<member type="Public Sub" name="Increment" return="void">
<param type="Integer" name="nBase" out="true" />
<param type="Integer" name="nStep" out="false" />
<purpose>
</purpose>
</member>
<member ... />
</dox>
This is what I need:
<dox>
<member type="Public Sub" name="Increment" return="void">
<param type="Integer" name="nBase" out="true" />
<param type="Integer" name="nStep" out="false" />
<purpose>
purpose here...
</purpose>
</member>
<member ... />
</dox>
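The line "purpose here..." is missing, and I know the whole RegExp.Pattern
syntax is weak. I just want to select the entire content that starts with <dox>
and ends with </dox>,
including everything in between, but I'm still stuck on the pattern syntax.
P.S. With all this great help (thanks to everyone), here is my function, which now works: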
Function GetDox(sCode)
    GetDox = vbNullString
    With New RegExp
        .Pattern = "<dox>[\s\S]*?</dox>"
        .IgnoreCase = True
        .Global = False
        With .Execute(sCode)
            If .Count = 0 Then Exit Function
            GetDox = .Item(0).Value
        End With
        .Pattern = "^'"
        .Global = True
        .Multiline = True
        GetDox = .Replace(GetDox, "")
    End With
End Function
Answer 0 (score: 2)
I'd first remove the leading single quotes:
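Set regEx = New RegExp
regEx.Pattern = "^'"
regEx.Global = True
regEx.Multiline = True 'without this, ^ only matches at the very start of the string
sCode = regEx.Replace(sCode, "")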
Then extract the XML text:
regEx.Pattern = "<dox>[\s\S]*?</dox>"
regEx.Global = False
regEx.IgnoreCase = True
Set m = regEx.Execute(sCode)
If m.Count > 0 Then GetDox = m(0).Value
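For readers who want to try the two patterns outside Windows Script Host, here is a rough Python analog of the steps above. It is only a sketch: the sample input `SOURCE` and the helper name `get_dox` are made up for illustration, and Python's `re` module stands in for the VBScript `RegExp` object. It also shows why the line-wise pattern from the question drops the text line: that pattern requires a `<` right after the quote and optional whitespace.

```python
import re

# Hypothetical sample input: a commented <dox> block followed by code.
SOURCE = "\n".join([
    "'test begin...",
    "'<dox>",
    "' <member type=\"Public Sub\" name=\"Increment\" return=\"void\">",
    "' <purpose>",
    "' purpose here...",
    "' </purpose>",
    "' </member>",
    "'</dox>",
    "'other comments here...",
    "Public Sub Increment(nBase, nStep)",
    "End Sub",
])


def get_dox(code):
    """Strip leading quotes, then return the first <dox>...</dox> span."""
    # MULTILINE makes ^ match at the start of every line, not just the string.
    unquoted = re.sub(r"^'", "", code, flags=re.MULTILINE)
    # [\s\S] matches any character, newlines included; *? stops at the first </dox>.
    match = re.search(r"<dox>[\s\S]*?</dox>", unquoted, flags=re.IGNORECASE)
    return match.group(0) if match else ""


# The question's line-wise pattern only matches lines where "<" follows the
# quote, so the plain-text line inside <purpose> is never captured.
line_hits = re.findall(r"'\s*<.*", SOURCE)

dox = get_dox(SOURCE)
print(dox)
```

Matching the whole span in one go, rather than line by line, is what keeps lines that contain no tag at all.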
After that, you should read the XML into a DOM tree for further processing:
Set xml = CreateObject("Msxml2.DOMDocument.6.0")
xml.async = False
xml.loadXML result
If your XML is in a separate file, you should load it directly from the file and extract the nodes with an XPath expression, as @FrankSchmitt suggested in his comment.
Set xml = CreateObject("Msxml2.DOMDocument.6.0")
xml.async = False
xml.load "C:\path\to\your.xml"
Set nodes = xml.selectNodes("//dox")
XML isn't line-oriented and shouldn't be parsed as if it were. If you don't handle it properly, things can break in interesting ways.
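As an illustration of that point, the following Python sketch parses an extracted fragment with a real XML parser instead of matching lines. The fragment `DOX` is a hypothetical, well-formed version of the sample (the `<member ... />` placeholder from the question is left out because it is not valid XML), and `xml.etree.ElementTree` is used here only as a stand-in for the MSXML DOM.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed version of the extracted fragment.
DOX = """<dox>
 <member type="Public Sub" name="Increment" return="void">
 <param type="Integer" name="nBase" out="true" />
 <param type="Integer" name="nStep" out="false" />
 <purpose>
 purpose here...
 </purpose>
 </member>
</dox>"""

root = ET.fromstring(DOX)

# Navigate by element name; line breaks inside the document do not matter.
member = root.find("member")
purpose = member.findtext("purpose").strip()
params = [p.get("name") for p in member.findall("param")]

print(member.get("name"), params, purpose)
```

The text of `<purpose>` spans several lines in the input, yet the parser returns it as a single value, which a per-line regular expression cannot do reliably.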
Answer 1 (score: 1)