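I'm using the Microsoft regular expression engine in Excel VBA. I'm very new to regex, but I have a pattern that works right now. I need to expand it, and I'm having trouble. Here is my code so far: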
Sub ImportFromDTD()
    Dim sDTDFile As Variant
    Dim ffile As Long
    Dim sLines() As String
    Dim i As Long
    Dim J As Long
    Dim sExtract As String
    Dim Reg1 As RegExp
    Dim M1 As MatchCollection
    Dim M As Match
    Dim myRange As Range

    Set Reg1 = New RegExp
    ffile = FreeFile
    sDTDFile = Application.GetOpenFilename("DTD Files,*.XML", , _
        "Browse for file to be imported")
    If sDTDFile = False Then Exit Sub '(user cancelled import file browser)

    ' Read the whole file and split it into lines
    Open sDTDFile For Input Access Read As #ffile
    sLines = Split(Input$(LOF(ffile), #ffile), vbNewLine)
    Close #ffile

    ' The pattern does not change per line, so set it once before the loop
    With Reg1
        .Pattern = "(\<\!ELEMENT\s)(\w*)(\s*\(\#\w*\)\s*\>)"
        .Global = True
        .MultiLine = True
        .IgnoreCase = False
    End With

    Cells(1, 2) = "From DTD"
    J = 2
    For i = 0 To UBound(sLines)
        'Debug.Print "Line"; i; "="; sLines(i)
        If Reg1.Test(sLines(i)) Then
            Set M1 = Reg1.Execute(sLines(i))
            For Each M In M1
                ' SubMatches is zero-based: index 1 is the second group, the element name
                sExtract = M.SubMatches(1)
                sExtract = Replace(sExtract, Chr(13), "")
                Cells(J, 2) = sExtract
                J = J + 1
                'Debug.Print sExtract
            Next M
        End If
    Next i
    Set Reg1 = Nothing
End Sub
Currently, I'm matching a set of data like this:
<!ELEMENT DealNumber (#PCDATA) >
and extracting DealNumber. But now I need to add another match on data like this:
<!ELEMENT DealParties (DealParty+) >
and extract DealParty without the parens and the +. I've been using this as a reference and it's great, but I'm still a bit confused: How to use Regular Expressions (Regex) in Microsoft Excel both in-cell and loops
Edit
I have run into some new scenarios that have to be matched.
Extract Deal
<!ELEMENT Deal (DealNumber,DealType,DealParties) >
Extract DealParty (the ? and the CRs are throwing me off)
<!ELEMENT DealParty (PartyType,CustomerID,CustomerName,CentralCustomerID?,
LiabilityPercent,AgentInd,FacilityNo?,PartyReferenceNo?,
PartyAddlReferenceNo?,PartyEffectiveDate?,FeeRate?,ChargeType?) >
Extract Deals
<!ELEMENT Deals (Deal*) >
Answer 0 (score: 3)
Looking at your pattern, you have too many capture groups. You only want to capture PCDATA and DealParty. Try changing your pattern to:
With Reg1
    .Pattern = "\<!ELEMENT\s+\w+\s+\(\W*(\w+)\W*\)"
    .Global = True
    .MultiLine = True
    .IgnoreCase = False
End With
Here's a stub: Regex101.
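Because this pattern has a single capture group, the captured name is read from `M.SubMatches(0)` rather than the `M.SubMatches(1)` used in the question's loop. A minimal sketch of that, assuming late binding through `CreateObject` so no library reference is required; the procedure name `DemoSingleCapture` and the sample strings are illustrative only, and the `<` is left unescaped here:

```vba
Sub DemoSingleCapture()
    ' Late binding: avoids needing a reference to
    ' "Microsoft VBScript Regular Expressions 5.5".
    Dim re As Object
    Dim m As Object
    Dim samples As Variant
    Dim s As Variant

    Set re = CreateObject("VBScript.RegExp")
    re.Pattern = "<!ELEMENT\s+\w+\s+\(\W*(\w+)\W*\)"
    re.Global = True
    re.IgnoreCase = False

    ' The two declarations from the question (hypothetical test input)
    samples = Array("<!ELEMENT DealNumber (#PCDATA) >", _
                    "<!ELEMENT DealParties (DealParty+) >")

    For Each s In samples
        For Each m In re.Execute(CStr(s))
            ' Only one group exists, so it is SubMatches(0)
            Debug.Print m.SubMatches(0)
        Next m
    Next s
End Sub
```

Run from the VBA editor, this writes the captured word for each sample to the Immediate window, i.e. the word inside the parentheses with any leading `#` or trailing `+` excluded.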
Answer 1 (score: 1)
You can use this regex pattern:
.Pattern = "\<\!ELEMENT\s+(\w+)\s+\((#\w+|(\w+)\+)\)\s+\>"
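(#\w+|(\w+)\+)
matches either # followed by word characters, or word characters followed by a +, inside the parentheses, i.e. it matches
(#PCDATA)
(DealParty+)
while the rest of the pattern validates the whole string.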
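Edited code below. Note that the element name is now M.SubMatches(0); M.SubMatches(2) holds the inner name (such as DealParty) when the parentheses contain a name followed by +.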
Sub ImportFromDTD()
    Dim strIn As String
    Dim sExtract As String
    Dim J As Long
    Dim Reg1 As RegExp
    Dim M1 As MatchCollection
    Dim M As Match

    Set Reg1 = New RegExp
    J = 1
    strIn = "<!ELEMENT Deal12Number (#PCDATA) > <!ELEMENT DealParties (DealParty+) >"
    With Reg1
        .Pattern = "\<\!ELEMENT\s+(\w+)\s+\((#\w+|(\w+)\+)\)\s+\>"
        .Global = True
        .MultiLine = True
        .IgnoreCase = False
    End With
    If Reg1.Test(strIn) Then
        Set M1 = Reg1.Execute(strIn)
        For Each M In M1
            ' SubMatches(2) is the inner name when the content is "(Name+)"
            sExtract = M.SubMatches(2)
            ' For "(#PCDATA)" that group is empty, so fall back to the element name
            If Len(sExtract) = 0 Then sExtract = M.SubMatches(0)
            sExtract = Replace(sExtract, Chr(13), "")
            Cells(J, 2) = sExtract
            J = J + 1
        Next M
    End If
    Set Reg1 = Nothing
End Sub
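Neither pattern above covers the multi-line, comma-separated declarations from the question's edit. One possible approach, offered only as a sketch, is to run the regex over the whole file text instead of line by line, capture everything between the parentheses with `[^)]*` (a negated class also matches line breaks), and then strip the occurrence markers. The names `ParseDtdElements`, `ChildNames`, and `DemoParseDtd` are hypothetical, the sample is an abbreviated version of the question's data, and nested groups such as `(a|b)*` are assumed not to occur:

```vba
' Returns a Collection of 2-element arrays: (element name, raw content model).
Function ParseDtdElements(ByVal dtdText As String) As Collection
    Dim re As Object
    Dim m As Object
    Dim result As New Collection

    Set re = CreateObject("VBScript.RegExp")   ' late binding, no reference needed
    ' Group 1: element name. Group 2: everything up to the first ")".
    re.Pattern = "<!ELEMENT\s+(\w+)\s*\(([^)]*)\)\s*>"
    re.Global = True

    For Each m In re.Execute(dtdText)
        result.Add Array(m.SubMatches(0), m.SubMatches(1))
    Next m
    Set ParseDtdElements = result
End Function

' Splits a content model into child names, dropping whitespace
' (including CR/LF) and the markers ? * + #.
Function ChildNames(ByVal contentModel As String) As Variant
    Dim re As Object
    Set re = CreateObject("VBScript.RegExp")
    re.Pattern = "[\s?*+#]"
    re.Global = True
    ChildNames = Split(re.Replace(contentModel, ""), ",")
End Function

Sub DemoParseDtd()
    Dim sample As String
    Dim item As Variant

    ' Abbreviated, hypothetical sample based on the question's edit
    sample = "<!ELEMENT Deal (DealNumber,DealType,DealParties) >" & vbCrLf & _
             "<!ELEMENT DealParty (PartyType,CustomerID?," & vbCrLf & _
             "    LiabilityPercent,FeeRate?) >" & vbCrLf & _
             "<!ELEMENT Deals (Deal*) >"

    For Each item In ParseDtdElements(sample)
        ' item(0) is the element name, item(1) the raw content model
        Debug.Print item(0) & ": " & Join(ChildNames(CStr(item(1))), ", ")
    Next item
End Sub
```

In the question's routine, this would mean passing the result of `Input$(LOF(ffile), #ffile)` straight to the function rather than splitting it on `vbNewLine` first, since a declaration that spans several lines can never match within a single line.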