How can I use a "MatchEvaluator" function to reduce the amount of code?

Time: 2016-08-14 03:55:50

Tags: regex vb.net

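I am trying to search for a regex pattern and, when it matches, find out whether the matched value occurs inside any tag of the form <sec id="sec123"> in the file. If it does, I want to replace it with result1. I think this can be done with a MatchEvaluator function, but I can't figure out how to apply it.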

I'm new to VB.NET (and to programming in general) and really don't know how to go about this. Here is what I have tried so far:

' Requires: Imports System.Text.RegularExpressions
' input holds the file contents.
' Capture the number N of every rid="secN" reference.
Dim pattern As String = "(?<=rid=""sec)(\d+)(?="">)"
Dim result1 As String = input
For Each m As Match In Regex.Matches(input, pattern)
    ' Look for a tag that carries the matching id="secN".
    Dim x As String = " id=""sec" & m.Value & """"
    If Regex.IsMatch(input, x) Then
        ' Replace <xref ... rid="secN">word number</xref> with "word number".
        Dim tgPat As String = "<xref ref-type=""section"" rid=""sec" & m.Value & """>(\w+) (\d+)</xref>"
        Dim tgRep As String = "$1 $2"
        result1 = Regex.Replace(result1, tgPat, tgRep)
    End If
Next

Sample input:

<sec id="sec1">
<p>"You fig. 23 did?" I <xref ref-type="section" rid="sec12">section 12</a> asked, surprised.</p>
<p>"There are always better terms <xref ref-type="section" rid="sec6">section 6</a>, Richard!" my mom said sharply.</p>
<p>I <xref ref-type="section" rid="sec2">section 2</a> stood. I <xref ref-type="section" rid="sec2">section 2</a> had to hurry if I <xref ref-type="section" rid="sec1">section 1</a> was going to get to work on time.
<fig id="fig4">
<caption><p>I'm confused</p></caption>
</fig> 
</p>
<p>Turning to face her, I <xref ref-type="section" rid="sec2">section 2</a> walked backward. "I"ve seriously got to get ready. Why don"t we get together for lunch and talk more then?"</p>
<sec id="sec2">
<p>"You fig. 23 can"t be""</p>
<p>I <xref ref-type="section" rid="sec4">section 4</a> adored the Art Deco elegance of the Chrysler Building. I <xref ref-type="section" rid="sec2">section 2</a> could pinpoint my place on the island in relation to the posit table 9ion of the Empire State Building.</p>
<p>I <xref ref-type="section" rid="sec1">section 1</a> felt Gideon before I <xref ref-type="section" rid="sec1">section 1</a> saw him, my entire body humming wit table 9h awareness as he stepped out of the Bentley, which had pulled up behind the Benz.</p>
</sec>
</sec>

Expected output:

<sec id="sec1">
<p>"You fig. 23 did?" I **section 12** asked, surprised.</p>
<p>"There are always better terms **section 6**, Richard!" my mom said sharply.</p>
<p>I <xref ref-type="section" rid="sec2">section 2</a> stood. I <xref ref-type="section" rid="sec2">section 2</a> had to hurry if I <xref ref-type="section" rid="sec1">section 1</a> was going to get to work on time.
<fig id="fig4">
<caption><p>I'm confused</p></caption>
</fig> 
</p>
<p>Turning to face her, I <xref ref-type="section" rid="sec2">section 2</a> walked backward. "I"ve seriously got to get ready. Why don"t we get together for lunch and talk more then?"</p>
<sec id="sec2">
<p>"You fig. 23 can"t be""</p>
<p>I **section 4** adored the Art Deco elegance of the Chrysler Building. I <xref ref-type="section" rid="sec2">section 2</a> could pinpoint my place on the island in relation to the posit table 9ion of the Empire State Building.</p>
<p>I <xref ref-type="section" rid="sec1">section 1</a> felt Gideon before I <xref ref-type="section" rid="sec1">section 1</a> saw him, my entire body humming wit table 9h awareness as he stepped out of the Bentley, which had pulled up behind the Benz.</p>
</sec>
</sec>

1 answer:

Answer 0 (score: 0)

I replaced my initial solution with one that uses XElement, because your data is not as simple as I first thought.

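    ' data holds the XML text; it must be well-formed for Parse to succeed.
    Dim input = XElement.Parse(data)
    ' Index every <sec> element by its id attribute.
    Dim sections = input.Descendants("sec").ToDictionary(Function(s) s.@id, Function(s) s)
    ' Group the <xref> elements by rid; several may point at the same section.
    Dim xrefs = input.Descendants("xref").ToLookup(Function(s) s.@rid, Function(s) s)

    For Each group In xrefs
        Dim section As XElement = Nothing
        ' Only xrefs whose rid matches an existing section are replaced.
        If sections.TryGetValue(group.Key, section) Then
            For Each xref In group
                xref.ReplaceWith(section)
            Next
        End If
    Next
    Dim output = input.ToString()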

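I think this is what you are after, although I don't trust your data, since Section2 seems to be recursive. Give it a try anyway and see what you think. The steps: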

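  1. Parse the XML.
  2. Extract the sections and key them by ID in a dictionary.
  3. Extract the xrefs (several xrefs can share the same key) and add them to a lookup keyed by ID.
  4. For each xref, check whether it has a matching section. If it does, replace it.
  5. Extract the result.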
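
Since the question asks about MatchEvaluator specifically, here is a minimal regex-only sketch for comparison. It is not the code from the answer above. The module and function names (`XrefUnwrapSketch`, `UnwrapDanglingXrefs`) are hypothetical. The sketch assumes the `**` markers in the expected output are only highlighting, and it follows that expected output: an xref loses its tag only when no `<sec>` with the same id exists. Flip the `Contains` check if the opposite behaviour is wanted. Because the sample closes each xref with `</a>`, the pattern accepts either closing tag.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module XrefUnwrapSketch

    ' Hypothetical helper: unwraps every section xref whose rid has no
    ' matching <sec id="..."> in the same text; other xrefs stay untouched.
    Function UnwrapDanglingXrefs(input As String) As String
        ' Pass 1: collect every id declared by a <sec id="secN"> tag.
        Dim knownIds As New HashSet(Of String)()
        For Each idMatch As Match In Regex.Matches(input, "<sec id=""(sec\d+)"">")
            knownIds.Add(idMatch.Groups(1).Value)
        Next

        ' Pass 2: group 1 is the rid, group 2 the link text. The closing tag
        ' may be </xref> or the </a> that appears in the sample input.
        Dim xrefPattern As String =
            "<xref ref-type=""section"" rid=""(sec\d+)"">(.*?)</(?:xref|a)>"

        ' The evaluator runs once per match and returns the replacement text.
        Dim evaluator As MatchEvaluator =
            Function(xrefMatch As Match) As String
                If knownIds.Contains(xrefMatch.Groups(1).Value) Then
                    ' Target section exists: keep the whole tag as it is.
                    Return xrefMatch.Value
                End If
                ' Target section is missing: keep only the link text.
                Return xrefMatch.Groups(2).Value
            End Function

        Return Regex.Replace(input, xrefPattern, evaluator)
    End Function

    Sub Main()
        Dim sample As String =
            "<sec id=""sec1""><p>I <xref ref-type=""section"" rid=""sec12"">section 12</a> " &
            "and <xref ref-type=""section"" rid=""sec1"">section 1</a></p></sec>"
        ' Prints: <sec id="sec1"><p>I section 12 and <xref ref-type="section" rid="sec1">section 1</a></p></sec>
        Console.WriteLine(UnwrapDanglingXrefs(sample))
    End Sub

End Module
```

The evaluator replaces the nested `If`/`Regex` chain from the question: one pass gathers the section ids, and a single `Regex.Replace` call decides per match whether to keep or unwrap the xref. Unlike the XElement approach, this does not require the input to be well-formed XML, but it only recognises tags written exactly in the form shown in the sample.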