How to handle multiple matches when using Regex.Replace

Asked: 2012-12-19 18:45:40

Tags: .net regex vb.net

I have a regular expression that produces multiple matches. A sample data set would be a CSV file in which each line is a separate match:

product,color,type,shape,size
apple,green,fruit,round,large
banana,yellow,fruit,long,large
cherry,red,fruit,round,small

Match #1 would be apple,green,fruit,round,large, match #2 would be banana,yellow,fruit,long,large, and so on.

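So my question is: when using Regex.Replace, how do I specify the 'starting' match (in this case I'd like to start at match #2), and how do I specify the number of matches after that? This is just an example; in other cases I'd like to start at match #4, and so on.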

It looks like Regex.Replace supports something like this, but I'm looking for a better example that fits my scenario.

I've tried:

Dim r As New RegEx(pattern)
result = r.Replace(input, replace, 1, 2)

replace is a string containing the captured value ($1 in my case), but I don't see any difference; I still get all the matches in one string.

Any suggestions? I was hoping it might be as simple as getting the number of matches and just using a For loop.

3 Answers:

Answer 0: (score: 1)

Take a look at Regex.Replace(string, string, MatchEvaluator)

http://msdn.microsoft.com/en-us/library/ht1sxswy.aspx

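This should let you pass a MatchEvaluator that checks which match it is being called for, so in this case you could look for index == 1. Note that Match.Index is the character offset of the match, not its ordinal number, so the evaluator has to keep its own running count of matches.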
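As a rough sketch of that idea (not the answerer's own code), the evaluator below counts the matches it sees and rewrites only those inside the requested range. The helper name ReplaceMatchRange and its parameters are made up for illustration, and match numbers are assumed to be zero-based.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module MatchCounterExample
    ' Hypothetical helper: replaces only the matches whose zero-based ordinal
    ' lies in the range startAtMatch .. startAtMatch + countMatches - 1.
    Function ReplaceMatchRange(pattern As String, input As String, replacement As String,
                               startAtMatch As Integer, countMatches As Integer) As String
        Dim matchNumber As Integer = -1
        Return Regex.Replace(input, pattern,
            Function(m As Match)
                ' The evaluator is called once per match, in order.
                matchNumber += 1
                If matchNumber >= startAtMatch AndAlso matchNumber < startAtMatch + countMatches Then
                    ' Match.Result expands substitutions such as $1 in the replacement string.
                    Return m.Result(replacement)
                End If
                ' Matches outside the range are returned unchanged.
                Return m.Value
            End Function)
    End Function

    Sub Main()
        Console.WriteLine(ReplaceMatchRange("\w+", "aaa bbb ccc ddd eee fff", "XX", 2, 3))
    End Sub
End Module
```

With these inputs the third, fourth and fifth words are replaced and the rest are left alone. Because the counter lives in the closure, a fresh count starts on every call to the helper.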

Answer 1: (score: 1)

I wouldn't use Regex just to identify the lines in a text. Read the CSV file with

Dim lines As String() = File.ReadAllLines("path of the CSV file")

Then loop over the lines you want to change, like this

For i As Integer = starting_match To last_match
    lines(i) = lines(i).Replace("old","new")
Next

Put the lines back together with

Dim result As String = String.Join(System.Environment.NewLine, lines)
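To show how the indices line up with the question's data, here is a small self-contained sketch of the same steps using an in-memory array instead of a file. The helper name ReplaceInLines and the values "fruit" and "food" are placeholders chosen for illustration. Since the header row sits at index 0, match #2 (the banana row) is lines(2).

```vbnet
Imports System

Module LineRangeExample
    ' Hypothetical helper: replaces oldValue with newValue on the lines
    ' firstLine .. lastLine (zero-based, inclusive) and joins the result.
    Function ReplaceInLines(lines As String(), firstLine As Integer, lastLine As Integer,
                            oldValue As String, newValue As String) As String
        ' Work on a copy so the caller's array is left untouched.
        Dim result As String() = CType(lines.Clone(), String())
        For i As Integer = Math.Max(firstLine, 0) To Math.Min(lastLine, result.Length - 1)
            result(i) = result(i).Replace(oldValue, newValue)
        Next
        Return String.Join(Environment.NewLine, result)
    End Function

    Sub Main()
        Dim lines As String() = {
            "product,color,type,shape,size",
            "apple,green,fruit,round,large",
            "banana,yellow,fruit,long,large",
            "cherry,red,fruit,round,small"}
        ' Only the banana and cherry rows are changed.
        Console.WriteLine(ReplaceInLines(lines, 2, 3, "fruit", "food"))
    End Sub
End Module
```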

UPDATE

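The confusion comes from the fact that the start position in the Replace method is the starting character position, not the starting match index. I therefore suggest this extension method (in VB, extension methods must be declared in a Module)

Imports System.Runtime.CompilerServices
Imports System.Text.RegularExpressions

Public Module RegexExtensions
    ' Replaces up to countMatches matches, starting at the zero-based match index startAtMatch.
    <Extension()>
    Public Function ReplaceMatches(regex As Regex,
                                   input As String, replacement As String,
                                   countMatches As Integer, startAtMatch As Integer
                                  ) As String
        If startAtMatch <= 0 Then
            ' Nothing to skip: start replacing at the first character.
            Return regex.Replace(input, replacement, countMatches, 0)
        End If
        Dim matches As MatchCollection = regex.Matches(input)
        If startAtMatch >= matches.Count Then
            Return input
        End If
        ' Regex.Replace takes a character position, so start right after the last skipped match.
        Dim skippedMatch As Match = matches(startAtMatch - 1)
        Dim startAtCharacterPosition As Integer = skippedMatch.Index + skippedMatch.Length
        Return regex.Replace(input, replacement, countMatches, startAtCharacterPosition)
    End Function
End Module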

Now you can replace like this:

Dim input As String = "aaa bbb ccc ddd eee fff"
Dim startAtMatch As Integer = 2 ' zero-based match index: ccc
Dim countMatches As Integer = 3

Dim regex = New Regex("\w+")
Dim result As String = regex.ReplaceMatches(input, "XX", countMatches, startAtMatch)
Console.WriteLine(result) ' --> "aaa bbb XX XX XX fff"

(Example converted from C# to VB using developerFusion)

Answer 2: (score: -2)

The following code may help you

http://msdn.microsoft.com/en-us/library/ms149475.aspx?cs-save-lang=1&cs-lang=vb#code-snippet-3

Imports System.Collections

Imports System.Text.RegularExpressions

Module Example

    Public Sub Main()
        Dim words As String = "letter alphabetical missing lack release " + _
                              "penchant slack acryllic laundry cease"
        Dim pattern As String = "\w+  # Matches all the characters in a word."
        Dim evaluator As MatchEvaluator = AddressOf WordScrambler
        Console.WriteLine("Original words:")
        Console.WriteLine(words)
        Console.WriteLine("Scrambled words:")
        Console.WriteLine(Regex.Replace(words, pattern, evaluator,
                                        RegexOptions.IgnorePatternWhitespace))
    End Sub

    Public Function WordScrambler(ByVal match As Match) As String
        Dim arraySize As Integer = match.Value.Length - 1
        ' Define two arrays equal to the number of letters in the match. 
        Dim keys(arraySize) As Double
        Dim letters(arraySize) As Char

        ' Instantiate random number generator.
        Dim rnd As New Random()

        For ctr As Integer = 0 To match.Value.Length - 1
            ' Populate the array of keys with random numbers.
            keys(ctr) = rnd.NextDouble()
            ' Assign letter to array of letters.
            letters(ctr) = match.Value.Chars(ctr)
        Next
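        ' Sort all arraySize + 1 letters by their random keys; a length of arraySize would leave the last letter in place.
        Array.Sort(keys, letters, 0, arraySize + 1, Comparer.Default)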
        Return New String(letters)
    End Function

End Module

' The example displays output similar to the following: 
'    Original words: 
'    letter alphabetical missing lack release penchant slack acryllic laundry cease 
'     
'    Scrambled words: 
'    etlert liahepalbcat imsgsni alkc ereelsa epcnnaht lscak cayirllc alnyurd ecsae