Regex C#: replace matched fields with multiple results

Asked: 2015-05-14 06:27:30

Tags: c# regex replace

I have text like this:

iif(instr(|Wellington, New Zealand|,|,|)>0,|Wellington, New Zealand|,|Wellington, New Zealand| & |, | & |New Zealand|) & | to | & iif(instr(|Jeddah, Saudi Arabia|,|,|)>0,|Jeddah, Saudi Arabia|,|Jeddah, Saudi Arabia| & |, | & |Saudi Arabia|) & iif(|Jeddah, Saudi Arabia|=||,||,| via | & |Jeddah, Saudi Arabia|)

I can use the regex below to get a collection of all the elements between | characters. I get 18 matches, with match #1 being |,|

MatchCollection fields = Regex.Matches(str, @"\|.*?\|");

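I then want to replace each match with a placeholder such as ~0~, ~1~, ~2~ and so on, up to ~17~, so that I can run the rest of the code. I don't care if identical text is always replaced by the same placeholder, even though that leaves gaps in the numbering if all 18 indexes are used.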

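My problem is that I can't do a plain replace: in this part of the string, (|Jeddah, Saudi Arabia|,|,|), replacing element #1 (|,|) would replace the first instance it finds, which spans the end of one field and the start of the next, whereas the regex correctly identifies |Jeddah, Saudi Arabia| as one match and |,| as another.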
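To make the failure concrete, here is a minimal sketch comparing a literal String.Replace of |,| with what the regex matches on that fragment. The class and method names (NaiveReplaceDemo, NaiveReplace, RegexFields) are made up for illustration and are not part of the original code.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class NaiveReplaceDemo
{
    // Hypothetical helper: literal, position-unaware replacement of the field text.
    public static string NaiveReplace(string fragment)
    {
        return fragment.Replace("|,|", "~1~");
    }

    // Hypothetical helper: the fields as the regex from the question sees them.
    public static string[] RegexFields(string fragment)
    {
        return Regex.Matches(fragment, @"\|.*?\|")
                    .Cast<Match>()
                    .Select(m => m.Value)
                    .ToArray();
    }

    public static void Main()
    {
        var fragment = "(|Jeddah, Saudi Arabia|,|,|)";

        // The literal replace hits the closing | of the first field,
        // the comma, and the opening | of the second field.
        Console.WriteLine(NaiveReplace(fragment));
        // prints: (|Jeddah, Saudi Arabia~1~,|)

        // The regex consumes whole fields left to right, so it finds two.
        Console.WriteLine(RegexFields(fragment).Length);
        // prints: 2
    }
}
```

The literal replace breaks both fields, which is why the replacement has to be driven by the regex matches rather than by the matched text.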

The result I'm after is:

iif(instr(~0~,~1~)>0,~0~,~0~ & ~2~ & ~3~) & ~4~ & iif(instr(~5~,~1~)>0,~5~,~5~ & ~2~ & ~6~) & iif(~5~=~7~,~7~,~8~ & ~5~)

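The incrementing numbers are indexes into an array I build, so I know how many matches I have. I keep the original values and swap them back in later, which is the easy part.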
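For completeness, the swap-back step might look like the sketch below. It assumes the processed text contains no ~digits~ sequences other than the placeholders and that every index is within the array; PlaceholderRestore and Restore are hypothetical names.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PlaceholderRestore
{
    // Swap each ~n~ back to originals[n].
    // Assumption: every ~n~ in the text is a placeholder with n in range.
    public static string Restore(string text, string[] originals)
    {
        return Regex.Replace(text, @"~(\d+)~",
            m => originals[int.Parse(m.Groups[1].Value)]);
    }

    public static void Main()
    {
        var originals = new[] { "|Wellington, New Zealand|", "|,|" };
        Console.WriteLine(Restore("instr(~0~,~1~)>0", originals));
        // prints: instr(|Wellington, New Zealand|,|,|)>0
    }
}
```

Because the replacement comes from a match evaluator rather than a replacement pattern, characters such as $ in the original values are inserted literally.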

4 answers:

Answer 0: (score: 0)

\|,\|(?=[^(|]*(\|[^(|]*\|)*[^(|]*\))

You can use a lookahead to check that the |,| captured for replacement does not leave any stray | before the ). See the demo.

https://regex101.com/r/mT0iE7/14

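Answer 1: (score: 0)

Well... it's a bit hard to explain, but basically it builds on what you already have...

Given the matches, I add them to a list, use the Distinct LINQ function to get the unique matches, and use the OrderByDescending LINQ function to sort them from longest to shortest. Then I loop over the result to build the RouteMap and replace each entry in the original string.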

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public class Program
{
    public static void Main()
    {
        var input = "iif(instr(|Wellington, New Zealand|,|,|)>0,|Wellington, New Zealand|,|Wellington, New Zealand| & |, | & |New Zealand|) & | to | & iif(instr(|Jeddah, Saudi Arabia|,|,|)>0,|Jeddah, Saudi Arabia|,|Jeddah, Saudi Arabia| & |, | & |Saudi Arabia|) & iif(|Jeddah, Saudi Arabia|=||,||,| via | & |Jeddah, Saudi Arabia|)";

        Console.WriteLine(input);

        var re = new Regex(@"\|.*?\|");

        var matches = re.Matches(input);

        var mz = new List<string>();

        foreach(Match m in matches) 
        {
            mz.Add(m.Groups[0].ToString());
        }

        var routeMap = mz.Distinct().OrderByDescending(n => n.Length).ToList(); //Get distinct, and sort it longest to shortest... need it this way or it won't do the replacement correctly.

        for (var i = 0; i < routeMap.Count; i++) 
        {
            input = input.Replace(routeMap[i], string.Format("~{0}~", i));          
        }

        Console.WriteLine(input);

        Console.WriteLine();
        Console.WriteLine("The route map replacement key:");
        var idx = 0;
        routeMap.ForEach(m => Console.WriteLine("{0}: {1}", idx++, m));

    }
}

Running sample: https://dotnetfiddle.net/PIuuae

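Answer 2: (score: 0)

Here is a suggestion for getting the second output option.

You can use a MatchEvaluator to pass each match to a separate method, and increment a "global" counter in that method: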

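    // Shared counter; starts at -1 so the first placeholder is ~0~
    public static int i = -1;

    // Called once per match; returns the next numbered placeholder
    public static string ReplaceMatch(Match m)
    {
        i++;
        return "~" + i.ToString() + "~";
    }
    // ... then, in your calling method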

    var txt = "iif(instr(|Wellington, New Zealand|,|,|)>0,|Wellington, New Zealand|,|Wellington, New Zealand| & |, | & |New Zealand|) & | to | & iif(instr(|Jeddah, Saudi Arabia|,|,|)>0,|Jeddah, Saudi Arabia|,|Jeddah, Saudi Arabia| & |, | & |Saudi Arabia|) & iif(|Jeddah, Saudi Arabia|=||,||,| via | & |Jeddah, Saudi Arabia|)";
    var fields = Regex.Matches(txt, @"\|.*?\|");
    var txt2 = Regex.Replace(txt, @"\|.*?\|", new MatchEvaluator(ReplaceMatch));

Output:

iif(instr(~0~,~1~)>0,~2~,~3~ & ~4~ & ~5~) & ~6~ & iif(instr(~7~,~8~)>0,~9~,~10~ & ~11~ & ~12~) & iif(~13~=~14~,~15~,~16~ & ~17~)


The matched values corresponding to these placeholders are kept in the fields variable, so you can map them back later.
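One caveat with the static counter is that it keeps its value between calls, so a second run would start numbering at ~18~. A possible variation, sketched here under the assumption that sequential numbering is what you want, keeps the counter local by capturing a list in a lambda. SequentialPlaceholders and its Replace method are hypothetical names, not part of the code above.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class SequentialPlaceholders
{
    // Replaces every |...| field with ~0~, ~1~, ... in order of appearance.
    // 'values' receives the replaced text; its index is the placeholder number.
    public static string Replace(string text, out List<string> values)
    {
        // A local list is used because an out parameter cannot be captured by a lambda.
        var found = new List<string>();
        var result = Regex.Replace(text, @"\|.*?\|", m =>
        {
            found.Add(m.Value);
            return "~" + (found.Count - 1) + "~";
        });
        values = found;
        return result;
    }

    public static void Main()
    {
        List<string> values;
        Console.WriteLine(Replace("instr(|a|,|,|) & |a|", out values));
        // prints: instr(~0~,~1~) & ~2~
        Console.WriteLine(values.Count);
        // prints: 3
    }
}
```

Each call starts again from ~0~, and the returned list doubles as the lookup table for swapping the values back.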

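Edit: for option 1 (which is the only option after the question was edited), the answer is to create a dictionary of the distinct items and use it in the replacement method: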

var txt = "iif(instr(|Wellington, New Zealand|,|,|)>0,|Wellington, New Zealand|,|Wellington, New Zealand| & |, | & |New Zealand|) & | to | & iif(instr(|Jeddah, Saudi Arabia|,|,|)>0,|Jeddah, Saudi Arabia|,|Jeddah, Saudi Arabia| & |, | & |Saudi Arabia|) & iif(|Jeddah, Saudi Arabia|=||,||,| via | & |Jeddah, Saudi Arabia|)";
var fields = Regex.Matches(txt, @"\|.*?\|").Cast<Match>().Select(p=> p.Value).Distinct().Select((s, i) => new { s, i }).ToDictionary(x => x.s, x => x.i);
var txt3 = Regex.Replace(txt, @"\|.*?\|", m => string.Format("~{0}~", fields[m.Value]));

Output:

iif(instr(~0~,~1~)>0,~0~,~0~ & ~2~ & ~3~) & ~4~ & iif(instr(~5~,~1~)>0,~5~,~5~ & ~2~ & ~6~) & iif(~5~=~7~,~7~,~8~ & ~5~)

Answer 3: (score: 0)

I would use some lambda functions:

// This one gets the index from the list of matches
private static string LookupReplace(string text, List<string> newList)
{
    var result = "~" + newList.IndexOf(text).ToString() + "~";
    return result;
}

// This one just increments a global counter
private static string NumberedReplace()
{
    i++;
    return "~" + i.ToString() + "~";
}

public static int i = -1;

public static void Main()
{   
    string text = "iif(instr(|Wellington, New Zealand|,|,|)>0,|Wellington, New Zealand|,|Wellington, New Zealand| & |, | & |New Zealand|) & | to | & iif(instr(|Jeddah, Saudi Arabia|,|,|)>0,|Jeddah, Saudi Arabia|,|Jeddah, Saudi Arabia| & |, | & |Saudi Arabia|) & iif(|Jeddah, Saudi Arabia|=||,||,| via | & |Jeddah, Saudi Arabia|)";
    var re = new Regex(@"\|.*?\|");
    var newList = re.Matches(text)
                    .OfType<Match>()
                    .Select(m => m.Value)
                    .ToList();
    // First replace with index
    string result = re.Replace(text, x => LookupReplace(x.Value, newList));
    Console.WriteLine(result);

    // Second replace with counter
    result = re.Replace(text, x => NumberedReplace());
    Console.WriteLine(result);
}

ideone demo

Output of each replacement:

iif(instr(~0~,~1~)>0,~0~,~0~ & ~4~ & ~5~) & ~6~ & iif(instr(~7~,~1~)>0,~7~,~7~ & ~4~ & ~12~) & iif(~7~=~14~,~14~,~16~ & ~7~)
iif(instr(~0~,~1~)>0,~2~,~3~ & ~4~ & ~5~) & ~6~ & iif(instr(~7~,~8~)>0,~9~,~10~ & ~11~ & ~12~) & iif(~13~=~14~,~15~,~16~ & ~17~)
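The gaps in the first line of output (~0~, ~1~, ~4~, ...) come from calling IndexOf on a list that still contains the duplicate matches, so each value gets the position of its first occurrence. A small sketch of the difference, using a shortened input and hypothetical names (GapDemo, ByFirstIndex, ByDistinctIndex):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class GapDemo
{
    private static readonly Regex Field = new Regex(@"\|.*?\|");

    // Index into the full match list (duplicates kept): numbering has gaps.
    public static string ByFirstIndex(string text)
    {
        var all = Field.Matches(text).Cast<Match>().Select(m => m.Value).ToList();
        return Field.Replace(text, m => "~" + all.IndexOf(m.Value) + "~");
    }

    // Index into a de-duplicated list built in order of first appearance:
    // numbering is dense.
    public static string ByDistinctIndex(string text)
    {
        var unique = new List<string>();
        foreach (Match m in Field.Matches(text))
        {
            if (!unique.Contains(m.Value)) unique.Add(m.Value);
        }
        return Field.Replace(text, m => "~" + unique.IndexOf(m.Value) + "~");
    }

    public static void Main()
    {
        var text = "instr(|a|,|,|)>0,|a|,|b|";
        Console.WriteLine(ByFirstIndex(text));
        // prints: instr(~0~,~1~)>0,~0~,~3~
        Console.WriteLine(ByDistinctIndex(text));
        // prints: instr(~0~,~1~)>0,~0~,~2~
    }
}
```

Either numbering can be swapped back later, as long as the same list is used for the lookup in both directions.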