Regex: replacing multiple groups

Asked: 2011-09-08 15:58:20

Tags: c# .net regex replace

I want to use a regular expression to replace multiple groups, each with its corresponding replacement string.

Replacement table:

  • & -> __amp
  • # -> __hsh
  • 1 -> 5
  • 5 -> 6

For example, for the following input string

a1asda&fj#ahdk5adfls

the corresponding output string is

a5asda__ampfj__hshahdk6adfls

Is there a way to do this?

4 Answers:

Answer 0 (score: 37)

Given a dictionary that defines the replacements:

IDictionary<string,string> map = new Dictionary<string,string>()
        {
           {"&","__amp"},
           {"#","__hsh"},
           {"1","5"},
           {"5","6"},
        };

You can use it to build a regular expression and to produce the replacement for each match:

var str = "a1asda&fj#ahdk5adfls";
var regex = new Regex(String.Join("|",map.Keys));
var newStr = regex.Replace(str, m => map[m.Value]);
// newStr = a5asda__ampfj__hshahdk6adfls

Live example: http://rextester.com/rundotnet?code=ADDN57626

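This uses the Regex.Replace overload that accepts a MatchEvaluator, which lets you supply a lambda expression that computes the replacement for each match.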
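
For readers unfamiliar with that overload, here is a minimal sketch of the same replacement written with an explicit MatchEvaluator delegate instead of a lambda. The names ReplaceDemo, LookUp and ReplaceAll are made up for illustration; only the map and the Regex.Replace(string, MatchEvaluator) call come from the answer.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class ReplaceDemo
{
    // Same replacement table as in the answer.
    private static readonly IDictionary<string, string> Map = new Dictionary<string, string>
    {
        { "&", "__amp" },
        { "#", "__hsh" },
        { "1", "5" },
        { "5", "6" },
    };

    // Called once per match; the returned string replaces the matched text.
    private static string LookUp(Match m)
    {
        return Map[m.Value];
    }

    // Hypothetical helper: builds the alternation "&|#|1|5" and replaces in one pass.
    public static string ReplaceAll(string input)
    {
        var regex = new Regex(String.Join("|", Map.Keys));
        return regex.Replace(input, new MatchEvaluator(LookUp));
    }

    public static void Main()
    {
        Console.WriteLine(ReplaceAll("a1asda&fj#ahdk5adfls"));
    }
}
```

Because the input is scanned only once, the 5 produced from a 1 is not picked up again by the 5 -> 6 rule.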


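It was pointed out in the comments that lookup keys containing regex syntax will not work as expected. This can be overcome by using Regex.Escape and a minor change to the code above (Select requires using System.Linq):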

var str = "a1asda&fj#ahdk5adfls";
var regex = new Regex(String.Join("|",map.Keys.Select(k => Regex.Escape(k))));
var newStr = regex.Replace(str, m => map[m.Value]);
// newStr = a5asda__ampfj__hshahdk6adfls
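
To illustrate what Regex.Escape buys you, here is a small sketch with a hypothetical map whose keys are regex metacharacters ("." and "+"). Neither this map nor the names EscapeDemo and ReplaceEscaped appear in the original answer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class EscapeDemo
{
    // Hypothetical keys that are also regex metacharacters.
    private static readonly IDictionary<string, string> Map = new Dictionary<string, string>
    {
        { ".", "__dot" },
        { "+", "__plus" },
    };

    public static string ReplaceEscaped(string input)
    {
        // Regex.Escape turns "." into "\." and "+" into "\+",
        // so every key matches only its literal text.
        var pattern = String.Join("|", Map.Keys.Select(k => Regex.Escape(k)));
        return Regex.Replace(input, pattern, m => Map[m.Value]);
    }

    public static void Main()
    {
        Console.WriteLine(ReplaceEscaped("a.b+c"));
    }
}
```

Without the escaping step the pattern would be built from the raw keys: an unescaped "." matches any character, so the dictionary lookup would fail for most matches, and a bare "+" has nothing to quantify, so it is not a valid pattern on its own.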

Answer 1 (score: 6)

How about using string.Replace()?

string foo = "a1asda&fj#ahdk5adfls"; 

string bar = foo.Replace("&","__amp")
                .Replace("#","__hsh")
                .Replace("5", "6")
                .Replace("1", "5");
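
One caveat with chained calls is that each Replace works on the output of the previous one, so the order matters: 5 -> 6 has to run before 1 -> 5. The sketch below contrasts the two orders; the names ChainDemo, SafeOrder and UnsafeOrder are illustrative only.

```csharp
using System;

public static class ChainDemo
{
    // 5 -> 6 runs before 1 -> 5, as in the answer.
    public static string SafeOrder(string s)
    {
        return s.Replace("&", "__amp")
                .Replace("#", "__hsh")
                .Replace("5", "6")
                .Replace("1", "5");
    }

    // 1 -> 5 runs first, so the freshly produced 5 is then turned into 6.
    public static string UnsafeOrder(string s)
    {
        return s.Replace("&", "__amp")
                .Replace("#", "__hsh")
                .Replace("1", "5")
                .Replace("5", "6");
    }

    public static void Main()
    {
        Console.WriteLine(SafeOrder("a1asda&fj#ahdk5adfls"));   // a5asda__ampfj__hshahdk6adfls
        Console.WriteLine(UnsafeOrder("a1asda&fj#ahdk5adfls")); // a6asda__ampfj__hshahdk6adfls
    }
}
```

If the table ever contained a cycle (for example 1 -> 5 and 5 -> 1), no ordering of chained calls would work, and a single-pass approach would be needed.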

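Answer 2 (score: 3)

Similar to Jamiec's answer, but this lets you use regular expressions that do not match their own text literally. For example, \. cannot be used with Jamiec's answer, because the matched text (".") is not a key in the dictionary.

This solution relies on creating groups, finding which group matched, and then looking up the replacement value. It is more complicated, but more flexible.

First, make the map a list of KeyValuePairs: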

var map = new List<KeyValuePair<string, string>>();
// Verbatim string: "\." is not a valid escape sequence in a regular C# string literal.
map.Add(new KeyValuePair<string, string>(@"\.", "dot"));

Then create your regex like this:

string pattern = String.Join("|", map.Select(k => "(" + k.Key + ")"));
var regex = new Regex(pattern, RegexOptions.Compiled);

The match evaluator then becomes a bit more complicated:

private static string Evaluator(List<KeyValuePair<string, string>> map, Match match)
{
    // Groups[0] is the entire match and always succeeds, so start at 1;
    // capturing group i corresponds to map[i - 1].
    for (int i = 1; i < match.Groups.Count; i++)
    {
        var group = match.Groups[i];
        if (group.Success)
        {
            return map[i - 1].Value;
        }
    }

    // shouldn't happen
    throw new ArgumentException("Match found that doesn't have any successful groups");
}

Then call the regex replace like this:

var newString = regex.Replace(text, m => Evaluator(map, m));
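
The pieces above can be assembled into one self-contained sketch. It assumes that none of the patterns contains capturing groups of its own, since those would shift the group numbers, and it starts the loop at 1 because Groups[0] is always the whole match. The \d+ entry and the names GroupDemo and ReplaceAll are hypothetical additions for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class GroupDemo
{
    // Ordered list of (pattern, replacement). The second entry is hypothetical.
    private static readonly List<KeyValuePair<string, string>> Map =
        new List<KeyValuePair<string, string>>
        {
            new KeyValuePair<string, string>(@"\.", "dot"),
            new KeyValuePair<string, string>(@"\d+", "num"),
        };

    // One capturing group per entry: group i (1-based) belongs to Map[i - 1].
    private static readonly Regex Pattern =
        new Regex(String.Join("|", Map.Select(k => "(" + k.Key + ")")));

    private static string Evaluator(Match match)
    {
        // Skip Groups[0], which is the entire match.
        for (int i = 1; i < match.Groups.Count; i++)
        {
            if (match.Groups[i].Success)
            {
                return Map[i - 1].Value;
            }
        }
        throw new ArgumentException("Match found that doesn't have any successful groups");
    }

    public static string ReplaceAll(string text)
    {
        return Pattern.Replace(text, Evaluator);
    }

    public static void Main()
    {
        Console.WriteLine(ReplaceAll("v1.20")); // vnumdotnum
    }
}
```

If a pattern needs grouping internally, non-capturing groups of the form (?:...) keep the numbering intact.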

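Answer 3 (score: 2)

Using a dictionary like the one in the other answers, you can use Aggregate to apply each pattern in the dictionary, together with its replacement, in turn. This gives you more flexibility than the other answers, because each pattern is a full regular expression and could be given its own regex options.

For example, the following code will "romanize" Greek text (https://en.wikipedia.org/w/index.php?title=Romanization_of_Greek&section=3#Modern_Greek, Standard/UN):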

var map = new Dictionary<string,string>() {
    {"α[ύυ](?=[άαβγδέεζήηίΐϊιλμνόορύΰϋυώω])", "av"}, {"α[ύυ]", "af"}, {"α[ϊΐ]", "aï"}, {"α[ιί]", "ai"}, {"[άα]", "a"},
    {"β", "v"}, {"γ(?=[γξχ])", "n"}, {"γ", "g"}, {"δ", "d"},
    {"ε[υύ](?=[άαβγδέεζήηίΐϊιλμνόορύΰϋυώω])", "ev"}, {"ε[υύ]", "ef"}, {"ει", "ei"}, {"[εέ]", "e"}, {"ζ", "z"},
    {"η[υύ](?=[άαβγδέεζήηίΐϊιλμνόορύΰϋυώω])", "iv"}, {"η[υύ]", "if"}, {"[ηήιί]", "i"}, {"[ϊΐ]", "ï"},
    {"θ", "th"}, {"κ", "k"}, {"λ", "l"}, {"\\bμπ|μπ\\b", "b"}, {"μπ", "mb"}, {"μ", "m"}, {"ν", "n"},
    {"ο[ιί]", "oi"}, {"ο[υύ]", "ou"}, {"[οόωώ]", "o"}, {"ξ", "x"}, {"π", "p"}, {"ρ", "r"},
    {"[σς]", "s"}, {"τ", "t"}, {"[υύϋΰ]", "y"}, {"φ", "f"}, {"χ", "ch"}, {"ψ", "ps"}
};

var input = "Ο Καλύμνιος σφουγγαράς ψυθίρισε πως θα βουτήξει χωρίς να διστάζει."; 
map.Aggregate(input, (i, m) => Regex.Replace(i, m.Key, m.Value, RegexOptions.IgnoreCase));

This returns (without modifying the input variable):

"o kalymnios sfoungaras psythirise pos tha voutixei choris na distazei."

You could of course use something like:

foreach (var m in map) input = Regex.Replace(input, m.Key, m.Value, RegexOptions.IgnoreCase);

which does modify the input variable.

You can also add this to improve performance:

var remap = new Dictionary<Regex, string>();
foreach (var m in map) remap.Add(new Regex(m.Key, RegexOptions.IgnoreCase | RegexOptions.Compiled), m.Value);

Cache the remap dictionary or make it static, then use:

remap.Aggregate(input, (i, m) => m.Key.Replace(i, m.Value));
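
This approach depends on the rules being applied in the order written (for example "μπ" before "μ"), while the documentation for Dictionary<TKey, TValue> leaves its enumeration order unspecified. As a hedged variation, the sketch below keeps the compiled rules in a static List instead. It uses only a small subset of the rules, and the names RemapDemo and Romanize are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class RemapDemo
{
    private const RegexOptions Options = RegexOptions.IgnoreCase | RegexOptions.Compiled;

    // Hypothetical subset of the rules above. A List enumerates in insertion
    // order, so "μπ" is handled before "μ" and "π".
    private static readonly List<KeyValuePair<Regex, string>> Remap =
        new List<KeyValuePair<Regex, string>>
        {
            new KeyValuePair<Regex, string>(new Regex(@"\bμπ|μπ\b", Options), "b"),
            new KeyValuePair<Regex, string>(new Regex("μπ", Options), "mb"),
            new KeyValuePair<Regex, string>(new Regex("μ", Options), "m"),
            new KeyValuePair<Regex, string>(new Regex("π", Options), "p"),
            new KeyValuePair<Regex, string>(new Regex("[άα]", Options), "a"),
            new KeyValuePair<Regex, string>(new Regex("[οόωώ]", Options), "o"),
            new KeyValuePair<Regex, string>(new Regex("λ", Options), "l"),
            new KeyValuePair<Regex, string>(new Regex("ν", Options), "n"),
            new KeyValuePair<Regex, string>(new Regex("[ηήιί]", Options), "i"),
        };

    // Applies every rule in turn to the result of the previous one.
    public static string Romanize(string input)
    {
        return Remap.Aggregate(input, (text, rule) => rule.Key.Replace(text, rule.Value));
    }

    public static void Main()
    {
        Console.WriteLine(Romanize("μπαλόνι")); // baloni
    }
}
```

The regexes are constructed once when the class is first used, so repeated calls to Romanize avoid recompiling the patterns.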