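Replacing a string using a Regex replacement pattern

Time: 2016-04-14 02:00:47

Tags: c# regex string replace

Basically, I am processing a CSV file and reading it line by line in C#. I have an input string (one line), and I am trying to find a regex pattern and replace it using another regex pattern, but the result is not what I expect.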

var input = "\"efgh ,ijkl123,\",abcd ,  \"efgh ,ijkl123,\",mnop456 \"efgh ,ijkl123,\"";

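In the output, I need to replace the inner commas between double quotes with semicolons, but only where the double-quoted text sits between commas. Between a double quote and the outer comma (the comma outside the quote pair) there may only be whitespace.

So I expect the output to be: "efgh ;ijkl123,",abcd , "efgh ;ijkl123,",mnop456 "efgh ,ijkl123,"

My code: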

var pattern = @".*,\s*""(.*,+.*)+""\s*,.*";
var replacePattern = @".*,\s*""(.*;+.*)+""\s*,.*";
if (Regex.IsMatch(input, pattern))
{
    var output = Regex.Replace(input, pattern, replacePattern);
}

But when I run my code, the output is: .*,\s*"(.*;+.*)+"\s*,.* which is the replacePattern itself.

Edit: more input samples with the expected output:

  1. Input: abcd , "efgh ,ijkl123,",mnop456

    Output: abcd , "efgh ;ijkl123;",mnop456

  2. Input: "efgh ,ijkl123,",abcd , "efgh ,ijkl123,",mnop456 "efgh ,ijkl123,"

    Output: "efgh ;ijkl123;",abcd , "efgh ;ijkl123;",mnop456 "efgh ,ijkl123,"

  3. Input: ,"efgh ,ijkl123,",abcd" , "efgh ijkl123,",mnop456 "efgh ,ijkl123,","efgh ,ijkl123,"mnop456

    Output: ,"efgh ;ijkl123;",abcd" , "efgh ijkl123;",mnop456 "efgh ,ijkl123,","efgh ,ijkl123,"mnop456

  4. Input: ,"efgh" ,ijkl123,",abcd" , "efgh ijkl123,",mnop456 "efgh ,ijkl123,","efgh ,ijkl123,"mnop456

    Output: ,"efgh" ,ijkl123,";abcd" , "efgh ijkl123;",mnop456 "efgh ,ijkl123,","efgh ,ijkl123,"mnop456

  5. Input: efgh ,ijkl123,",abcd , "efgh ,ijkl123,",mnop456 "efgh ,ijkl123,"

    Output: efgh ,ijkl123,",abcd , "efgh ;ijkl123;",mnop456 "efgh ,ijkl123,"

2 answers:

Answer 0 (score: 1)

Well, this is a bit tricky, and I'm sure someone will suggest a better regex than mine. Assuming your input text is:

"efgh ,ijkl123,",abcd ,  "efgh ,ijkl123,",mnop456 "efgh ,ijkl123,"

You can try:

var data = "\"efgh ,ijkl123,\",abcd ,  \"efgh ,ijkl123,\",mnop456 \"efgh ,ijkl123,\"";

var rx = @"(?<=(^|,[ \t]*))\""[^\""\n]+\""(?=[ \t]*(,|$))";

var matches = Regex.Matches (data, rx);

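foreach (Match match in matches) {
    // Escape the matched text so any regex metacharacters in it are taken literally,
    // then replace only the first occurrence of it in data.
    data = new Regex (Regex.Escape (match.Value)).
        Replace(data, match.Value.Replace (',', ';'), 1);
}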

Console.WriteLine (data);

which outputs:

"efgh ;ijkl123;",abcd ,  "efgh ;ijkl123;",mnop456 "efgh ,ijkl123,"

The code above effectively replaces every comma with a semicolon inside the matched double-quoted fields.
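A single-pass alternative, sketched under the assumption that the lookaround pattern above is the rule you want: `Regex.Replace` accepts a `MatchEvaluator` delegate, so each matched quoted field can be rewritten in place without building a second regex from the matched text. The names `QuotedFieldFixer` and `ReplaceInnerCommas` are made up for illustration. This also hints at why the code in the question printed `replacePattern` verbatim: the replacement argument is a substitution string, not a pattern, so the comma-to-semicolon step has to be expressed in code.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original answer.
public static class QuotedFieldFixer
{
    // Same pattern as above: a quoted run preceded by line start or a comma
    // (plus optional blanks) and followed by optional blanks and a comma or line end.
    private static readonly Regex QuotedField = new Regex(
        @"(?<=(^|,[ \t]*))\""[^\""\n]+\""(?=[ \t]*(,|$))");

    public static string ReplaceInnerCommas(string line)
    {
        // The evaluator runs once per match and returns the text that replaces it.
        return QuotedField.Replace(line, m => m.Value.Replace(',', ';'));
    }
}
```

With the first edited sample from the question, `QuotedFieldFixer.ReplaceInnerCommas("abcd , \"efgh ,ijkl123,\",mnop456")` returns `abcd , "efgh ;ijkl123;",mnop456`. Because the replacement happens at the match position, identical quoted text elsewhere in the line that does not satisfy the lookarounds is left alone.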

Answer 1 (score: 0)

Not sure whether it is very efficient, but it works. Suggestions for further improvement are welcome.

// Needs: using System.Linq; using System.Text.RegularExpressions;
string input = "\"efgh ,ijkl123,\",abcd ,  \"efgh ,ijkl123,\",mnop456 \"efgh ,ijkl123,\"";

Regex.Matches(input, "\"([^\"]*)\"(,)") // Extract each quoted string that is followed by ','.
    .Cast<Match>()
    .ToList()
    .ForEach(m => input = input.Replace(m.Value, m.Value.Replace(",", ";")) // Replace each match with its ';' version.
                               .Replace(";\";", ",\","));  // A hack that restores the commas around the closing quote; should be done better.

Output:

"efgh ;ijkl123,",abcd ,  "efgh ;ijkl123,",mnop456 "efgh ,ijkl123,"

Working Demo
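One possible way to avoid the string hack, offered as a sketch rather than a drop-in replacement: keep the outer comma out of the match with a lookahead, and inside each match replace only the commas that are not immediately before the closing quote. The names `TrailingCommaKeeper` and `Fix` are hypothetical. Note that the hack above turns the outer comma into `;` whenever the quoted text does not itself end with a comma (for example `"a,b",` becomes `"a;b";`), which this variant does not do.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original answer.
public static class TrailingCommaKeeper
{
    // A quoted run that is directly followed by a comma; the lookahead
    // leaves that outer comma outside the matched text.
    private static readonly Regex QuotedBeforeComma = new Regex(@"""[^""]*""(?=,)");

    // A comma that is not the last character before the closing quote.
    private static readonly Regex InnerComma = new Regex(@",(?!""$)");

    public static string Fix(string line)
    {
        // Rewrite each matched quoted run in place, one pass over the line.
        return QuotedBeforeComma.Replace(line, m => InnerComma.Replace(m.Value, ";"));
    }
}
```

For the input used in this answer, `TrailingCommaKeeper.Fix(input)` gives the same output as shown above; behaviour on other inputs may differ from the original code, since that code replaces every textual occurrence of each match.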