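I am processing a CSV file and reading it line by line in C#. I have a string input (one line) and am trying to find one regex pattern and replace it using another regex pattern, but the result is not what I expect.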
var input = "\"efgh ,ijkl123,\",abcd , \"efgh ,ijkl123,\",mnop456 \"efgh ,ijkl123,\"";
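In the output, I need to replace the inner commas between double quotes with semicolons, where the double-quoted text itself sits between commas.
Between a double quote and the outer comma (the comma outside the quoted pair) there can only be whitespace.
So I expect the output to be: "efgh ;ijkl123,",abcd , "efgh ;ijkl123,",mnop456 "efgh ,ijkl123,"
My code: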
var pattern = @".*,\s*""(.*,+.*)+""\s*,.*";
var replacePattern = @".*,\s*""(.*;+.*)+""\s*,.*";
if (Regex.IsMatch(input, pattern))
{
var output = Regex.Replace(input, pattern, replacePattern);
}
But when I run my code, the output is .*,\s*"(.*;+.*)+"\s*,.* which is the replacePattern itself.
Edit: more sample inputs and expected outputs:
Input: abcd , "efgh ,ijkl123,",mnop456
Output: abcd , "efgh ;ijkl123;",mnop456
Input: "efgh ,ijkl123,",abcd , "efgh ,ijkl123,",mnop456 "efgh ,ijkl123,"
Output: "efgh ;ijkl123;",abcd , "efgh ;ijkl123;",mnop456 "efgh ,ijkl123,"
Input: ,"efgh ,ijkl123,",abcd" , "efgh ijkl123,",mnop456 "efgh ,ijkl123,","efgh ,ijkl123,"mnop456
Output: ,"efgh ;ijkl123;",abcd" , "efgh ijkl123;",mnop456 "efgh ,ijkl123,","efgh ,ijkl123,"mnop456
Input: ,"efgh" ,ijkl123,",abcd" , "efgh ijkl123,",mnop456 "efgh ,ijkl123,","efgh ,ijkl123,"mnop456
Output: ,"efgh" ,ijkl123,";abcd" , "efgh ijkl123;",mnop456 "efgh ,ijkl123,","efgh ,ijkl123,"mnop456
Input: efgh ,ijkl123,",abcd , "efgh ,ijkl123,",mnop456 "efgh ,ijkl123,"
Output: efgh ,ijkl123,",abcd , "efgh ;ijkl123;",mnop456 "efgh ,ijkl123,"
Answer 0 (score: 1)
Well, this is a bit tricky, and I am sure someone will suggest a better regex than mine. Assuming your input text is:
"efgh ,ijkl123,",abcd , "efgh ,ijkl123,",mnop456 "efgh ,ijkl123,"
You can try:
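using System;
using System.Text.RegularExpressions;

var data = "\"efgh ,ijkl123,\",abcd , \"efgh ,ijkl123,\",mnop456 \"efgh ,ijkl123,\"";
// A quoted field that starts the line or follows a comma, and is followed by a comma or the end of the line.
var rx = @"(?<=(^|,[ \t]*))\""[^\""\n]+\""(?=[ \t]*(,|$))";
var matches = Regex.Matches(data, rx);
foreach (Match match in matches) {
    // Escape the matched text so it is searched for literally, then replace only its first remaining occurrence.
    data = new Regex(Regex.Escape(match.Value)).
        Replace(data, match.Value.Replace(',', ';'), 1);
}
Console.WriteLine(data);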
This prints:
"efgh ;ijkl123;",abcd , "efgh ;ijkl123;",mnop456 "efgh ,ijkl123,"
The code above replaces every comma (,) inside the matched double-quoted fields with a semicolon (;).
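The question's code printed replacePattern because its pattern matched the whole line and Regex.Replace treats its third argument as a substitution string, not as a second regex. A single-pass variant of the approach above can hand each match to a MatchEvaluator callback instead of looping. This is only a sketch: the class name QuotedCommaDemo and the method name ReplaceInnerCommas are made up for illustration, and it has only been checked against the first sample line, not against the samples with unbalanced quotes.

```csharp
using System;
using System.Text.RegularExpressions;

public static class QuotedCommaDemo
{
    // Same idea as the pattern above: a quoted field that starts the line or
    // follows a comma, and is followed by a comma or the end of the line.
    private static readonly Regex QuotedField = new Regex(
        @"(?<=^|,[ \t]*)""[^""\n]+""(?=[ \t]*(?:,|$))");

    // Hypothetical helper: rewrites commas only inside the matched quoted fields.
    public static string ReplaceInnerCommas(string line)
    {
        // The lambda is a MatchEvaluator; it runs once per match, so nothing
        // has to be searched for a second time.
        return QuotedField.Replace(line, m => m.Value.Replace(',', ';'));
    }

    public static void Main()
    {
        var data = "\"efgh ,ijkl123,\",abcd , \"efgh ,ijkl123,\",mnop456 \"efgh ,ijkl123,\"";
        Console.WriteLine(ReplaceInnerCommas(data));
    }
}
```

For the sample line this prints "efgh ;ijkl123;",abcd , "efgh ;ijkl123;",mnop456 "efgh ,ijkl123," and the third quoted field is left alone because it is not preceded by a comma.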
Answer 1 (score: 0)
Not sure whether it is very efficient, but it works. Suggestions for further improvement are welcome.
using System.Linq;
using System.Text.RegularExpressions;

string input = "\"efgh ,ijkl123,\",abcd , \"efgh ,ijkl123,\",mnop456 \"efgh ,ijkl123,\"";
Regex.Matches(input, "\"([^\"]*)\"(,)") // Extract each quoted string that is followed by ','.
    .Cast<Match>()
    .ToList()
    .ForEach(m => input = input.Replace(m.Value, m.Value.Replace(",", ";")) // Swap ',' for ';' in every occurrence of the matched text.
        .Replace(";\";", ",\",")); // A hack that restores the commas around the closing quote; should have been done better.
Output:
"efgh ;ijkl123,",abcd , "efgh ;ijkl123,",mnop456 "efgh ,ijkl123,"
Working demo
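One possible way to avoid the clean-up hack is to leave the comma directly before the closing quote untouched in the first place, using a negative lookahead inside a MatchEvaluator. The sketch below is an assumption-laden variant rather than the answer's own method: the names KeepTrailingCommaDemo and ReplaceKeepingTrailingComma are hypothetical, and the outer pattern uses a lookahead for the following comma instead of capturing it.

```csharp
using System;
using System.Text.RegularExpressions;

public static class KeepTrailingCommaDemo
{
    // Hypothetical helper: replaces commas inside quoted runs that are
    // followed by a comma, but keeps a comma that sits right before a quote.
    public static string ReplaceKeepingTrailingComma(string line)
    {
        return Regex.Replace(
            line,
            "\"[^\"]*\"(?=,)",                              // quoted run followed by ','
            m => Regex.Replace(m.Value, ",(?!\")", ";"));   // skip ',' directly before '"'
    }

    public static void Main()
    {
        string input = "\"efgh ,ijkl123,\",abcd , \"efgh ,ijkl123,\",mnop456 \"efgh ,ijkl123,\"";
        Console.WriteLine(ReplaceKeepingTrailingComma(input));
    }
}
```

For the sample line this prints "efgh ;ijkl123,",abcd , "efgh ;ijkl123,",mnop456 "efgh ,ijkl123," which is the same text as the output shown above.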