Split on commas, except when the comma is between two double quotes

Asked: 2012-12-17 14:17:54

Tags: c# regex

I want to split a string like this on commas:

 field1:"value1", field2:"value2", field3:"value3,value4"

The resulting string[] should look like:

0     field1:"value1"
1     field2:"value2"
2     field3:"value3,value4"

I tried to do this with Regex.Split, but I can't seem to work out the regular expression.

5 answers:

Answer 0 (score: 7)

This is easier to do with Matches than with Split, for example:

string[] asYouWanted = Regex.Matches(input, @"[A-Za-z0-9]+:"".*?""")
    .Cast<Match>()
    .Select(m => m.Value)
    .ToArray();

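That said, if there is any chance that your values (or fields!) contain escaped quotes (or anything similarly tricky), you are probably better off using a proper CSV parser.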


If your values do contain escaped quotes, I think the following regex works; give it a test:

@"field3:""value3\\"",value4""", @"[A-Za-z0-9]+:"".*?(?<=(?<!\\)(\\\\)*)"""

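The added (?<=(?<!\\)(\\\\)*) should ensure that the " it stops matching on is preceded only by an even number of backslashes, since an odd number of backslashes means the quote is escaped.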
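To make the even/odd backslash rule easier to try out, here is a minimal sketch that wraps the pattern from this answer in a helper. The class name EscapedQuoteDemo and the method MatchFields are hypothetical names introduced only for illustration, and the sketch assumes field names consist solely of ASCII letters and digits, as the pattern itself does.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class EscapedQuoteDemo
{
    // Pattern from the answer: a closing quote only ends the match when it is
    // preceded by an even number of backslashes (zero counts as even).
    private const string Pattern = @"[A-Za-z0-9]+:"".*?(?<=(?<!\\)(\\\\)*)""";

    // Hypothetical helper: returns every field:"value" pair found in input.
    public static string[] MatchFields(string input)
    {
        return Regex.Matches(input, Pattern)
            .Cast<Match>()
            .Select(m => m.Value)
            .ToArray();
    }
}
```

Calling EscapedQuoteDemo.MatchFields on the text field1:"a\"b", field2:"c" should yield two items, because the quote after the single backslash is treated as escaped and does not close the first value.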

Answer 1 (score: 1)

Untested, but this should be OK:

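// Split is an instance method, so call it on the input string.
// The delimiter is the two-character sequence ", (closing quote, then comma).
string[] parts = input.Split(new string[] { "\"," }, StringSplitOptions.None);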

Remember to add the " back onto the end of each piece if needed.

Answer 2 (score: 1)

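// The last piece still has its closing quote, so only re-append where it was cut off.
string[] arr = str.Split(new string[] { "\"," }, StringSplitOptions.None)
    .Select(s => s.EndsWith("\"") ? s : s + "\"")
    .ToArray();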

Split on ", as webnoob mentioned, then use Select to put the trailing " back on each piece, then convert the result to an array.
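Since the question originally tried Regex.Split, here is a minimal sketch of that route for comparison. It is not taken from any answer in this thread: it swaps in a lookahead that accepts a comma as a separator only when an even number of double quotes follows it. It assumes quotes are balanced and never escaped, and the class name QuoteAwareSplit is hypothetical.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class QuoteAwareSplit
{
    // A comma is a separator only if an even number of double quotes follows
    // it up to the end of the string, i.e. the comma is outside any quoted value.
    // Assumes balanced, unescaped quotes.
    private const string Separator = @",(?=(?:[^""]*""[^""]*"")*[^""]*$)";

    public static string[] Split(string input)
    {
        return Regex.Split(input, Separator)
            .Select(part => part.Trim())   // drop the space left after each comma
            .ToArray();
    }
}
```

Because the separator is removed whole and the quotes are never part of it, no quote has to be re-appended afterwards. The lookahead rescans the rest of the string at every comma, so this is better suited to short lines than to very long input.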

Answer 3 (score: 0)

Try this:

// (\w.+?):"(\w.+?)"        
//         
// Match the regular expression below and capture its match into backreference number 1 «(\w.+?)»        
//    Match a single character that is a “word character” (letters, digits, and underscores) «\w»        
//    Match any single character that is not a line break character «.+?»        
//       Between one and unlimited times, as few times as possible, expanding as needed (lazy) «+?»        
// Match the characters “:"” literally «:"»        
// Match the regular expression below and capture its match into backreference number 2 «(\w.+?)»        
//    Match a single character that is a “word character” (letters, digits, and underscores) «\w»        
//    Match any single character that is not a line break character «.+?»        
//       Between one and unlimited times, as few times as possible, expanding as needed (lazy) «+?»        
// Match the character “"” literally «"»        


try {
    Regex regObj = new Regex(@"(\w.+?):""(\w.+?)""");
    // Collect matches in a list (needs System.Collections.Generic),
    // since the number of matches is not known up front.
    List<string> results = new List<string>();
    Match matchResults = regObj.Match(sourceString);
    while (matchResults.Success) {
        results.Add(matchResults.Value);
        matchResults = matchResults.NextMatch();
    }
    string[] arr = results.ToArray();
} catch (ArgumentException ex) {
    // Syntax error in the regular expression
}

Answer 4 (score: 0)

The simplest built-in way is here. I tried it and it works fine. It splits "Hai,\"Hello,World\"" into {"Hai","Hello,World"}.
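The link target is not preserved in this copy, so the following is only an assumption about what kind of built-in approach was meant. One parser that ships with .NET and produces the result described is TextFieldParser from Microsoft.VisualBasic.FileIO; the sketch below applies it to a single line. The class name CsvLineDemo and the method ParseLine are hypothetical.

```csharp
using System;
using System.IO;
using Microsoft.VisualBasic.FileIO;

public static class CsvLineDemo
{
    // Parses one comma-delimited line, treating double-quoted fields as a unit.
    // Assumption: this may not be the API the answer linked to.
    public static string[] ParseLine(string line)
    {
        using (var parser = new TextFieldParser(new StringReader(line)))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.SetDelimiters(",");
            parser.HasFieldsEnclosedInQuotes = true;
            return parser.ReadFields();
        }
    }
}
```

Note that this style of parser expects the quotes to enclose a whole field, as in the Hai example. The input from the question has quotes starting in the middle of a field (field1:"value1"), so it may not be handled the same way.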