I have some strings in the following format:
--> ABCDEF_(0) "Abcde fgh"
--> GHIJ4 1
The first one should return 3 matches:
-->
ABCDEF_(0)
"Abcde fgh"
The second one should also return 3 matches:
-->
GHIJ4
1
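So what I want to match is:
(1) the starting arrow (-->)
(2) elements not surrounded by quotes, without their surrounding whitespace
(3) elements surrounded by quotes, including the whitespace inside them
A string may contain more groups of type (2) and (3), so a single string can yield more than 3 matches.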
This is what I have so far:
var regex = new Regex(
@"-->" + // match the starting arrow
@"|[^""\s]*\S+[^""\s]*" + // match elements not surrounded by quotes, trimmed of surrounding whitespace
@"|""[^""]+"""); // match elements surrounded by quotes
But this doesn't work, because it splits the quoted expression. For the first string it returns:
-->
ABCDEF_(0)
"Abcde
fgh"
Is a regex the right tool here? If there is a simpler way than a regex, I'll accept that too.
Answer 0 (score: 1)
It is easier with captures (I used named captures here):
var regex = new Regex(@"-->" // match the arrow
+ @"\s+(?<first>[^\s]+)" // capture the first part always unquoted
+ @"(\s+(?<second>(""[^""]+"")|[^\s]+))+"); // capture the second part, possibly quoted
var match = regex.Match("--> ABCDEF_(0) \"Abcde fgh\"");
Console.WriteLine(match.Groups["first"].Value);
Console.WriteLine(match.Groups["second"].Value);
match = regex.Match("--> GHIJ4 1");
Console.WriteLine(match.Groups["first"].Value);
Console.WriteLine(match.Groups["second"].Value);
match = regex.Match("--> GHIJ4 1 \"Test Something\" \"Another String With Spaces\" \"And yet another one\"");
Console.WriteLine(match.Groups["first"].Value);
Console.WriteLine("Total matches:" + match.Groups["second"].Captures.Count);
Console.WriteLine(match.Groups["second"].Captures[0].Value);
Console.WriteLine(match.Groups["second"].Captures[1].Value);
Console.WriteLine(match.Groups["second"].Captures[2].Value);
Console.WriteLine(match.Groups["second"].Captures[3].Value);
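The hard-coded indices above only work when the number of captures is known in advance. As a minimal sketch, assuming the same pattern, the captures of the "second" group can be collected in a loop instead. The class name ArrowLineTokenizer and the method Tokenize are hypothetical names chosen for illustration, not part of the original answer:

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only.
public static class ArrowLineTokenizer
{
    // Same pattern as in the answer: the arrow, one unquoted token,
    // then one or more tokens that are either quoted or unquoted.
    private static readonly Regex Pattern = new Regex(
        @"-->"
        + @"\s+(?<first>[^\s]+)"
        + @"(\s+(?<second>(""[^""]+"")|[^\s]+))+");

    // Returns the arrow followed by every token on the line,
    // or an empty list if the line does not match the pattern.
    public static List<string> Tokenize(string line)
    {
        var tokens = new List<string>();
        Match match = Pattern.Match(line);
        if (!match.Success)
        {
            return tokens;
        }

        tokens.Add("-->");
        tokens.Add(match.Groups["first"].Value);

        // A group inside a repeated construct records one Capture per
        // repetition; Group.Value alone exposes only the last of them.
        foreach (Capture capture in match.Groups["second"].Captures)
        {
            tokens.Add(capture.Value);
        }
        return tokens;
    }
}
```

Calling ArrowLineTokenizer.Tokenize on the third test string should give six tokens: the arrow, GHIJ4, 1, and the three quoted strings with their quotes kept.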
Answer 1 (score: 0)
Thanks to an answer that was quickly deleted for some reason, I managed to solve the problem.
The idea:
Resulting regex:
Regex sWordMatch = new Regex(
@"""[^""]*""" + // groups of characters enclosed in quotes
@"|[^""\s]*\S+[^""\s]*", // groups of characters without whitespace not enclosed in quotes