Question

I want to parse the following sample string

foo :6

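into two groups: text and number. The number group should be populated only when the character ":" comes directly before the number.

Like this: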

foo 6 -> Text = "foo 6"
foo :6 -> Text = "foo", Number = "6"

The best I have come up with so far is

(?<Text>.+)(?=:(?<Number>\d+)h?)?

but this does not work, because the first group greedily expands to the whole string.

Any suggestions?

Answer 1

If you really want to use a regex, you can write a very simple one, without lookarounds:

(?<Text>[^:]+):?(?<Number>\d*)

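In my opinion a regex should be as simple as possible; if you do not want whitespace around the Text group, I suggest using match.Groups["Text"].Value.Trim().

Note that this pattern will not work if you are parsing a multi-line string because, as @OscarHermosilla mentions below, [^:]+ will also match newlines. The fix is simple: change it to [^:\n]+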
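A minimal C# sketch of this approach, for illustration only. The class name Answer1Sketch and the Parse helper are hypothetical, and it assumes the [^:\n]+ variant of the pattern together with Trim():

```csharp
using System;
using System.Text.RegularExpressions;

public static class Answer1Sketch
{
    // Hypothetical helper: Number is "" when the input has no ":<digits>" part.
    public static (string Text, string Number) Parse(string input)
    {
        Match m = Regex.Match(input, @"(?<Text>[^:\n]+):?(?<Number>\d*)");
        // Trim() removes the space that [^:\n]+ picks up before the colon.
        return (m.Groups["Text"].Value.Trim(), m.Groups["Number"].Value);
    }

    public static void Main()
    {
        foreach (string s in new[] { "foo 6", "foo :6" })
        {
            var r = Parse(s);
            Console.WriteLine($"Text = \"{r.Text}\", Number = \"{r.Number}\"");
        }
    }
}
```

The pattern is not anchored, so it is only meant for inputs shaped like the two examples in the question.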

Answer 2

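You do not need a separate function to strip the trailing whitespace.

The following regex captures all characters into the named group Text, except for :\d+ (that is, a : followed by one or more digits). If it finds a colon followed by digits, it captures those digits into the named group Number.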

^(?<Text>(?:(?!:\d+).)+(?=$|\s+:(?<Number>\d+)$))

DEMO

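using System;
using System.Text.RegularExpressions;

String input = "foo 6";
String input1 = "foo :6";
// Text stops before ":<digits>"; Number is captured inside the lookahead.
Regex rgx = new Regex(@"^(?<Text>(?:(?!:\d+).)+(?=$|\s+:(?<Number>\d+)$))");

// No colon: only Text is filled.
foreach (Match m in rgx.Matches(input))
{
    Console.WriteLine(m.Groups["Text"].Value);
}
// Colon followed by digits: both groups are filled.
foreach (Match m in rgx.Matches(input1))
{
    Console.WriteLine(m.Groups["Text"].Value);
    Console.WriteLine(m.Groups["Number"].Value);
}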

Output:

foo 6
foo
6


Answer 3

You can use alternation and repeat the group name Text in both branches. Like this:

(?<Text>.+)\s+:(?<Number>\d+)|(?<Text>.+)


Based on the idea behind this post: Regex Pattern to Match, Excluding when... / Except between
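A short C# sketch of the repeated-group-name idea. .NET allows the same group name to appear in more than one branch of an alternation. The class name Answer3Sketch and the Parse helper are hypothetical, and the sketch uses \d+ so that multi-digit numbers are captured whole:

```csharp
using System;
using System.Text.RegularExpressions;

public static class Answer3Sketch
{
    // The first branch needs "<whitespace>:<digits>"; the second branch takes everything as Text.
    private static readonly Regex Pattern =
        new Regex(@"(?<Text>.+)\s+:(?<Number>\d+)|(?<Text>.+)");

    // Hypothetical helper: Number is "" when the second branch matched.
    public static (string Text, string Number) Parse(string input)
    {
        Match m = Pattern.Match(input);
        return (m.Groups["Text"].Value, m.Groups["Number"].Value);
    }

    public static void Main()
    {
        foreach (string s in new[] { "foo 6", "foo :6" })
        {
            var r = Parse(s);
            Console.WriteLine($"Text = \"{r.Text}\", Number = \"{r.Number}\"");
        }
    }
}
```

Because \s+ sits outside the Text group, no trimming is needed in the colon case.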

Answer 4

You can simply use Split instead of a regex:

"foo :6".Split(':');

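A hedged sketch of how the Split result might be turned into the two values. The class name SplitSketch and the Parse helper are hypothetical. Note that, unlike the regex answers, this does not check that the part after the colon consists of digits:

```csharp
using System;

public static class SplitSketch
{
    // Hypothetical helper: Number is "" when the input contains no ':'.
    public static (string Text, string Number) Parse(string input)
    {
        // Split at the first ':' only, so any later colons stay in the second part.
        string[] parts = input.Split(new[] { ':' }, 2);
        string text = parts[0].Trim();
        string number = parts.Length > 1 ? parts[1].Trim() : "";
        return (text, number);
    }

    public static void Main()
    {
        foreach (string s in new[] { "foo 6", "foo :6" })
        {
            var r = Parse(s);
            Console.WriteLine($"Text = \"{r.Text}\", Number = \"{r.Number}\"");
        }
    }
}
```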
Answer 5

You could try:

(\D+)(?:\:(\d+))

Or do a Regex.Split with this pattern:

(\s*\:\s*)
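One detail to keep in mind with Regex.Split in .NET: text matched by capturing parentheses is included in the returned array, so with the outer parentheses the separator itself shows up as an element. The sketch below compares both forms; the class and method names are hypothetical:

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexSplitSketch
{
    // Pattern as given above: the capture group puts the separator into the result.
    public static string[] WithCapture(string input) =>
        Regex.Split(input, @"(\s*\:\s*)");

    // Same pattern without the capture group: only the surrounding parts are returned.
    public static string[] WithoutCapture(string input) =>
        Regex.Split(input, @"\s*:\s*");

    public static void Main()
    {
        Console.WriteLine(string.Join("|", WithCapture("foo :6")));    // foo| :|6
        Console.WriteLine(string.Join("|", WithoutCapture("foo :6"))); // foo|6
        Console.WriteLine(string.Join("|", WithoutCapture("foo 6")));  // foo 6
    }
}
```

As with plain Split, nothing here checks that the part after the colon is numeric.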

Regex for an optional group
