Matching a repeating pattern

Asked: 2012-11-02 21:00:35

Tags: c# regex

I am currently trying to match and capture the text in the following input:

field: one two three field: "moo cow" field: +this

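I can match the field: part with [a-z]*\: but I can't seem to match the rest. My attempts so far only end up capturing everything, which is not what I want.

2 answers:

Answer 0: (score: 2)

If you know that it is always the literal field:, then there is absolutely no need for a regular expression: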

var delimiters = new String[] {"field:"};
string[] values = input.Split(delimiters, StringSplitOptions.RemoveEmptyEntries);
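As a rough, self-contained sketch of the Split approach above: the class and method names (SplitSketch, SplitFields) are made up for illustration, and the Trim() call is an addition of mine, assuming the spaces left around each value are unwanted.

```csharp
using System;
using System.Linq;

public static class SplitSketch
{
    // Hypothetical helper: splits on the literal "field:" delimiter.
    // The Trim() is an assumption; without it each value keeps the
    // spaces that surrounded it in the input.
    public static string[] SplitFields(string input)
    {
        var delimiters = new[] { "field:" };
        return input
            .Split(delimiters, StringSplitOptions.RemoveEmptyEntries)
            .Select(value => value.Trim())
            .ToArray();
    }
}
```

For the input from the question, SplitFields should return three values: one two three, "moo cow" (quotes included), and +this.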

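However, from your regex I assume that the name field can vary, as long as it comes right before a colon. You could capture a word, then the :, and then everything up to the next such word followed by a colon (using a negative lookahead).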

foreach(Match match in Regex.Matches(input, @"([a-z]+):((?:(?![a-z]+:).)*)"))
{
    string fieldName = match.Groups[1].Value;
    string value = match.Groups[2].Value;
}

An explanation of the regex:

(     # opens a capturing group; the content can later be accessed with Groups[1]
[a-z] # lower-case letter
+     # one or more of them
)     # end of capturing group
:     # a literal colon
(     # opens a capturing group; the content can later be accessed with Groups[2]
(?:   # opens a non-capturing group; just a necessary subpattern which we do not
      # need later any more
(?!   # negative lookahead; this will NOT match if the pattern inside matches
[a-z]+:
      # a word followed by a colon; just the same as we used at the beginning of
      # the regex
)     # end of negative lookahead (note that this does not consume any characters;
      # it LOOKS ahead)
.     # any character (except for line breaks)
)     # end of non-capturing group
*     # 0 or more of those
)     # end of capturing group

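First we match anylowercaseword:. Then we match one more character at a time, checking for each one that it is not the start of anotherlowercaseword:. With the capturing groups we can later retrieve the field's name and the field's value separately.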
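To see the pattern at work end to end, here is a small sketch that wraps the loop above in a method. FieldParser and Parse are hypothetical names, and trimming the captured value is my own assumption about what the asker wants.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class FieldParser
{
    // Same pattern as in the answer: a lower-case name, a colon, then every
    // character that is not the start of another "name:".
    private static readonly Regex FieldPattern =
        new Regex(@"([a-z]+):((?:(?![a-z]+:).)*)");

    // Hypothetical helper: returns (name, value) pairs in input order.
    public static List<KeyValuePair<string, string>> Parse(string input)
    {
        var result = new List<KeyValuePair<string, string>>();
        foreach (Match match in FieldPattern.Matches(input))
        {
            string fieldName = match.Groups[1].Value;
            // Trim() is an assumption; the raw capture keeps surrounding spaces.
            string value = match.Groups[2].Value.Trim();
            result.Add(new KeyValuePair<string, string>(fieldName, value));
        }
        return result;
    }
}
```

For the input from the question this should give three pairs, each named field, with the values one two three, "moo cow" and +this. One thing to keep in mind: the lookahead stops a value at any lower-case word followed by a colon, so a value containing something like http: would be cut short and the rest treated as a new field.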

Answer 1: (score: 0)

Don't forget that you can actually match literal strings in a regular expression. If your pattern looks like this:

field\:

you will match "field:" literally, and nothing else.
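A literal pattern on its own only finds the delimiters, not the values between them. One possible way to get at the values, sketched here with the hypothetical names LiteralMatchSketch and SplitOnLiteral, is to pass that pattern to Regex.Split. The filtering of empty entries and the Trim() are my additions, not part of the answer. (The backslash before the colon is harmless but not required, since a colon has no special meaning at that position.)

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class LiteralMatchSketch
{
    // Hypothetical helper: splits the input on the literal text "field:".
    // Regex.Split yields an empty first entry when the input starts with
    // the delimiter, so empty pieces are dropped here.
    public static string[] SplitOnLiteral(string input)
    {
        return Regex.Split(input, @"field\:")
            .Select(piece => piece.Trim())
            .Where(piece => piece.Length > 0)
            .ToArray();
    }
}
```

This behaves much like the String.Split version in the first answer, so it only helps when the field name never changes.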