Regex matching and grouping

Time: 2015-09-07 15:52:01

Tags: regex

I need to parse the following string into groups:

September 6, 2015 8:00 pm PDT<br />        Foobar1: This may be a string or just some text.<br />        Foo Bar 2: Some other text<br />        Foo_Bar3: This could be anyting<br />        foobar4: Who knows<br />        Foobar5: more text (text)<br />        Foo bar6: There could be more<br />

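The only constants are the line breaks and the colons. These are essentially key/value pairs: the foobar text before the colon is the key, and the text to the right of the colon is the value. There is no way to know in advance how many key/value pairs the string will contain.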

This is what I have so far:

(?<Key>>.*:).*:.*(?<Value>:.*<)

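But it matches everything from the first > to the last <, with no grouping in between. I can strip out the <br /> tags in code. Thanks in advance for taking a look.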

1 Answer:

Answer 0: (score: 0)

If there is always a line break after the timestamp, then this should work:

(?<=<br\s/>)([^:]*?):(.*?)<br\s/>

To get the groups for each match:

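        // Requires: using System.Text.RegularExpressions;
        // The lookbehind starts each match right after a <br /> tag; the named groups capture the key and the value.
        string pattern = @"(?<=<br\s/>)(?'key'[^:]*?):(?'value'.*?)<br\s/>";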
        string input = "September 6, 2015 8:00 pm PDT<br />        Foobar1: This may be a string or just some text.<br />        Foo Bar 2: Some other text<br />        Foo_Bar3: This could be anyting<br />        foobar4: Who knows<br />        Foobar5: more text (text)<br />        Foo bar6: There could be more<br />"; 
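        // Each match holds one key/value pair (Console.WriteLine requires using System;).
        MatchCollection matches = Regex.Matches(input, pattern);
        foreach (Match match in matches)
        {
            // Trim the padding that surrounds each captured key and value.
            string key = match.Groups["key"].Value.Trim();
            string value = match.Groups["value"].Value.Trim();
            Console.WriteLine("{0} = {1}", key, value);
        }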
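
For reference, the same pattern can be wrapped into a small self-contained program. This is only a sketch under a few assumptions: the class name KeyValueParser and the method Parse are hypothetical names chosen for illustration, the input is shortened to two pairs, and the result is kept as an ordered list rather than a dictionary because nothing in the question says the keys are unique.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class KeyValueParser
{
    // Hypothetical helper (not part of the original answer):
    // returns the key/value pairs in the order they appear in the input.
    public static List<KeyValuePair<string, string>> Parse(string input)
    {
        // Same pattern as in the answer: a match must start right after a <br /> tag,
        // the key is everything up to the first colon, and the value runs lazily
        // to the next <br /> tag.
        const string pattern = @"(?<=<br\s/>)(?'key'[^:]*?):(?'value'.*?)<br\s/>";

        var pairs = new List<KeyValuePair<string, string>>();
        foreach (Match match in Regex.Matches(input, pattern))
        {
            // Trim the whitespace padding around each captured group.
            pairs.Add(new KeyValuePair<string, string>(
                match.Groups["key"].Value.Trim(),
                match.Groups["value"].Value.Trim()));
        }
        return pairs;
    }

    static void Main()
    {
        // Shortened sample input: the timestamp followed by two key/value pairs.
        string input = "September 6, 2015 8:00 pm PDT<br />        Foobar1: This may be a string or just some text.<br />        Foo Bar 2: Some other text<br />";

        foreach (var pair in Parse(input))
        {
            Console.WriteLine("[{0}] = [{1}]", pair.Key, pair.Value);
        }
    }
}
```

Two things to keep in mind with this approach: a key cannot contain a colon (a value can), and because . does not match a newline by default, a value that spans several physical lines would probably need RegexOptions.Singleline.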