RegEx to parse nested tags?

Asked: 2013-03-11 14:58:21

Tags: c# .net regex

I have text like this:

This is {name1:value1}{name2:{name3:even dipper {name4:valu4} dipper} some inner text} text

I want to parse it into data like this:

Name: name1
Value: value1

Name: name2
Value: {name3:even dipper {name4:valu4} dipper} some inner text

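I will then process each value recursively to parse the nested fields. Can you recommend a RegEx expression?

2 answers:

Answer 0 (score: 3):

In C# you can use balancing groups to count and balance the brackets: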

{ (?'name' \w+ ) :       # start of tag
(?'value'                # named capture
  (?>                    # don't backtrack
    (?:
      [^{}]+             # not brackets
    | (?'open' { )       # count opening bracket
    | (?'close-open' } ) # subtract closing bracket (matches only if open count > 0)
    )*
  )
  (?(open)(?!))          # make sure open is not > 0
)
}                        # end of tag
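
To see the counting mechanism in isolation, here is a minimal sketch that only checks whether the braces in a string are balanced. The class name BalanceDemo and the helper IsBalanced are hypothetical names chosen for illustration; the pattern reuses the same 'open' stack idea as the expression above.

```csharp
using System;
using System.Text.RegularExpressions;

static class BalanceDemo
{
    // Hypothetical helper pattern: matches only if every '{' has a matching '}'.
    static readonly Regex Balanced = new Regex(@"
        ^
        (?>                     # don't backtrack
          (?:
            [^{}]+              # plain text
          | (?'open' { )        # push onto the 'open' stack
          | (?'-open' } )       # pop; cannot match when the stack is empty
          )*
        )
        (?(open)(?!))           # fail if anything is left on the stack
        $", RegexOptions.IgnorePatternWhitespace);

    public static bool IsBalanced(string s)
    {
        return Balanced.IsMatch(s);
    }

    static void Main()
    {
        Console.WriteLine(IsBalanced("a {b {c} d} e")); // True
        Console.WriteLine(IsBalanced("a {b {c} d e"));  // False: one '{' is never closed
        Console.WriteLine(IsBalanced("a } b { c"));     // False: '}' arrives before any '{'
    }
}
```

The tag expression above works the same way, except that it stops at the first closing bracket that finds the stack empty and treats that bracket as the end of the tag.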

Example

using System;
using System.Text.RegularExpressions;

string re = @"(?x)       # enable eXtended mode (comments/spaces ignored)
{ (?'name' \w+ ) :       # start of tag
(?'value'                # named capture
  (?>                    # don't backtrack
    (?:
      [^{}]+             # not brackets
    | (?'open' { )       # count opening bracket
    | (?'close-open' } ) # subtract closing bracket (matches only if open count > 0)
    )*
  )
  (?(open)(?!))          # make sure open is not > 0
)
}                        # end of tag
";

string str = @"This is {name1:value1}{name2:{name3:even dipper {name4:valu4} dipper} some inner text} text";

foreach (Match m in Regex.Matches(str, re))
{
    Console.WriteLine("name: {0}, value: {1}", m.Groups["name"], m.Groups["value"]);
}

Output:

name: name1, value: value1
name: name2, value: {name3:even dipper {name4:valu4} dipper} some inner text
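
The question mentions processing each value recursively. One way that could look is sketched below, assuming the same pattern is simply re-applied to every captured value. The class NestedTags and the method Parse are hypothetical names, and the tuple syntax assumes C# 7 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class NestedTags
{
    // Same expression as in the answer, compiled once.
    static readonly Regex Tag = new Regex(@"
        { (?'name' \w+ ) :
        (?'value'
          (?>
            (?:
              [^{}]+
            | (?'open' { )
            | (?'close-open' } )
            )*
          )
          (?(open)(?!))
        )
        }", RegexOptions.IgnorePatternWhitespace);

    // Hypothetical helper: yields every tag with its nesting depth, outer tags first.
    public static IEnumerable<(int Depth, string Name, string Value)> Parse(string text, int depth = 0)
    {
        foreach (Match m in Tag.Matches(text))
        {
            string value = m.Groups["value"].Value;
            yield return (depth, m.Groups["name"].Value, value);

            // Re-run the pattern on the value to find tags nested inside it.
            foreach (var inner in Parse(value, depth + 1))
                yield return inner;
        }
    }

    static void Main()
    {
        string str = @"This is {name1:value1}{name2:{name3:even dipper {name4:valu4} dipper} some inner text} text";

        foreach (var (depth, name, value) in Parse(str))
            Console.WriteLine("{0}name: {1}, value: {2}", new string(' ', depth * 2), name, value);
    }
}
```

For the sample string this should report name1 and name2 at depth 0, then name3 at depth 1 and name4 at depth 2, each indented by two spaces per level.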

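Answer 1 (score: 2):

If you are using Perl / PHP / PCRE, it is not complicated at all. You can use an expression like this (in extended mode, so that whitespace and comments are ignored):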

{(\w+):         # start of tag
   ((?:
      [^{}]+    # not a tag
   |  (?R)      # a tag (recurse to match the whole regex)
   )*)
}               # end of tag
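
The .NET regex engine does not support (?R), so this expression cannot be used as-is from C#. If a regex is not required, a plain depth counter gives a similar result. The sketch below is a swapped-in technique, not a translation of the expression above: BraceScanner and TopLevelTags are hypothetical names, the name is taken as everything before the first colon instead of being checked against \w+, and unbalanced input is simply skipped.

```csharp
using System;
using System.Collections.Generic;

static class BraceScanner
{
    // Hypothetical helper: returns the top-level {name:value} pairs by tracking brace depth.
    public static List<KeyValuePair<string, string>> TopLevelTags(string text)
    {
        var result = new List<KeyValuePair<string, string>>();
        int depth = 0;   // number of currently open braces
        int start = -1;  // index just after the '{' that opened the current top-level tag

        for (int i = 0; i < text.Length; i++)
        {
            if (text[i] == '{')
            {
                if (depth == 0) start = i + 1;
                depth++;
            }
            else if (text[i] == '}' && depth > 0)
            {
                depth--;
                if (depth == 0)
                {
                    // A top-level tag just closed: split its body at the first colon.
                    string body = text.Substring(start, i - start);
                    int colon = body.IndexOf(':');
                    if (colon > 0)
                        result.Add(new KeyValuePair<string, string>(
                            body.Substring(0, colon), body.Substring(colon + 1)));
                }
            }
        }
        return result;
    }

    static void Main()
    {
        string str = "This is {name1:value1}{name2:{name3:even dipper {name4:valu4} dipper} some inner text} text";
        foreach (var tag in TopLevelTags(str))
            Console.WriteLine("name: {0}, value: {1}", tag.Key, tag.Value);
    }
}
```

Calling TopLevelTags again on each returned value would handle the nested levels, in the same way the recursive regex approach does.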